Using the Caselaw Access Project API
The Caselaw Access Project is one of the two best resources for free programmatic access to American caselaw data (along with CourtListener). It has a great, user-friendly website and thoughtful documentation aimed at several different audiences. It also has a more dramatic story than most legal tech projects: archivists at Harvard’s law library cut the spines off every book in an exhaustive law library collection and digitally scanned them all, but the resulting archive was subject to access restrictions for seven years from the end of the scanning project. (Beware of clicking that link, if like me you jealously guard your monthly allocation of free New York Times articles.)
In the years since the API launched, it’s become significantly more useful with the addition of citation graph data. But it’s also important to recognize the limits on the API’s scope: it only includes cases published in print, and only cases published in bound volumes through 2018, when the scanning project took place. The API also limits public users to 500 API calls per day for most jurisdictions.
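With a cap of 500 calls per day, it can help to avoid downloading the same decision twice and to keep count of what has been spent. Here is a minimal sketch of that idea. The names DailyBudget and cached_fetch are hypothetical helpers of mine, not part of the API or of any library, and resetting the counter on the local calendar day is an assumption, since I don't know exactly when the API's own counter resets.

```python
from datetime import date
from typing import Any, Callable, Dict


class DailyBudget:
    """Count API calls made today and refuse to exceed a daily limit."""

    def __init__(self, limit: int = 500, today: Callable[[], date] = date.today):
        self.limit = limit
        self._today = today  # injectable clock, handy for testing
        self._day = today()
        self._used = 0

    def _roll_over(self) -> None:
        # Assumption: the allowance resets when the local calendar day changes.
        current = self._today()
        if current != self._day:
            self._day = current
            self._used = 0

    @property
    def remaining(self) -> int:
        self._roll_over()
        return self.limit - self._used

    def spend(self) -> None:
        """Record one call, or raise if today's allowance is used up."""
        if self.remaining <= 0:
            raise RuntimeError(f"daily limit of {self.limit} calls reached")
        self._used += 1


def cached_fetch(cite: str, fetch: Callable[[str], Any],
                 cache: Dict[str, Any], budget: DailyBudget) -> Any:
    """Return a cached result if present; otherwise spend one call and fetch."""
    if cite not in cache:
        budget.spend()
        cache[cite] = fetch(cite)
    return cache[cite]
```

The fetch argument can be any function that takes a citation string and returns the downloaded data, so repeated lookups of the same citation only cost one call.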
I created a Python module called Justopinion with a few utility functions for getting opinions from the Caselaw Access Project API. It’s mostly designed around the use case of downloading a judicial decision with a known citation, getting the text of the opinions in the case, and then downloading any other decisions cited within those opinions.
Here’s an example from Justopinion’s getting started guide that roughly follows that workflow:
from justopinion import CAPClient
client = CAPClient(api_token=CAP_API_KEY)
thornton = client.read_cite("1 Breese 34", full_case=True)
The text that gets passed to the CAPClient.read_cite method (such as “1 Breese 34”) can be normalized as a recognizable citation thanks to the Eyecite package from the Free Law Project.
thornton.casebody.data.parties[0]
'John Thornton and others, Appellants, v. George Smiley and John Bradshaw, Appellees.'
The case is loaded as a Pydantic model, so any static analysis tools you use on your Python code should understand the data types for each field. The case.law API documentation describes what you should expect the API to deliver.
len(thornton.cites_to)
1
str(thornton.cites_to[0])
'Citation to 15 Ill., 284'
We can see that Thornton v. Smiley cites to only one other case. By passing the citation to the CAPClient.read_cite method, we can download JSON representing the cited decision and turn it into another instance of the Decision class.
cited = client.read_cite(thornton.cites_to[0], full_case=True)
str(cited)
'Marsh v. People, 15 Ill. 284 (1853-12-01)'
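Repeating that step for every cited decision amounts to walking the citation graph. The sketch below shows one way that loop might look, independent of Justopinion: fetch_cites is a hypothetical callable standing in for "download this decision and return the citations it contains", and max_calls is an assumed safeguard so the walk can't run through a whole day's API allowance.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def collect_cited(start_cite: str,
                  fetch_cites: Callable[[str], Iterable[str]],
                  max_calls: int = 10) -> Dict[str, List[str]]:
    """Breadth-first walk over citations, starting from one known citation.

    Returns a dict mapping each visited citation to the citations it contains.
    """
    seen: Dict[str, List[str]] = {}
    queue = deque([start_cite])
    while queue and len(seen) < max_calls:
        cite = queue.popleft()
        if cite in seen:
            continue  # already downloaded, don't spend another call
        cited = list(fetch_cites(cite))  # in real use, one API call per decision
        seen[cite] = cited
        queue.extend(c for c in cited if c not in seen)
    return seen


# A made-up citation graph with placeholder names, standing in for the API.
fake_graph = {"A": ["B", "C"], "B": ["C", "A"], "C": []}
visited = collect_cited("A", fake_graph.__getitem__)
print(list(visited))
```

Because each decision is only fetched once, decisions that cite each other don't cause the walk to loop.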
We can also locate text within an opinion we downloaded, and generate an Anchorpoint selector to reference a passage from the opinion.
thornton.opinions[0].locate_text("The court knows of no power in the administrator")
TextPositionSet{TextPositionSelector[22, 70)}
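The selector above reads as a half-open range of character positions: the quoted phrase is 48 characters long, which matches the width of [22, 70). As a rough illustration of the idea only, and not of how Justopinion or Anchorpoint implement it, here is a plain-Python sketch. The locate function and the sample text are made up for the example.

```python
from typing import Optional, Tuple


def locate(text: str, quote: str) -> Optional[Tuple[int, int]]:
    """Return the half-open [start, end) character range of quote in text."""
    start = text.find(quote)
    if start == -1:
        return None  # the quote does not appear in the text
    return start, start + len(quote)


quote = "The court knows of no power in the administrator"
# Made-up surrounding text, not the real opinion.
sample = "Sample text. " + quote + " More text."

span = locate(sample, quote)
print(span)  # (13, 61)
# Slicing with the range gives back exactly the quoted passage.
print(sample[span[0]:span[1]] == quote)  # True
```

Storing positions this way means a passage can be referenced by two numbers, as long as the underlying text doesn't change.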
Of course, Justopinion isn’t necessary for accessing the Caselaw Access Project API from Python. The API’s documentation gives this example of downloading a case using requests, which is more flexible but may require writing more code in some situations.
import requests

# 'abcd12345' is a placeholder token; substitute your own API key.
response = requests.get(
    'https://api.case.law/v1/cases/435800/?full_case=true',
    headers={'Authorization': 'Token abcd12345'}
)
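As an example of the extra code the requests approach can involve, here is a small sketch that builds the same URL and header from a case ID and a token instead of hard-coding them. The case_request function is a hypothetical helper, and reading the token from an environment variable called CAP_API_KEY is my assumption, not something the API requires. The URL pattern is taken only from the documented example above.

```python
import os
from typing import Dict, Tuple
from urllib.parse import urlencode

API_ROOT = "https://api.case.law/v1"


def case_request(case_id: int, token: str,
                 full_case: bool = True) -> Tuple[str, Dict[str, str]]:
    """Build the URL and headers for downloading one case by its numeric ID."""
    url = f"{API_ROOT}/cases/{case_id}/"
    if full_case:
        url += "?" + urlencode({"full_case": "true"})
    headers = {"Authorization": f"Token {token}"}
    return url, headers


# CAP_API_KEY is an assumed environment variable name; the fallback is the
# placeholder token from the example above.
token = os.environ.get("CAP_API_KEY", "abcd12345")
url, headers = case_request(435800, token)
# With the requests package installed, the download itself would then be:
#     response = requests.get(url, headers=headers)
```

Keeping the token out of the source file also makes it harder to publish it by accident.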
Justopinion originated as part of my other Python library AuthoritySpoke, and as of AuthoritySpoke version 0.8, Justopinion is a dependency that gets imported as part of AuthoritySpoke’s setup process. Justopinion is still in an early state, and there are lots of features that could still be added. I decided to use the generic name Justopinion instead of naming the package after the CAP API because I’m considering also adding support for the CourtListener API, and possibly some use cases that don’t depend on an API. If you have any comments or requests about Justopinion, please post them at its GitHub repo.