Legislice: Exploring the Network of Statute Citations

Now that I’ve released version 0.3 of the Legislice Python package, it’s a good time for me to pause and explain how I envision this tool developing in the future.

The original goal of the Legislice package is to share structured data about passages from legislation that have been interpreted in court opinions. That’s why most of the functions of Legislice were originally part of AuthoritySpoke. Generally, to understand the meaning of legal precedent in a court opinion, you need to be able to describe and understand exactly what legislation the court is interpreting. Legislice also might be useful for pinpointing legislation passages in other contexts, like describing potential amendments to a statute, or analyzing laws in a treatise.

Part of the idea behind Legislice is that in a web application that lets users browse annotated legislative text, the presentation layer of the application will need to help the users find the phrases in the legislation that have been interpreted before, and the phrases from the text will often overlap. That means the text selectors need to be stored separately from the text, because the application won’t be able to put a separate pair of HTML tags around every substring of text linked to an annotation.
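
To make the point about overlapping phrases concrete, here is a minimal sketch of selectors kept apart from the text they select. The Selector class and its methods are hypothetical names invented for this illustration and are not Legislice's own API; the sample text is the opening of the Fourth Amendment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selector:
    """Hypothetical character span, stored apart from the text it selects."""

    start: int
    end: int  # exclusive, like a Python slice

    def select(self, text: str) -> str:
        """Return the substring this selector points to."""
        return text[self.start:self.end]

    def overlaps(self, other: "Selector") -> bool:
        """True when the two spans share at least one character."""
        return self.start < other.end and other.start < self.end


# Sample text only; any passage would do.
TEXT = "The right of the people to be secure in their persons, houses, papers, and effects"

# Two annotated phrases that overlap on "secure in their persons".
first = Selector(TEXT.index("the people"), TEXT.index(", houses"))
second = Selector(TEXT.index("secure"), len(TEXT))

print(first.select(TEXT))
print(second.select(TEXT))
print(first.overlaps(second))
```

Because both selectors are plain offsets into one unmodified string, neither has to be nested inside the other, which is the constraint that paired HTML tags would impose.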

Legislice basically has three parts: a serializer for converting JSON about legislative passages into Python objects and then dumping them back to JSON, a set of methods for comparing the legislative passages to determine whether they contain the same language, and an API client for downloading more legislative provisions.

The JSON schema for sharing this data is somewhat complex, because I wanted to be sure that any citable legislative provision can be used to create a valid Legislice object. So inside a Legislice Enactment representing a statute section, there could be another Legislice Enactment representing a subsection, containing another Legislice Enactment representing a numbered paragraph, and so on. But if an Enactment represents a really large legislative document, like an entire title of the United States Code, it needs to provide links to the provisions it contains instead of incorporating their full text.
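
The nesting described above can be sketched with ordinary Python dictionaries. The field names ("node", "content", "children"), the placeholder citations, and the URL below are assumptions made for illustration and should not be read as the exact Legislice schema. Mixing expanded children and a link in one list is also a simplification for the example.

```python
import json

# Hypothetical, abbreviated data with placeholder text.
SECTION_JSON = """
{
  "node": "/example/s1",
  "content": "",
  "children": [
    {
      "node": "/example/s1/a",
      "content": "Text of subsection (a).",
      "children": [
        {"node": "/example/s1/a/1", "content": "Text of paragraph (1).", "children": []}
      ]
    },
    "https://example.com/api/example/s1/b"
  ]
}
"""


def collect_text(provision):
    """Return (texts, links): nested passage text and links left unexpanded."""
    texts, links = [], []
    if provision["content"]:
        texts.append(provision["content"])
    for child in provision["children"]:
        if isinstance(child, str):
            # A string stands for a link to a provision whose text is not included.
            links.append(child)
        else:
            child_texts, child_links = collect_text(child)
            texts.extend(child_texts)
            links.extend(child_links)
    return texts, links


section = json.loads(SECTION_JSON)
texts, links = collect_text(section)
print(texts)
print(links)
```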

The JSON schema is also unusual in including the concept of a “text version” separate from the text content in effect at a citation at a particular time. This concept should make it easier to implement features that follow a particular provision across time, even if that provision has been moved and renumbered. It has already proven useful in implementing the feature that checks how a statute has been cited.
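
One way to picture a text version kept separate from its citation is sketched below. The TextVersion and Placement classes, their fields, and the sample citations and dates are all hypothetical and chosen only to show the idea; they are not Legislice's data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class TextVersion:
    """Hypothetical record for one distinct wording of a provision."""

    id: int
    content: str


@dataclass(frozen=True)
class Placement:
    """Hypothetical record: a text version in effect at a citation for a period."""

    node: str
    start_date: date
    end_date: Optional[date]  # None means still in effect
    text_version_id: int


def history_of(version: TextVersion, placements: List[Placement]) -> List[Placement]:
    """Every placement of the same wording, oldest first."""
    matches = [p for p in placements if p.text_version_id == version.id]
    return sorted(matches, key=lambda p: p.start_date)


wording = TextVersion(id=1, content="Placeholder text of a provision.")
placements = [
    Placement("/example/s20/a", date(2016, 1, 1), None, text_version_id=1),
    Placement("/example/s10/a", date(2013, 1, 1), date(2016, 1, 1), text_version_id=1),
    Placement("/example/s10/a", date(2016, 1, 1), None, text_version_id=2),
]

# The same wording is followed from its old citation to its new one.
for placement in history_of(wording, placements):
    print(placement.node, placement.start_date)
```

In this sketch the wording with id 1 moves from one citation to another, while the old citation receives different text. Looking up placements by text version rather than by citation is what lets the renumbered provision be followed.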

When you have an Enactment object representing a provision, you can access any of its cross-references to other provisions of the United States Code, and can use Legislice’s download client to download the cited provisions. But there’s also a way to use the download client to fetch any other provisions that cite to a known provision, which lets you follow citations backward instead of forward. Instead of directly giving you a collection of Enactment objects, the download client will give you a list of InboundReferences that describe what Enactments you can download. If the text containing the reference that interests you has been renumbered and enacted in more than one location, the InboundReference will indicate that by including all of those locations. As explained in the documentation, if you use the download client to search for provisions citing to /us/usc/t2/s1301, you’ll find that two of those citing provisions have been renumbered twice.

>>> client.citations_to("/us/usc/t2/s1301")
[InboundReference to /us/usc/t2/s1301, from (/us/usc/t2/s4579/a/4/A 2018-05-09) and 2 other locations,
 InboundReference to /us/usc/t2/s1301, from (/us/usc/t2/s4579/a/5/A 2018-05-09) and 2 other locations,
 InboundReference to /us/usc/t2/s1301, from (/us/usc/t42/s2000ff/2/A/iii 2013-07-18),
 InboundReference to /us/usc/t2/s1301, from (/us/usc/t42/s2000ff/2/B/iii 2013-07-18)]

Once you have that information, you can obtain Enactment objects either by passing InboundReferences to the download client’s read() method, or by passing the memos of Enactment locations found in the InboundReference.locations attribute.
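
The way a single inbound reference can stand for several locations can be sketched without the library. The code below is a stand-in under stated assumptions: CitingLocation, group_by_text, and describe are hypothetical names, the citations are placeholders rather than real USC data, and the summary string only imitates the general shape of the output shown above.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CitingLocation:
    """Hypothetical record: where and when a citing passage was in effect."""

    node: str
    start_date: date
    text_version_id: int


def group_by_text(locations):
    """Group locations that share a text version, newest location first."""
    groups = defaultdict(list)
    for location in locations:
        groups[location.text_version_id].append(location)
    return [
        sorted(group, key=lambda loc: loc.start_date, reverse=True)
        for group in groups.values()
    ]


def describe(target, group):
    """Summarize one group of locations that all cite the target."""
    latest = group[0]
    summary = f"reference to {target}, from ({latest.node} {latest.start_date})"
    others = len(group) - 1
    if others:
        summary += f" and {others} other location{'s' if others > 1 else ''}"
    return summary


locations = [
    CitingLocation("/example/s100/a", date(2013, 1, 1), 1),
    CitingLocation("/example/s200/a", date(2015, 1, 1), 1),
    CitingLocation("/example/s300/a", date(2018, 1, 1), 1),
    CitingLocation("/example/s400/b", date(2013, 1, 1), 2),
]

summaries = [describe("/example/s1", group) for group in group_by_text(locations)]
for summary in summaries:
    print(summary)
```

Grouping by text version is the design choice that matters here: the same citing passage, moved twice, is reported once with its extra locations attached instead of three times.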

A warning: data acquisition is a serious challenge in any law publishing project, including Legislice. The United States Code data available is limited to versions of the USC published since 2013. Also, Legislice is only able to supply information about citations from one provision to another if those references exist as links in the XML versions of the USC published by Congress. Unfortunately, Congress was not always consistent in providing link markup for citations, and Legislice doesn’t have any way to discover where links should have been placed if there was no markup. In that sense, the citation data Legislice is able to draw on should be considered incomplete. Also, Legislice only includes citations from one part of the USC to another, not citations to other legislative codes or to other documents.

So, future features for Legislice could revolve around the ideas of annotation and tracing the movement of identical text across locations and time. That could include a function that takes an Enactment and retrieves other locations where the same text has been enacted, or that returns all versions of the text enacted at the same citation at different times. It could include methods for using Legislice objects to index other kinds of annotations, including AuthoritySpoke content. Or it could include a citation parser that can identify USC provisions using other citation formats. Since it’s not likely that I’ll be able to expand my Legislice API server to supply all the kinds of legislative text that everyone needs, I hope that other legislation APIs will either adopt Legislice’s JSON schema, or converge on a standard that Legislice can adopt. Please feel free to raise GitHub issues or reach out if you have any thoughts about how Legislice can help you in your work.