A Python Package for Legal Case Based Reasoning

In June I was lucky enough to be sent to Braga, Portugal to represent Cornell Legal Information Institute at the 19th ICAIL conference (International Conference on Artificial Intelligence and Law). The core of this conference is an academic community rooted in knowledge-heavy AI approaches, many of them with lineage extending back at least to the 1980s. I’ve been reading work by many of these researchers for many years, so it was great to meet them in person and hear what they’ve been up to lately.

Only a few of the research projects presented had code available to run in Python. Out of those that did, I think the standout was a Legal Case-Based Reasoning system presented by Daphne Odekerken of Utrecht University. The paper was coauthored with Floris Bex and Henry Prakken, and it also cited inspiration from formal logic models of precedent developed by John Horty.

The Python package, which is just named with the initials LCBR, lets users define a legal issue and a list of factors that contribute to determining the issue’s outcome. A factor’s value can be set as a boolean, or as a number in a range. For instance, the demo application is about whether a retail sales website should be investigated as fraudulent. Boolean factors include whether the site has a terms and conditions page, and whether it has a non-functioning payment link. Numeric factors include the number of days the page has been online. If I understand right, setting up a user application with the package requires identifying the full list of factors that can be used to decide the legal issue, and also requires labeling each of the factors as “pro” or “contra”. For instance, the number of days online would be a “contra” factor, where the website becomes less suspicious the longer it’s been online. I don’t think there’s any way to say that a “pro” factor can become a “contra” factor in the presence of specific other factors, so the system would only work when every factor has a fixed polarity that never changes.
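To make the idea of fixed-polarity factors concrete, here is a minimal sketch in plain Python. It does not use LCBR's own classes; the names `Polarity`, `Factor`, and `at_least_as_pro` are hypothetical, and the polarities I gave the two boolean factors are my own guesses for illustration (only the "contra" label for days online comes from the description above).

```python
from dataclasses import dataclass
from enum import Enum


class Polarity(Enum):
    PRO = "pro"        # higher values favor opening an investigation
    CONTRA = "contra"  # higher values favor not investigating


@dataclass(frozen=True)
class Factor:
    """Hypothetical factor definition; not the LCBR package's API."""
    name: str
    polarity: Polarity
    low: float = 0   # boolean factors are treated as the range 0..1
    high: float = 1

    def at_least_as_pro(self, a, b) -> bool:
        """True if value a supports the 'pro' outcome at least as much as b."""
        if self.polarity is Polarity.PRO:
            return a >= b
        return a <= b


# The complete factor list has to be declared up front, each with one polarity.
FACTORS = {
    "broken_payment_link": Factor("broken_payment_link", Polarity.PRO),
    "has_terms_page": Factor("has_terms_page", Polarity.CONTRA),  # assumed polarity
    "days_online": Factor("days_online", Polarity.CONTRA, 0, 3650),  # assumed range
}

# A site online for 5 days is at least as suspicious as one online for 30 days.
print(FACTORS["days_online"].at_least_as_pro(5, 30))
```

Because polarity is a property of the factor itself, a model like this has no way to express that a factor switches sides when some other factor is present, which is the limitation mentioned above.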

One great feature of LCBR is that it might not require you to have all the possible information about a particular case before you can get an answer about the outcome. If there’s no possible factor that will distinguish your case from other cases that reached a certain outcome, then the system can reach a stable conclusion that the same outcome will be reached even if new information is added later. For example, if there’s no way the facts of the current case can turn out to be more favorable to the defendant on any dimension compared to the facts of a prior case that reached a decision against a defendant, then the system reasons that the current case will go against the defendant as well. As that example suggests, LCBR does assume that all the cases in the database are consistent with one another. If not enough factors have been determined to reach a stable conclusion, LCBR also has an algorithm to determine which other factors are most relevant to decide the issue. This seems similar to the function that Docassemble uses to determine which question to present to the user next during a guided interview.
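Here is a rough sketch of how that kind of stability check could work, based only on my reading of the idea and not on LCBR's actual code. Every name in it (`FACTORS`, `forced_by`, `stable_outcome`, `open_questions`) is hypothetical, and the factor polarities and ranges are assumptions. For each undetermined factor it plugs in the value least favorable to a precedent's outcome; if the current case still matches or beats the precedent on every factor, no later information can change the result. The `open_questions` helper is a much cruder stand-in for a relevance algorithm: it just lists the undetermined factors that are still blocking some precedent from applying.

```python
PRO, CONTRA = 1, -1

# Hypothetical factor table: name -> (polarity, minimum value, maximum value)
FACTORS = {
    "broken_payment_link": (PRO, 0, 1),
    "has_terms_page": (CONTRA, 0, 1),
    "days_online": (CONTRA, 0, 3650),
}


def strength(name, value, outcome):
    """Signed score: a larger result means more support for `outcome`."""
    polarity, _, _ = FACTORS[name]
    return value * polarity * outcome


def worst_case(name, outcome):
    """The value of an undetermined factor that is least favorable to `outcome`."""
    polarity, low, high = FACTORS[name]
    return low if polarity == outcome else high


def forced_by(precedent_facts, outcome, known):
    """True if the precedent forces `outcome` however the unknowns turn out."""
    for name in FACTORS:
        value = known.get(name, worst_case(name, outcome))
        if strength(name, value, outcome) < strength(name, precedent_facts[name], outcome):
            return False
    return True


def stable_outcome(known, case_base):
    """Return an outcome that cannot change, or None if the case is still open."""
    for facts, outcome in case_base:
        if forced_by(facts, outcome, known):
            return outcome
    return None


def open_questions(known, case_base):
    """Undetermined factors that still block some otherwise-matching precedent."""
    blocking = set()
    for facts, outcome in case_base:
        def meets(name, value):
            return strength(name, value, outcome) >= strength(name, facts[name], outcome)
        if all(meets(n, v) for n, v in known.items()):
            blocking |= {n for n in FACTORS
                         if n not in known and not meets(n, worst_case(n, outcome))}
    return sorted(blocking)


# One precedent: a 30-day-old site with a terms page and a broken payment link
# was investigated. The case base is assumed to be consistent.
CASE_BASE = [({"broken_payment_link": 1, "has_terms_page": 1, "days_online": 30}, PRO)]

partial = {"broken_payment_link": 1, "days_online": 10}  # terms page not yet known
print(stable_outcome(partial, CASE_BASE))

sparse = {"days_online": 10}  # payment link and terms page not yet known
print(stable_outcome(sparse, CASE_BASE), open_questions(sparse, CASE_BASE))
```

In the first example the missing terms-page answer cannot help the website, because the precedent already had a terms page and was investigated anyway. In the second, the payment link is the only undetermined factor that could still settle the matter.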

The LCBR package has a UI that can be spun up using Flask and Dash, but I wouldn’t exactly call it user-friendly because it seems to assume the user has a lot of knowledge about the theoretical concepts that inform the case-based reasoning algorithm. However, a version of the same tool was used to create a user-facing intake system for the National Police AI Lab of the Netherlands.

I’d love to see a system like LCBR expanded beyond its current limitations. When creating a model of a legal issue, I’d prefer not to have to list all factors that can bear on a determination in advance, because sometimes newly-added cases will also identify new factors. I’d also prefer not to have to identify the polarity of every factor in advance. Even the paper introducing LCBR concedes that the assumption of a consistent case base is “quite a strong assumption.” Instead of making that assumption, I’d like to use a system that can provide rules of precedent that start with an inconsistent set of cases, and then show how certain cases can be overruled or disregarded until all the cases that remain are consistent. (But when I suggest features like that, I’m thinking of what would be useful for simulating common law jurisprudence, which probably wouldn’t meet the needs of government agencies in the Netherlands.) And finally, instead of having models of “cases” covering only one legal issue, it would be nice to model a collection of rules showing ways to reach multiple legal conclusions, some of which are factors potentially supporting further conclusions. That would allow the system to evolve beyond simple one-step legal determinations such as whether to open a fraud investigation, so that the system could model more complex processes such as litigation.
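As a first step toward handling an inconsistent case base, a system would at least need to find the conflicts. Here is a small sketch of how I imagine that check, again in plain Python with hypothetical names (`POLARITY`, `at_least_as_strong`, `conflicts`) and assumed factor polarities, not anything taken from LCBR. Two cases conflict when one was decided the opposite way even though its facts support the other case's outcome at least as strongly on every factor.

```python
PRO, CONTRA = 1, -1

# Hypothetical polarities for the fraud-investigation factors.
POLARITY = {
    "broken_payment_link": PRO,
    "has_terms_page": CONTRA,
    "days_online": CONTRA,
}


def at_least_as_strong(facts_a, facts_b, outcome):
    """True if facts_a support `outcome` at least as much as facts_b on every factor."""
    return all(
        facts_a[name] * polarity * outcome >= facts_b[name] * polarity * outcome
        for name, polarity in POLARITY.items()
    )


def conflicts(case_base):
    """Return index pairs (i, j), i < j, of cases that cannot both be followed."""
    found = []
    for i, (facts_i, outcome_i) in enumerate(case_base):
        for j in range(i + 1, len(case_base)):
            facts_j, outcome_j = case_base[j]
            if outcome_i != outcome_j and at_least_as_strong(facts_j, facts_i, outcome_i):
                found.append((i, j))
    return found


CASES = [
    ({"broken_payment_link": 1, "has_terms_page": 1, "days_online": 30}, PRO),
    # More suspicious than case 0 on every factor, yet not investigated.
    ({"broken_payment_link": 1, "has_terms_page": 0, "days_online": 10}, CONTRA),
    ({"broken_payment_link": 0, "has_terms_page": 1, "days_online": 400}, CONTRA),
]

print(conflicts(CASES))
```

Deciding which case in each conflicting pair should be overruled or disregarded is the hard part, and this sketch deliberately leaves that open.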