As of late 2020, there are numerous software tools for formalizing and automating legal rules, but there’s no standardized or widely accepted way to compare their capabilities. It’s one thing to see how a legal automation tool works on a problem set devised by the tool’s author, but that doesn’t provide strong evidence that the tool can model a significant share of the legal concepts found in real sources of law like contracts, legislation, and court decisions.

If legal automation tools were tested against a standard problem set that was representative of a reasonable variety of legal rules, it would become much easier to understand how broadly useful each tool is, and it might even reveal that there are fundamental problems in legal rule formalization that haven’t been solved by anybody. There wouldn’t be any guarantee that a standard problem set covered every structure of rule that could be found in a real legal system, but it would at least function as a sanity check to indicate that the tool is powerful enough to model some laws found in real life.

A Standard Dataset for Statute Automation

My favorite candidate for such a standard problem set is the Beard Tax Act test document from the New Zealand Government Service Innovation Lab. This document was apparently created as part of New Zealand’s Better Rules initiative, which I believe has now wound down. The Beard Tax Act document itself is a PDF, which the Innovation Lab started to use for a sort of bake-off of competing legal rule automation methods, at the same time as they moved forward with a larger “Rules as Code” law formalization effort using the OpenFisca Python framework. The Beard Tax Act document doesn’t name its author, and I wasn’t able to verify who wrote it, but it may have been the work of Merrin Macleod, who uploaded it to GitHub.

I think the Beard Tax Act does a great job of presenting, in a relatively brief document, a good sample of legal rule structures that would be found in real legislation. This is an example of fake data that’s better for testing than real data, because the corresponding real provisions would usually be more verbose and farther apart. The Beard Tax Act includes ambiguous language, complex delegations of administrative authority, and provisions with implied meaning that deviates from their literal meaning. All of these features are realistic. The Beard Tax Act also includes a good variety of realistic formatting quirks, such as unintuitive part and section numbering.

But while the Beard Tax Act makes great test data, the Government Service Innovation Lab provided very little documentation about exactly what functionality needs to be implemented for this data. And I don’t think it would be obvious to most people how to pinpoint all the usable legal rules found in this text. The rest of this blog post will try to fill in that missing information.

A Rubric for Implementations of the Beard Tax Act

I’m going to try to describe exactly what information would need to be available to a legal automation tool for it to fully implement the meaning of each provision of the Beard Tax Act. This could be considered a specification that legal rule data schemas and their corresponding automation tools should try to meet, or maybe just a rubric that they should try to score as high on as possible. In imagining how to model these rules, keep in mind that legal authority is a system that is always growing but never complete, so the ability to exclude unavailable information is almost as critical as the ability to include known information. Also, an important concept underlying this rubric is that “obligations” are not very significant by themselves unless specific performance of the obligations is legally enforceable. Otherwise, what matters is a description of the legal consequences of complying or not complying with the obligation.

The Beard Tax Act is about 1,200 words long, so instead of including it all in this blog post, I’ll suggest you follow along in the PDF version. (And star the repository!)

Section 1, “Short title”

  • The name of the Act should be linked to the entire Act so that it’s discoverable from any section of the Act.
  • The name of the Act should be discoverable from any codified section of the Act, even if the sections are codified in different volumes of the legislative code (for instance, if section 13 is placed in the tax code).
  • The name of the Act should still be discoverable from codified sections of the Act, even if the codified provisions are subsequently amended by other Acts, as long as the codified provisions continue to contain some text from the Beard Tax Act.

Section 2, “Commencement”

  • The commencement date (which is probably the same as the operative date) should be linked to every other section of the Act.
  • The model should allow the operative date to differ from the date the provision was signed into law.
  • The model should allow the operative date to differ from the effective date.
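
One way to keep these dates apart is to store them as three independent fields on each provision. Here’s a minimal sketch in Python. The class name `ProvisionDates`, its field names, and the example dates are all hypothetical; none of them come from the Act or from any existing tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ProvisionDates:
    """Dates attached to one provision (hypothetical structure)."""
    signed: Optional[date] = None     # date the provision was signed into law
    effective: Optional[date] = None  # effective date
    operative: Optional[date] = None  # operative (commencement) date

    def is_operative_on(self, day: date) -> Optional[bool]:
        """Return None, rather than guessing, when the operative date is unknown."""
        if self.operative is None:
            return None
        return day >= self.operative


# Placeholder dates, invented for illustration only.
example = ProvisionDates(
    signed=date(2020, 1, 15),
    effective=date(2020, 2, 1),
    operative=date(2020, 4, 1),
)
```

Because every field is optional, a provision whose dates haven’t been established yet can still be loaded, and a query about it returns `None` instead of a wrong answer.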

Section 3, “Purpose”

  • This legislative purpose should be accessible from every other section of the Act.

Section 4, “Beard, defined”

Out of the 13 sections of the Beard Tax Act, the readme file provided by the NZ Government Service Innovation Lab describes the desired implementation only for the definition in Section 4. The readme suggests a decision tree as an interpretation of that definition.

  • The implementation of this definition should match the decision tree.

Section 5, “Prohibition of beards”

  • The exception in Section 6 should be included in the calculation of whether the prohibition has been violated (the calculation should be able to access the relevant rules outside Section 5).
  • This prohibition should be referenced in determining whether a criminal offense has occurred in Section 7.
  • The geographic scope of the prohibition should be included in the model.

Section 6, “Exemption”

  • The definition of the exception to the beard prohibition should reference the Department’s administrative power to create exemptions.
  • The rule model should be able to specify the time period of an administrative exemption.
  • The rule model should be capable of expressing whether the 12-month limitation on the exemption power prevents a defendant from relying on exemptions granted for longer periods.
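
The last two points could be sketched like this, assuming an exemption is recorded as a first and last covered day. Everything here (`Exemption`, `can_rely_on`, the `overlong_is_valid` parameter) is a hypothetical name of my own. Whether an overlong exemption still protects a defendant is passed in as a parameter, because the rubric only asks that the model be able to express an answer, not that it pick one.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


def one_year_after(start: date) -> date:
    """Same calendar day one year later (29 February rolls over to 1 March)."""
    try:
        return start.replace(year=start.year + 1)
    except ValueError:
        return start.replace(year=start.year + 1, month=3, day=1)


@dataclass(frozen=True)
class Exemption:
    start: date  # first day covered
    end: date    # last day covered

    def covers(self, day: date) -> bool:
        return self.start <= day <= self.end

    def exceeds_12_months(self) -> bool:
        return self.end >= one_year_after(self.start)


def can_rely_on(exemption: Exemption, day: date,
                overlong_is_valid: Optional[bool]) -> Optional[bool]:
    """Can a defendant rely on this exemption on the given day?

    overlong_is_valid is the model's position on exemptions granted for
    more than 12 months: True, False, or None if that is unresolved.
    """
    if not exemption.covers(day):
        return False
    if not exemption.exceeds_12_months():
        return True
    return overlong_is_valid


# Placeholder dates, invented for illustration only.
within_limit = Exemption(date(2021, 1, 1), date(2021, 12, 31))
overlong = Exemption(date(2021, 1, 1), date(2022, 6, 30))
```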

Section 6A, “Levy of beard tax”

  • The obligation to pay a beard tax should be linked to the procedural consequences of nonpayment of the beard tax.

Section 6B, “Regulatory power of the Minister for Beards”

  • The rule model should specify that it is discretionary, not obligatory, that the Minister “may” impose levies.
  • The rule model should be able to specify how the Minister’s exercise of discretion to impose a levy impacts the obligation to pay the levy or the procedural consequences of nonpayment.
  • The rule model should be able to assign procedural significance (if any) to the “good governance and financial stability” standard.
  • The rule model should be able to describe the vague authority of the Minister to create regulations “not limited to” beard levies for “good governance and financial stability”.

Section 6C, “Issuance of beardcoin”

  • The rule model should specify that it is obligatory, not discretionary, that the Department “shall” issue a beardcoin.
  • The rule model should specify the beardcoin’s role as evidence used to establish a fact in litigation.

Section 6D, “Waiver of beard tax in special circumstances”

  • The rule model should describe the power of the Department of Beards, as an administrative agency, to establish a fact in litigation (as to whether a beard is worn due to bona fide religious or cultural reasons).
  • The rule model should describe the concept that a finding of fact is “final and no right of appeal shall exist”. That wording raises the question of whether the barred “appeal” is an administrative appeal to a different agency, or a court action, or both.

Section 7, “Wearing of a beard without exemption”

  • The rule model’s concept of this “offense” should be linked both to the factual predicates in sections 5 and 6, and to the remedy/penalty provisions in sections 8 and 9.

Section 7A, “Improper transfer of beardcoin”

  • The rule should have a way to refer to “Part 4” as the place where defenses to the charged crime can be found. “Part 4” could be thought of as a container that currently includes sections 10, 11, and 12, but other sections might be added to Part 4 in the future.
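
A rough sketch of the “container” idea, assuming the rule in 7A stores only the label “Part 4” and the list of sections is looked up when the rule is evaluated. The `Part` and `Act` classes and the later-added section “12A” are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Part:
    label: str
    section_ids: List[str] = field(default_factory=list)


@dataclass
class Act:
    parts: Dict[str, Part] = field(default_factory=dict)

    def sections_in(self, part_label: str) -> List[str]:
        """Resolve a reference to a Part at lookup time, not when the rule is written."""
        return list(self.parts[part_label].section_ids)


act = Act(parts={"Part 4": Part("Part 4", ["10", "11", "12"])})

# Section 7A keeps only the reference, never a copy of the section list.
defense_source = "Part 4"
before = act.sections_in(defense_source)

# A later amendment adds a hypothetical section 12A to Part 4.
act.parts["Part 4"].section_ids.append("12A")
after = act.sections_in(defense_source)
```

Since 7A never copied the list, the new section shows up as a possible source of defenses without any change to the rule for 7A.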

Section 7B, “Counterfeit beardcoin”

  • The rule model should be able to express the idea that 7B expands the application of the offense in 7A (even though 7B purports to be about a defense to the offense in 7A). While 7A prohibits transferring “a beardcoin”, 7B adds a prohibition against transferring “a counterfeit beardcoin”.

Section 8, “Notice to remedy”

  • The rule model should be able to take a position on whether the issuance of a notice to remedy is a precondition to a conviction of the offense of unlawfully wearing a beard.
  • The rule model should be able to take a position on whether the defendant’s noncompliance with the notice to remedy is a precondition to a conviction (or whether noncompliance is only a precondition to a penalty for the conviction, under Section 9(3)).
  • The rule model should be able to avoid taking a position on the two questions above, if those issues haven’t been definitively decided yet.
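
To make the third point concrete, the model’s position on each open question can be a three-valued setting rather than a boolean. This is only a sketch under my own assumptions: `Position` and `conviction_available` are hypothetical names, and the function covers just the first question (whether the notice must have been issued).

```python
from enum import Enum
from typing import Optional


class Position(Enum):
    REQUIRED = "required"
    NOT_REQUIRED = "not required"
    UNDECIDED = "undecided"


def conviction_available(elements_proven: bool, notice_issued: bool,
                         notice_precondition: Position) -> Optional[bool]:
    """Return True or False when the answer is settled, None when it depends
    on a question the model has declined to decide."""
    if not elements_proven:
        return False
    if notice_issued or notice_precondition is Position.NOT_REQUIRED:
        return True
    if notice_precondition is Position.REQUIRED:
        return False
    return None  # no notice was issued, and its legal effect is undecided
```

Note that the undecided position only matters when it would change the outcome. If a notice was issued, the function returns `True` no matter which position the model takes.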

Section 9, “Penalties”

  • The rule model should indicate a concept of “penalties” as a subset of “remedies”, representing acts that a court can take because it has found that the elements of a cause of action were established.
  • The rule model should be able to handle facts that aren’t needed to support a conviction, but that do determine aspects of the penalty for the conviction (in this case, whether the defendant has been convicted of a prior offense).
  • The rule model should describe the judge’s discretion to impose a fine, imprisonment, or both.
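
Here’s a small sketch of how a sentencing fact can be kept separate from the elements of the offense. The figures are placeholders I made up, not the amounts in the Act, and `PenaltyRange`, `penalty_range`, and `within_discretion` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PenaltyRange:
    max_fine: int           # in currency units
    max_prison_months: int


# Placeholder figures, invented for illustration; not the Act's amounts.
FIRST_OFFENSE = PenaltyRange(max_fine=100, max_prison_months=1)
REPEAT_OFFENSE = PenaltyRange(max_fine=500, max_prison_months=6)


def penalty_range(has_prior_conviction: bool) -> PenaltyRange:
    """A sentencing fact: it selects the range but is not an element of the offense."""
    return REPEAT_OFFENSE if has_prior_conviction else FIRST_OFFENSE


def within_discretion(fine: int, prison_months: int, allowed: PenaltyRange) -> bool:
    """Check a proposed fine, term of imprisonment, or both against the maximums.

    Only upper bounds are checked; whether a sentence of neither is allowed
    is left open here.
    """
    return (0 <= fine <= allowed.max_fine
            and 0 <= prison_months <= allowed.max_prison_months)
```

The prior conviction never appears in the test for whether the offense occurred. It’s consulted only after a conviction, when the court chooses among the penalties available to it.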

Section 10, “Purpose of Part 4”

  • This “purpose” provision should be linked to Part 4, so that the affected sections are a subset of the sections affected by the purpose provision in Section 3.

Section 11, “Licensed repurchasers of beardcoin”

  • This section again describes how the Department of Beards can grant a legal privilege that can be used as a defense to a criminal prosecution, but unlike Section 6C, the model will have to capture the fact that the person who has received a privilege from the Department is not the same person who is in jeopardy of prosecution.

Section 12, “Rate to be paid to repurchasers of beardcoin”

  • The rule model should be compatible with a description of the method for enforcing the payments from the Department (if any), even though no such description is given here, so that the description could be added later without needing to refactor the existing rule model.

Section 13, “Removal of GST from razors and shavers”

  • The rule model should make it possible for a section of this Act to modify the application of a provision of a different Act that was enacted at a different time.
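
One possible shape for this, assuming the modifying provision is stored as a dated override that points at a rule in another Act. The `Override` class, the `gst_rate` function, the target label, the rates, and the dates are all hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Override:
    source: str      # provision doing the modifying
    target: str      # provision being modified, in a different Act
    goods: str       # category of goods the override applies to
    operative: date  # first day the override applies
    rate: float


def gst_rate(goods: str, day: date, base_rate: float,
             overrides: List[Override]) -> float:
    """Return the base rate unless a later-enacted override applies on that day."""
    applicable = [o for o in overrides if o.goods == goods and o.operative <= day]
    if not applicable:
        return base_rate
    # If several overrides apply, the most recently operative one wins.
    return max(applicable, key=lambda o: o.operative).rate


# Placeholder values, invented for illustration only.
overrides = [
    Override(
        source="Beard Tax Act, section 13",
        target="hypothetical GST provision",
        goods="razors and shavers",
        operative=date(2021, 1, 1),
        rate=0.0,
    )
]
```

The rule in the other Act is never edited. The override keeps its own source and operative date, so it’s still possible to ask what the rate was before section 13 applied.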

Conclusion and Challenge

That’s 36 bullet points. So if we turned this rubric into a 36-point grading scale, are there any existing computational law systems that could score well on it? Do you have any suggestions for correcting or improving this rubric? Share your ideas on Twitter.


For another viewpoint on the requirements for a good legal rule model, check out this paper by Thomas Gordon et al.: “Rules and Norms: Requirements for Rule Interchange Languages in the Legal Domain”. It’s more abstract and it doesn’t specify how a rule model should perform on a particular dataset, but it’s more formal and it evaluates several data models that existed when the article was written in 2009.