Catala – Law to Code (catala-lang.org)
78 points by Grognak 11 hours ago | 45 comments




Obviously it would be great if this caught on, but it's not even widely understood/agreed on that read-time precision is a desirable quality in a legal system. This is something almost everyone here takes for granted; we want the interpreter or machine to give the same result for the same input. We want that property so we can know the run-time behavior during development.

There are judges and politicians in the US that advocate for various "interpretations" of laws including parts of the constitution, which are different from what the law literally says. In fact they refer to the literal meaning as the "literal interpretation", implying it is one of many valid interpretations, and casting doubt on the idea of language having a precise meaning. The crowd here knows that it is totally possible and often invaluable to work in languages with precise meaning. Anyways, in practice this means: all the steps happened for the law to get passed by the legislature including arguing about the exact text, and instead of enforcing it as written, the judiciary enforces some slightly different but similar law.

A technology like this necessarily concentrates power in the legislature, and takes it away from the judicial system. It concentrates legal power at write time and removes it from run/read time.


Catala is specifically for tax codes and other laws that involve formulas and calculations, not all laws, so I don’t think most of your concerns apply to it specifically. There are often complicated rules governing how, e.g. benefits or tax credits are calculated that natural language is clumsy at expressing, so having a formal language that encodes that logic seems useful.

I agree government/justice by algorithm would be very dangerous, but Catala does not seem to be that.


It’s also the case that the massive set of constantly evolving case law is akin to the most convoluted and buggy “libc” ever implemented, running on a system where random bit flips occur frequently. Any lawyer who says they know how to definitively encode an assumption is inherently making a probabilistic statement colored by their own experience and definitionally limited exposure to case law - it may be near perfect, but it exists in an imperfect runtime environment.

This doesn’t mean that this isn’t a useful tool as an aid for interpretability. And perhaps we can reach a point where ambiguity in case law can “propagate” through a graph of nodes to give a range of answers to any question about a regulation - perhaps with the aid of LLMs. But until we have such a system, it can be dangerous to draw conclusions from systems like this one.

(Not a lawyer, this is not legal advice.)
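A toy version of the "range of answers" idea, for what it's worth. This is only a sketch: it skips the graph and the LLM entirely and just brute-forces every combination of competing readings. The terms, thresholds and the `benefit` rule are all made up for illustration.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple


def answer_range(
    readings: Dict[str, List[int]],
    rule: Callable[[Dict[str, int]], int],
) -> Tuple[int, int]:
    """Evaluate `rule` under every combination of readings; return (min, max)."""
    names = list(readings)
    outcomes = [
        rule(dict(zip(names, combo)))
        for combo in product(*(readings[n] for n in names))
    ]
    return min(outcomes), max(outcomes)


# Hypothetical: two terms that case law has read in more than one way.
readings = {
    "income_threshold": [1000, 1200],  # two competing interpretations
    "rate_percent": [10, 15],
}


def benefit(r: Dict[str, int]) -> int:
    income = 1100  # hypothetical claimant
    if income < r["income_threshold"]:
        return r["rate_percent"] * income // 100
    return 0
```

Calling `answer_range(readings, benefit)` gives the lowest and highest benefit across all readings, which is the kind of spread a single "definitive" encoding would hide.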


> "interpretations"... which are different from what the law literally says.

We have to remember that the letter and spirit of the law can grow apart over time, and loopholes are often gamed before that naturally happens anyway. So obviously we still need judges to keep the "spiritual" aspect of intent alive, so that evil isn't laundered through technicality.

"Literal" should really be a concrete thing, but it does feel strangely connected to a problem that has existed since Sola Scriptura, up to Gödel's theorem. I think about this everytime software and law collide. That article on "what color are your bits"[1] also comes to mind.

[1]: https://ansuz.sooke.bc.ca/entry/23


Curiously enough, law in the US, which inherits from Common Law, is heavily focused on the interpretation (the case law) in the courts of the written law, as opposed to the written words themselves. This is in contrast with civil law, Napoleonic law and Japanese law, which place greater importance on the written words themselves.

I have a potentially more optimistic (and simultaneously more pessimistic!) view to offer.

Some differing interpretations of the law distinguish between the lawmakers' intention vs the literal meaning (and keep in mind that language itself changes a lot in just a few centuries). The hard problem is that, in PL terms, the law is written in syntax without agreed-upon semantics. So a decent step could be just using some agreed-upon semantics, like we do in code! Then at least "interpreting" it would be unambiguous.

Maybe a decent analogy would be gcc vs clang might produce different programs for certain undefined behavior, and different combinations of pieces might lead to different behavior too (like race conditions), and somebody (the plaintiff/user) is asking you (the judge/compiler) to decide what's going to happen in this next loop/program/whatever.

Or maybe a decent analogy would be getting a ticket that the API is erroring in some rare user's case and having to look into the code and stacktrace to realize it's some weird unanticipated interaction between two different pieces of legacy code (150 year old law) that now interact due to a recent merge (a new law from last year), and now it's crashing, so we have to figure out how to interpret/compile/resolve this user's case.

If law was usable like code, we'd never have any of those issues, just like we never have those issues with actual literal programs. And when we do, it's just because we're using the wrong language/aren't encoding enough things in the types and semantics/shouldn't have used this niche compiler so now let's get a new interpretation from another Supreme Compiler/etc. Life would be easier /s

So it's maybe more optimistic than you, in that the run/read time power (judicial) doesn't get diminished, but more pessimistic in that I believe it because I believe that changing the language from English law jargon to some formal language doesn't actually eliminate the issues it might be intended to eliminate.


How strange to give it the same name as an unrelated natural language spoken by millions of people: https://en.wikipedia.org/wiki/Catala

Not even unrelated, Catala (the law-language) seems to be a French project, supported by institutions in France, and Catalan seems to have an intertwined history with France: https://en.wikipedia.org/wiki/Catalan_language#France

Funnily enough, the relation comes from a French jurist's last name

from their repo: https://github.com/CatalaLang/catala

> The language is named after Pierre Catala, a professor of law who pionneered the French legaltech by creating a computer database of law cases, Juris-Data.


Name that is almost certainly related to Catalonia as well.

Catala != Catalan

Try clicking my "Catala" link and looking at the first sentence. The Catalan word for Catalan is català.

Well, not in Catalan… (It is "català")

We really need this in India. There are 53 million cases which are pending in courts, with over 180k cases open for more than 30 years (see https://en.wikipedia.org/wiki/Pendency_of_court_cases_in_Ind...). It is estimated that more than 300 years will be taken to dispose of all cases.

If law code is a repository:

1. Each trial should be encoded into a law.
2. If the trial is already sufficiently covered by the codebase, and both parties agree to the result, then the case is solved.
3. If not, the new judgement leads to a "pull request" into the codebase.
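Steps 2 and 3 could look roughly like this in Python. Purely a sketch: `Rule`, `PullRequest` and `resolve` are names invented here, and a real system would need far more than a first-match lookup.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Union


@dataclass(frozen=True)
class Rule:
    name: str
    # Returns an outcome, or None if the rule says nothing about the case.
    decide: Callable[[dict], Optional[str]]


@dataclass(frozen=True)
class PullRequest:
    title: str
    case: dict


def resolve(case: dict, codebase: List[Rule], parties_agree: bool) -> Union[str, PullRequest]:
    """Settle the case from existing rules if possible, else open a 'pull request'."""
    for rule in codebase:
        outcome = rule.decide(case)
        if outcome is not None and parties_agree:
            return outcome  # step 2: covered by the codebase and accepted
    # Step 3: not covered (or contested), so a new judgement is needed.
    return PullRequest(title="New judgement needed", case=case)
```

Note that the hard part, deciding whether a case is "sufficiently covered", is hidden inside each `decide` function.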


Can't wait for lawmakers to get a red CI before merging ;)

Jokes aside, I'm trying to imagine what a pull request workflow would be for law making.

For example, there might be a test that checks that a law has adequate budget before applying it; or to get an error if it conflicts with another law.

Also (Italian here), I would be very happy to do "git blame" and discover who actually introduced a modification.
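The budget and conflict checks could be sketched like this in Python. Everything here is hypothetical (the `Law` record, the cost figures, the two benefit rules), and the conflict check only samples test cases, so a green run proves nothing about cases nobody thought to include.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Case = Dict[str, int]


@dataclass(frozen=True)
class Law:
    name: str
    annual_cost: int  # hypothetical budget figure attached to the bill
    rule: Callable[[Case], Optional[int]]  # an amount, or None if not applicable


def check_budget(laws: List[Law], available: int) -> bool:
    """'CI' check 1: the total cost of all laws must fit the available budget."""
    return sum(law.annual_cost for law in laws) <= available


def find_conflicts(laws: List[Law], cases: List[Case]) -> List[Tuple[str, str, Case]]:
    """'CI' check 2: report pairs of laws that disagree on a sample case."""
    conflicts = []
    for case in cases:
        results = [(law.name, law.rule(case)) for law in laws]
        applicable = [(n, r) for n, r in results if r is not None]
        for i, (name_a, res_a) in enumerate(applicable):
            for name_b, res_b in applicable[i + 1:]:
                if res_a != res_b:
                    conflicts.append((name_a, name_b, case))
    return conflicts


# Two made-up benefit rules that overlap for incomes from 1000 up to 1499.
old_law = Law("old", 100, lambda c: 200 if c["income"] < 1500 else None)
new_law = Law("new", 50, lambda c: 300 if c["income"] >= 1000 else None)
```

Running `find_conflicts([old_law, new_law], ...)` with an income of 1200 in the samples flags the overlap, which would be the "red CI" before merging.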


Serious question:

Is there any money to be made with this yet? (Jobs, Contracting, Projects, etc)

If not: What's the plan to get this to be used?


Past discussions:

1. (https://news.ycombinator.com/item?id=27059899) - May 2021 (126 comments)
2. (https://news.ycombinator.com/item?id=28633122) - Sept 2021 (40 comments)
3. (https://news.ycombinator.com/item?id=37546874) - Sept 2023 (277 comments)


Catala is a fantastic project and a real attempt to bring computer science and law together. Which is not easy! That said, for practical projects in legaltech a modern pure Prolog system has a lot of useful properties. A project that attempts to use Scryer Prolog for this is VATmiral.

I suspect this works better outside of common law legislation.

I wonder what people who speak the actual Catala have to say about this semantic appropriation. It would be very easy for creators just to google it first.

Huh I just finished a book by Jaron Lanier that described a hypothetical system literally just like this. Always fun to get a coincidence like this

Reminds me of smart contracts

> The aim is not to formalise or put into code all the law, because that would make no sense, but we are interested in the law that is already executed automatically, such as the calculation of social benefits, tax or unemployment.

Can anyone explain why it's believed this "would make no sense"?


Law isn't written to cover 100% of real-life scenarios and potential cases. It's written with deliberate ambiguity in places, and it is ultimately up to the courts to set precedents for those in various situations and contexts.

I think the idea is that you can't really cover 100% of real-life cases in "code", either legal or software, so the areas you'll leave this out of would be those "not-entirely-strict" parts.


The same can be said about driving but self-driving cars exist.

So is the "bitter lesson" that fuzzy overlords will be practically preferable to hand coded legislation?

It does stand to reason that all law could be formalized. For example, consider the definition of murder in the first degree from 18 USC § 1111:

"Murder is the unlawful killing of a human being with malice aforethought."

You might say, well, "unlawful" and "malice" are fuzzy concepts; but we can take them to be facts that we input into the model. I guess we could write something like this in Catala:

    scope Murder :
      definition in_the_1st_degree
        under condition is_malice_aforethought and is_unlawful consequence
      equals
        true

In the calculation of social benefits and taxes, the facts input to the model are generally things like prices, depreciations, costs, areas of offices, percentages and so on, input numerically and sworn to be true. These numbers are then used to calculate an amount due (or in arrears). Performing the calculation in a way that is verified to conform to the law is a big part of the work.

However, in other areas of law, determining the facts is actually where the real work is -- was there malice aforethought? A formalized legal machine could process these facts but it's not a big help. The models would just be a huge list of assumptions that have to be input and a minimal calculation that produces `true` or one of the alternatives of an enum.
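To make that concrete, here is the same rule as a rough Python sketch. The `MurderFacts` record and its field names are invented for illustration; the only "logic" left is a conjunction over facts that a court has to establish first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MurderFacts:
    # Hypothetical inputs: each is a finding a court must make, not something computed.
    is_unlawful: bool
    is_malice_aforethought: bool


def in_the_1st_degree(facts: MurderFacts) -> bool:
    """Same condition as the Catala sketch above: both facts must hold."""
    return facts.is_malice_aforethought and facts.is_unlawful
```

All the real work happens before `MurderFacts` can be filled in.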


A computer program takes digital bytes, runs some discrete logic on them, outputs some more bytes. Laws take messy real world stuff, run some subjective decision tree on them, and output some messy real world actions to take. If you model the latter with the former you end up 'shelling out' to human judgement every two words. Suppose you accidentally shoot somebody while duck hunting: the meaning or value of pretty much everything here can't be determined by a computer, so the code-law version of this random snippet of natural-language-law would be pretty useless:

> If it is found that the defendant did the killing or wounding, but that it was not intentional or negligent, the court shall dismiss the proceeding. Otherwise, if it is found that the defendant did the killing or wounding intentionally, by an act of gross negligence, or while under the influence of alcohol, the court shall issue an order permanently prohibiting the defendant from taking any bird or mammal.

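For illustration, the quoted passage could be coded up roughly as below. This is a sketch, not a reading of any actual statute: the `Findings` fields and `Order` names are invented, and every input is a human finding of fact.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Order(Enum):
    DISMISS = "dismiss the proceeding"
    PERMANENT_BAN = "permanently prohibit taking any bird or mammal"


@dataclass(frozen=True)
class Findings:
    # Every field is 'shelled out' to human judgement; none is computed.
    did_killing_or_wounding: bool
    intentional: bool
    negligent: bool
    grossly_negligent: bool
    under_influence_of_alcohol: bool


def court_order(f: Findings) -> Optional[Order]:
    """Decision tree of the quoted passage; None where the passage is silent."""
    if not f.did_killing_or_wounding:
        return None  # the quoted text does not cover this case
    negligent = f.negligent or f.grossly_negligent  # gross negligence is still negligence
    if not f.intentional and not negligent:
        return Order.DISMISS
    if f.intentional or f.grossly_negligent or f.under_influence_of_alcohol:
        return Order.PERMANENT_BAN
    return None  # e.g. ordinary negligence while sober: not addressed in the quote
```

The function is a handful of lines; filling in `Findings` is the whole trial.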

Have you been around on the internet when they discovered AI LLMs?

A lot of law is based around subjective gray lines. “How would a reasonable person behave in this unique situation?” is at the root of a lot of legal situations.

Write a function for that, keeping in mind that “this situation” needs to be modeled with potentially infinite variables. Then try to define a “reasonable person”.

Hell, the reason most trials happen is because there is huge grey area, and the written laws are not obvious as to what the outcome should be.


The nuts and bolts of commercial law are often very algorithmic.

Criminal law is often fundamentally subjective, incorporating questions of intent and remorse.


I assume it would fail to compile, or error out, because of myriad conflicts throughout the body of laws.

I think the primary reason is that laws are about human convention, not real objects which one can clearly and deliberately define. Like at the most basic level nothing exists at all except for quantum fields or something like that. Everything else we talk about on a regular basis, people, dogs, streets, businesses, etc, is defined by convention to a greater or lesser degree.

It is therefore quite hard to create a formal system to refer to objects in the world in a way which induces no contradictions with intuition. This is why we have courts, among other functions of government.


Basically, all human knowledge is an application of either math or philosophy, and law is philosophy, so can't be modeled by math

> Basically, all human knowledge is an application of either math or philosophy

Philosophy is not knowledge, it's pure speculation.

> law is philosophy, so cant be modeled by math

Law is not philosophy unless it was written based on sloppy speculations. In other words, what law is depends on how it was written. It can certainly be modeled by logic, and math methods can be developed for it too.

It's nothing new, lawyers have to master logic as part of their training.


Modelling intent, with math, is not going to happen. Law is based around the intent of those taking actions, and understanding intent is absolutely philosophy.

Understanding intent is understanding interest and that's not philosophy. If it's not about interest, it's psychiatry - not philosophy either.

Besides, only a lesser part of law is about intent, the major part is about punishing and avoiding harm, finding the true facts and applying the written law to them.

Down-voting can't change the truth, we've been led by the nose for far too long.


To avoid harm, you must identify intent.

To punish, you must establish intent.

Intent has been the core underlying feature of the law since the Magna Carta. To ignore or trivialise it is nothing short of advocating for the return of kings.


> Intent has been the core underiding feature of the law since the Magna Carta.

I've already explained that intent is another word for interest - material or political, it may not be as trivial as potato chips but it's far simpler than rocket science.

> To ignore or trivialise it is nothing short of advocating for the return of kings.

Another purely speculative assertion with zero meaning or practical value.

There's no logical path from trivializing your occultist and unknowable notion of intent to the return of kings. First, you've got to start with a proof that at present there aren't any kings... but philosophy's got no proofs.

Speaking of kinks (sic), wasn't Epstein one of them? Or at least under their protection... until he wasn't, as usual.


If intent were so simply explained, then the High Courts across the world would serve no function - as interpreting intent is their core role.

Material interest and intent only accidentally collide. Intent cannot be defined in that manner.

Almost every person beneath a capitalist system has a material interest in wealth. That does not translate to intent to seize it.

If intent does not matter, only interest, then there is no war crime in bombing boats. There is no arguing with the government's interpretations of law, as they will have a vested interest as to how it plays out.



  scope QualifiedEmployeeDiscount :
    definition qualified_employee_discount
      under condition is_property consequence
    equals
      if employee_discount >=
        customer_price * gross_profit_percentage
      then customer_price * gross_profit_percentage
      else employee_discount
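
For anyone more at home in a general-purpose language, the snippet above reads as a cap on the discount. A minimal Python sketch, assuming the same inputs; returning `None` when `is_property` is false is my assumption, since the snippet doesn't show what happens in that case.

```python
from decimal import Decimal
from typing import Optional


def qualified_employee_discount(
    is_property: bool,
    employee_discount: Decimal,
    customer_price: Decimal,
    gross_profit_percentage: Decimal,
) -> Optional[Decimal]:
    """Mirror of the Catala snippet: cap the discount at price * gross profit %."""
    if not is_property:
        # Assumption: the snippet only defines the property case.
        return None
    cap = customer_price * gross_profit_percentage
    return cap if employee_discount >= cap else employee_discount
```

With a customer price of 100 and a gross profit percentage of 0.2, a discount of 30 is capped at 20 while a discount of 10 passes through unchanged.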

It feels like the best of both worlds, a syntax that is new and strange to use while basically being the same old abc If Else programming language.

Not sure I'm seeing any law-specific features either. Maybe if there were some tokens like 'jurisdiction' or 'jurisprudence', but it seems like yet another programming language.


How does this incorporate case law?

That's not so important in Napoleonic/Civil jurisdictions like France. Judges can consider prior rulings, but the law as-written is the main thing.

How's that account for language drift over centuries?

Napoleon's only been gone for about two hundred years, whereas Common Law has some real classics. For example https://en.wikipedia.org/wiki/Statute_of_Merton was a set of laws promulgated in 1235, some of which remained in force (at least nominally) until the 1980s. I don't know much about Canon Law, but that surely goes back even further.

All that to say, you can just do your best to understand the law in the context in which it was written, and replace the text every now and again.



