
See other comment. It's sound but not decidable.


Basically it turned out that the theoretical undecidability did not matter in practice, because Scholze-style mathematics relies so little on definitional equality. We prove theorems with `simp`, not `refl`. Pierre-Marie Pédrot is quoted above as saying that various design decisions are "breaking everything around", but we don't care that our `refl` is slightly broken, because it is regarded as quite a low-level tool for the tasks (e.g. proving theorems of Clausen and Scholze) that we are actually interested in, and I believe our interests contrast quite a lot with the things that Pédrot is interested in.


And in "Lean style", refl proofs are a bit distasteful from a software engineering point of view because they pierce through the API of a mathematical object. (In the language of OOP, they break encapsulation.)

It tends to be a good idea to give some definition, then prove a bunch of lemmas that characterize the object, and finally forget the definition.
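A toy Lean 4 sketch of that workflow (the `double` definition and lemma names here are my own illustration, not from the thread):

```lean
def double (n : Nat) : Nat := 2 * n                      -- the definition

theorem double_eq (n : Nat) : double n = 2 * n := rfl    -- a characterizing lemma: the "API"

-- Downstream proofs rewrite with `double_eq` instead of unfolding `double`,
-- so the definition could later change without breaking them:
example (n : Nat) : double n + 1 = 2 * n + 1 := by rw [double_eq]
```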


Can you explain to a non-mathematician how you can prove anything without refl (which I assume is the statement "x = x is true")?


The jargon is a bit confusing sometimes. In Lean, "refl" does a whole lot more than prove x = x. It is of course available if you want to prove x = x, but the real power of "refl" is that it also proves x = y whenever x and y are definitionally equal. Or at least that's the idea; it turns out that Lean's definitional equality relation is not decidable, so sometimes it will fail even when x and y are defeq, and this is the theoretically distasteful aspect that came up on the linked Coq issue. In practice the theoretical undecidability issue never happens, but there is a related problem: depending on what gets unfolded, a proof by "refl" can take anywhere from seconds to minutes, and if alternative external proof checkers don't follow exactly the same unfolding heuristics, a seconds-long proof can turn into a thousand-year goose chase. By comparison, methods like "simp" have much more controllable performance because they actually produce a proof term, so they tend to be preferable.
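A minimal Lean 4 sketch of the difference (the specific examples are mine, not from the thread):

```lean
-- `rfl` proves x = y whenever x and y are definitionally equal,
-- not just syntactically identical:
example : 2 + 2 = 4 := rfl        -- `2 + 2` computes to `4`

-- `0 + n` does not reduce for a variable `n` (Nat.add recurses on its
-- second argument), so `rfl` fails here; `simp` works instead, and
-- produces an explicit proof term the kernel can check cheaply:
example (n : Nat) : 0 + n = n := by simp
```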


Thanks for the explanation. Is the defeq undecidability a bug of Lean that should be fixed in the future? Or is it an intentional design decision for it to function properly for other types of proofs?


Defeq undecidability is a feature of Lean in the sense that it is a conscious design decision. As we have seen both in this thread and in other places, this design decision puts off some people interested in the foundations of type theory from working with Lean. However, it turns out that for people interested in other kinds of questions (e.g. the mathematics of Scholze), defeq undecidability is of no relevance at all.

Here's an analogy. It's like saying "Gödel proved that maths was undecidable, therefore maths is unusable and you shouldn't prove theorems". The point is that Gödel's observation is of no practical relevance to a modern number theorist and does not provide any kind of obstruction in practice to the kind of things that number theorists think about.


I am quite confident that the developers of Lean consider it a feature (or at least, not a bug). Though I'm also not sure why they are building on a complex metatheory like CIC if they are willing to accept undecidable typechecking, since you can simplify a lot of things if you give that up.


What simplifications do you have in mind?

I think one of the reasons is that Lean's approximation of defeq still works well enough in practice. As mentioned by others, you never really run into the kind of counterexamples that break the theoretical properties, but rather into examples where checking defeq is unacceptably slow, which remains an issue even when defeq is decidable.

From my understanding, CIC is used because it allows a unified language for proofs, specifications, programs and types, which greatly simplifies the system. For example, in Lean 4 more or less the entire system is built on the same core term language: proofs, propositions/specifications, programs, tactics (i.e. metaprograms), the output of tactics, and even Lean 4 itself are all written in that same language, even if they don't always use the same method of computation. In my eyes, for the purposes of writing and checking proofs, CIC is quite complex; but if you want to capture all of the above and have them all checked by the same kernel, CIC seems like a fairly minimal metatheory to do so, even with undecidable type checking.
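For instance (a made-up illustration of the "one term language" point, not an example from the thread), a program and a theorem about it sit side by side and are checked by the same kernel:

```lean
-- a program
def twice (n : Nat) : Nat := n + n

-- a proposition about that program, and its proof,
-- written in the same core term language
theorem twice_swap (m n : Nat) : twice (m + n) = twice (n + m) := by
  rw [Nat.add_comm]
```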


For example, if you are willing to accept undecidable typechecking, you can define inductive types in a much simpler theory (inductive types and dependent pattern matching are by far the most complex parts of CIC): https://homepage.cs.uiowa.edu/~astump/papers/from-realizabil... .

There are a few other examples, but this is the one that immediately sprang to mind. The complexity of CIC is very much about decidable typechecking; it is not fundamental to writing a sound dependently typed proof assistant. And anyone who has actually worked on a kernel for CIC knows that, while it is "minimal" compared to what people want to do with it, no efficient kernel is nearly as minimal as something like λP2 with dependent intersections.


Thanks! But does that not already answer your question as to why Lean would use CIC and not a simpler metatheory?

That particular paper is from 2016 and the type theory presented there is still lacking a number of features from CIC, with those particular extensions being listed as further research. Work on Lean started in 2013 and the kernel in particular has not been touched in a long time for good reason. Around that time, CIC was already battle-proven with there being popular and well-tested reference implementations to learn from. Since then, all the fancy work has been done on the system surrounding the kernel.

I believe one of the goals at the start was that the kernel should be sufficiently simple and small that it is easy to ensure it is correct. Despite the apparently redundant complexity you point out, the implementation still seems simple enough to reach this goal, as no proof of false resulting from a kernel bug has been reported yet. CIC being well tested surely also contributed to that.

Another reason is that, as I understand it, it was not always obvious that Lean would break decidability of defeq. AFAIK proof irrelevance was only baked in with Lean 3; it was optional before that. By that point, I don't see much reason to rewrite an entire kernel that had worked fine so far.
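Concretely, "baked in" means proof irrelevance is part of Lean's definitional equality, so `rfl` proves it (a small Lean 4 sketch of my own):

```lean
-- Any two proofs of the same proposition are definitionally equal,
-- so this closes by `rfl` with no rewriting at all:
example (P : Prop) (h₁ h₂ : P) : h₁ = h₂ := rfl
```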

I also vaguely remember that there were arguments related to performance in favor of the current system (e.g. versus W-types), but I don't really understand these things well enough to make that point with confidence :)


How are the ergonomics of a proof assistant based on λP2? Lean does a lot of elaboration of metavariables and typeclass instances and such before handing things off for checking by the kernel -- I'm curious about whether λP2's undecidability is decidable enough, so to speak.


That's currently an open research question, but the answer most likely is that the ergonomics could be made just as good, considering that elaboration of metavariables, typeclass instances, etc. already uses possibly-undecidable algorithms distinct from those used in the kernel in just about every proof assistant I'm familiar with (Lean included).

In terms of stuff that does go more or less directly through the kernel, some follow-up papers to the one I linked discuss how to translate inductive definitions etc. automatically into a form accepted by the simpler kernel, so I am even more confident that this would not present a problem.


The fact remains though that Lean 3 worked well enough to do Scholze level mathematics. I never quite know what to make of people saying "if the designers had done it in some other way, it would have been better". Of course if someone comes along doing it another way, and then they go on to also achieve Scholze level mathematics, then I will be all ears.


That's a very strange line of argument. Many great programs have been written in C. That doesn't mean that there is no way to improve on C. Perhaps to even the playing field, this hypothetical competitor should be given the resources of Microsoft Research as well :)


I think Kevin is asserting that to make strong comparative claims about the quality of either system, it's good to have a large and functionally similar body of work to compare the systems with, not that there is no way to improve on Lean's core :)

MSR funds a single developer. The Lean community consists of at least 30 or so active contributors right now, with many more having contributed in the past. Most folks work in academia. Much of Lean's success is absolutely due to the work of that single developer, but I don't think that the playing field is uneven due to MSR's funding.



