It's neutered and not as full featured, but not bad in a pinch. All of these web admin tools are hacks that call out to shell scripts and whatnot. It requires a lot of conditional behavior and/or vertical integration. "Linux" has no consistent API for control, so its all duct tape. Webmin is the same, tbh (swap perl for whatever cockpit is written in)
Autonomous vehicles following proper signalling before lane changes can be safe at arbitrary speeds (see Autobahns working at all). Humans, we should limit passing speed to roughly ~5 mph delta between adjacent lanes and leave it at that.
Humans with adequate following distance in the entire lane can probably manage 10 mph delta. I routinely travel dozens of miles very safely at ~80 with the flow of traffic (including the cops), and been stressed out at 55 in the carpool lane through stop and go traffic in the right-hand lanes due to on ramps/offramps.
I think 75 is memorable and roughly in the region where the tradeoff between increased kinetic energy and decreased time to arrival per additional unit of velocity becomes untenable.
Framing LLM use for dev tasks as "narrative" is powerful.
If you want specific, empirical, targeted advice or work from an LLM, you have to frame the conversation correctly. "You are a tenured Computer Science professor agent being consulted on a data structure problem" goes a very long way.
Similarly, context window length and prior progress exerts significant pressure on how an LLM frames its work. At some point (often around 200k-400k tokens in), they seem to reach a "we're in the conclusion of this narrative" point and will sometimes do crazy stuff to reach whatever real or perceived goal there is.
Presumably the idea is that an agent generates a Lean4 specification against which the software is measured.
But then the Lean4 specification effectively becomes the software artifact.
And we're sort of back to square 1. How do you verify a Lean4 spec is correct (and that it describes what needs to be built in the first place) without human review?
Specifications are smaller than the full code, just as high level code is smaller than the functionally equivalent assembly. As we ascend the abstraction ladder the amount of reading a human needs to do decreases. I don't think this should really count as "back to square 1".
That has always been the perceived promise of higher-abstraction software specs: automated code generation from something higher-level, thus making programmers increasingly obsolete.
Among the RADs is Microsoft Visual Basic, which along with WinForms and SQL was supposed to make business programmers nearly obsolete, but instead became a new onramp into programming.
In particular, I'd like to highlight UML, which was supposed to mostly obsolete programming through auto-generated code from object-oriented class diagrams.[1] The promise was that "business domain experts" could model their domain via visual UML tooling, and the codegen would handle it from there. In practice, UML-built applications became maintenance nightmares.
In every one of these examples, the artifact that people made "instead of programming" became the de-facto programming language, needing to be maintained over time, abstracted, updated, consumed behind APIs, etc. -- and programmers had to be called in to manage the mess.
It's interesting that Spec4 can be auto-generated, then used to generate code. My question is - what do you do when you have (a) consumers depending on a stable API, and (b) requests for new features? Maybe hand the job to Claude Code or a human developer with a suite of unit tests to guarantee API compatibility, but at that point we're back to an agent (LLM or human) doing the work of programming, with the Spec4 code as the programming language being updated and maintained.
Also a very good question btw, people do both. For some projects Lean is expressive and performant enough to use on its own (or call into using the reverse FFI), other projects use a model of a real programming language like Rust. The disadvantage of the latter is that the Lean model of Rust has to be trusted.
Cedar (https://lean-lang.org/use-cases/cedar/) at AWS use an executable Lean model of a system as an oracle for differential testing of a Rust implementation. If you can't run Lean in production, then their approach is compelling.
The idea is that you start with a Lean specification, create a fully verified implementation with respect to the spec, and then hook it and the production implementation up to a fuzzer or source or random inputs. Then you can explore a lot of the state space fully automatically.
I don't know - if you upload a package.json with any dependencies that map to real npmjs.com packages, it does lead you to a Stripe payment page which appears to be real... and it appears you'd be sending real money.
It's wild to watch documentaries from the 1980s where a primitive computer is said to be "a thinking machine" that is "taking most of the work out of a job".