We're finally getting there. The model of web notebooks looks a lot like Hypercard stacks in terms of usability; all that's missing is someone packaging them into an easy-to-use distribution and sharing environment that doesn't depend on users installing their own web server.
And if that package includes some reasonable local LLM model, creating simple programs by end users could be even easier than it ever was with Hypercard.
I didn't mean "like hypercard" so literally in this manner. What I meant was, a computing environment that seems to blend seamlessly into the wider operating system, and that is able to sufficiently blur the line between end users and "programmers" (here called "authors"). Critical to this capability was the ability to "pop the hood" easily and mess with what was going on underneath.
All of today's computing is fundamentally based on a strong division between programmers and users. That division has only grown more stark with time. The dominance of Unix is partly to blame, in my view.
The end-user developer experience sees the 17e and Neo paired with spoken AI prompts going to the iPhone, which has the Hypercard-like, network-aware environment do the thing on the laptop.
Isn't that the same as compressing the whole book, in a special differential format that compares how the text looks from any given point before and after?
There are many ways to model how the model works in simpler terms. Next-word prediction is useful to characterize how you do inference with the model. Maximizing mutual information, compressing, gradient descent, ... are all useful characterisations of the training process.
But as stated above, next token prediction is a misleading frame for the training process. While the sampling is indeed happening 1 token at a time, due to the training process, much more is going on in the latent space where the model has its internal stream of information.
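To make the "sampling is happening 1 token at a time" point concrete, here is a minimal sketch of greedy next-token inference over a toy bigram table. The table and tokens are invented purely for illustration; real models condition on the whole context and sample from a full probability distribution rather than looking up a single most-likely successor.

```python
# Toy bigram "language model": for each token, the most likely next token.
# The table is invented purely for illustration.
BIGRAM_NEXT = {
    "the": "cat",
    "cat": "sat",
    "sat": "on",
    "on": "the",
}

def generate(prompt, n_tokens):
    """Greedy decoding: emit one token at a time, each conditioned only
    on the previous token. A real model conditions on the entire context
    and samples from a distribution over the whole vocabulary."""
    tokens = prompt.split()
    for _ in range(n_tokens):
        last = tokens[-1]
        if last not in BIGRAM_NEXT:
            break
        tokens.append(BIGRAM_NEXT[last])
    return " ".join(tokens)

print(generate("the", 4))  # the cat sat on the
```

The point of the sketch is that the one-token-at-a-time loop describes only the sampling procedure; nothing in it says what the model had to learn internally to make each step's prediction good.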
I can only speak for myself but for me, it's all about the syntax. I am terrible at recalling the exact name of all the functions in a library or parameters in an API, which really slows me down when writing code. I've also explored all kinds of programming languages in different paradigms, which makes it hard to recall the exact syntax of operators (is comparison '=' or '==' in this language? Comments are // or /*? How many parameters does this function take, and in what order...) or control structures. But I'm good at high level programming concepts, so it's easy to say what I want in technical language and let the LLM find the exact syntax and command names for me.
I guess if you specialise in maintaining a code base with a single language and a fixed set of libraries then it becomes easier to remember all the details, but for me it will always be less effort to just search the names for whatever tools I want to include in a program at any point.
I agree with a bunch of this (I'm almost exclusively doing python and bash; bash is the one I can never remember more than the basics of). I will give the caveat that I historically haven't made use of fancy IDEs with easy lookup of function names, so would semi-often be fixing "ugh I got the function name wrong" mistakes.
Similar to how you outlined multi-language vs specialist, I wonder if "full stack" vs "niche" work quietly underlies some of the camps of "I just trust the AI" vs "it's not saving me any time".
It is possible to try it, and some people do (high-speed trading is exactly that, plus exploiting the informational edge that speed provides to react before anyone else).
However, there are two fundamental problems with computational prediction. The first, obviously, is accuracy. A model is a compressed memorization of everything observed so far; a prediction with it is just a projection of the observed patterns into the future. In a chaotic system, that only goes so far: the most regular, predictable patterns are obvious to everybody and give less return, while the chaotic system states where prediction would be most valuable are the least reliable. You cannot build a perfect oracle that fixes that.
The second problem is more insidious. Even if you could build a perfect oracle, acting on its predictions would make you part of the system itself. That changes the outcomes, making the system behave differently from the way it did when the model was trained, and thus less reliably. If several people do this at the same time, there's no way to retrain the model to take the new behaviour into account.
There's the possibility (but not a guarantee) of reaching a fixed point, where a Nash equilibrium appears and the system settles into a stable cycle, but that's not likely in a changing environment where everybody tries to outdo everyone else.
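A toy way to see the reflexivity problem: let an oracle forecast the next price from recent momentum, then let traders act on that forecast. The trading pressure itself moves the price, so the oracle that was perfect when nobody used it becomes wrong the moment everybody does. All numbers and the 0.8 "price impact" factor are invented for illustration.

```python
# Toy reflexive market. The "real" dynamics are pure momentum, so a
# trend-following oracle is perfect -- until traders act on it and
# their buying/selling eats most of the predicted move.
def trend_oracle(prices):
    # Predict that the last price change simply repeats.
    return prices[-1] + (prices[-1] - prices[-2])

def step(prices, traders_use_oracle):
    predicted = trend_oracle(prices)
    drift = prices[-1] - prices[-2]  # underlying momentum dynamics
    impact = 0.0
    if traders_use_oracle:
        # Everyone trades on the prediction, absorbing most of the move.
        impact = -0.8 * (predicted - prices[-1])
    prices.append(prices[-1] + drift + impact)
    return predicted

def forecast_error(traders_use_oracle, steps=20):
    prices = [100.0, 101.0]
    total = 0.0
    for _ in range(steps):
        predicted = step(prices, traders_use_oracle)
        total += abs(predicted - prices[-1])
    return total / steps

print("avg error, oracle ignored: ", forecast_error(False))
print("avg error, oracle acted on:", forecast_error(True))
```

When the oracle is ignored, its error is exactly zero; once everyone acts on it, every forecast overshoots, which is the "perfect oracle stops being perfect" point in miniature.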
Ah, this actually connects a few dots for me. It helps explain why models seem to have a natural lifetime: once deployed at scale, they start interacting with and shaping the environment they were trained on. Over time, data distributions, usage patterns, and incentives shift enough that the model no longer functions as the one originally created, even if the weights themselves haven't changed.
That also makes sense of the common perception that a model feels “decayed” right before a new release. It’s probably not that the model is getting worse, but that expectations and use cases have moved on, people push it into new regimes, and feedback loops expose mismatches between current tasks and what it was originally tuned for.
In that light, releasing a new model isn’t just about incremental improvements in architecture or scale; it’s also a reset against drift, reflexivity, and a changing world. Prediction and performance don’t disappear, but they’re transient, bounded by how long the underlying assumptions remain valid.
That means when AI companies "retire" a model, it's not only because of their new, better model, but also because of decay?
PS. I cleaned up the text above with AI (not a native English speaker).
For some reason he doesn't like doing mathematical proofs, so he shuns the practice of doing them and invented a new word to describe that way of using formal systems.
Not only did it take > 5 seconds to load a page; images were progressively loaded, at most two at a time, over the next minute or so (and that's if there were no errors during transfer!).
But as a metaphor for other creative pursuits, my experience is that most of the time when people are "planning" or working on other things that they like to believe will help them do the thing... they are really just avoiding doing the thing.
People spend years doing "world-building" and writing character backgrounds and never write the damn book. Aspiring musicians spend thousands collecting instruments and never make a song.
As you say, if it's just for fun, that's all fine. But if the satisfaction you want comes from the result of the thing, you have to do the thing.
Sometimes that's because they're making it worthwhile, by connecting the thing with those who will benefit from it and explaining how to use it, which is as valuable as doing the thing.
I.e. by making sure that they're doing the right thing.
>A human rubber-stamping code being validated by a super intelligent machine is the equivalent of a human sitting silently in the driver's seat of a self-driving car, "supervising".
So, absolutely necessary and essential?
In order to get the machine out of trouble when the unavoidable strange situation arises, one that didn't appear during training and that requires judgement based on ethics or logical reasoning. For those cases, you need a human in charge.