hmm, I just reverted to 2.1.98 and now with /model default has the (1M context) and opus is without (200k) .. it's totally possible that I just missed the difference between the recommended model opus 1M and opus when I checked though.
There are different level of who gets locked in. Almost every health care system in the USA is locked in to either an Epic/Oracle barrel or a Cerner barrel. I hope AI breaks this duopoly open soon.
Also running gemma-4 on Apple M5 Max. As fast or faster than Opus 4.6 extended but not of course the same competence. However, great tunability with llama.cpp and no issues related to IP leakage.
With Gemma-4 open and running on laptops and phones I see the flip side. How many non-HN users or researchers even need Opus 4.6e level performance? OpenAI, Anthropric and Google may be “rent seeking” from large corporations — like the Oracles and IBMs.
Agree: it is Anthropic's aggressive changes to the harnesses and to the hidden base prompt we users do not see. Clearly intended to give long right tail users a haircut.
Yes, and over the last few weeks I have noticed that on long-context discussions Opus 4.6e does its best to encourage me to call it a day and wrap it up; repeatedly. Mother Anthropic is giving preprompts to Claude to terminate early and in
my case always prematurely.
as someone who uses deepseek, glm and kimi models exclusively, an llm telling me what to do is just off the wall
glm and kimi in particular, they can't stop writing... seriously very eager to please. always finishing with fireworks emoji and saying how pleased it is with the test working.
i have to say to write less documentation and simplify their code.
LLMs are next token predictors. Outputting tokens is what they do, and the natural steady-state for them is an infinite loop of endlessly generated tokens.
You need to train them on a special "stop token" to get them to act more human. (Whether explicitly in post-training or with system prompt hacks.)
This isn't a general solution to the problem and likely there will never be one.
Very cool. An evolutionary biologist would say: Welcome to the party!
Mutation rate modulation is the AI engineers’ heat. And selection does the trimming of the outliers.
Some more serious biomorphic thinking and we may get to the next big insight courtesy of 3+ billion years of evolution—- evolution that enabled a great ape species to write a paper like this and build LMM’s like Gemma4 that totally rock on a 3.5 pound MacBookPro M5 Max with 128 GB of RAM.
Yes, especially with shifts in focus of a long conversation. But given the high error rates of Opus 4.6 the last few weeks it is possibly due to other factors. Conversational and code prodding has been essential.
reply