Hacker News

Yup, after the CC token increase from two weeks ago, I'm now consistently filling the 1M context window that never went above 30-40% a few days ago. Did they turn it off? I used to see "Co-Authored-By: Opus 4.6 (1M Context Window)" in git commits; now that advert line is gone. I never turned it on or off. Maybe the defaults changed, but /model doesn't show two different context sizes for Opus 4.6.

I never asked for a 1M context window; then I got it, and it was nice; now it's as if it's gone again. No biggie, but if they had advertised it as a free trial (which is what it feels like), I wouldn't have opted in.

Anyway, it seems I'm just ranting. I still like Claude, yes, but nonetheless it still feels like the game you described above.



They are now literally blaming users for using their product as advertised:

https://x.com/lydiahallie/status/2039800718371307603

--- start quote ---

Digging into reports, most of the fastest burn came down to a few token-heavy patterns. Some tips:

• Sonnet 4.6 is the better default on Pro. Opus burns roughly twice as fast. Switch at session start.

• Lower the effort level or turn off extended thinking when you don't need deep reasoning. Switch at session start.

• Start fresh instead of resuming large sessions that have been idle ~1h

• Cap your context window; long sessions cost more: CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000

--- end quote ---
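For the last tip in the quoted list, here is a minimal sketch of what capping the window might look like in practice. This assumes `CLAUDE_CODE_AUTO_COMPACT_WINDOW` is an ordinary environment variable read at startup (the variable name comes from the quoted tweet; its exact semantics are not documented there):

```shell
# Assumed usage: export the cap in your shell profile (e.g. ~/.bashrc)
# before launching Claude Code, so auto-compaction kicks in around
# 200k tokens instead of letting the context grow toward 1M.
export CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000

# Sanity check that the value is visible to child processes:
echo "$CLAUDE_CODE_AUTO_COMPACT_WINDOW"
```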

https://x.com/bcherny/status/2043163965648515234

--- start quote ---

We defaulted to medium [reasoning] as a result of user feedback about Claude using too many tokens. When we made the change, we (1) included it in the changelog and (2) showed a dialog when you opened Claude Code so you could choose to opt out. Literally nothing sneaky about it — this was us addressing user feedback in an obvious and explicit way.

--- end quote ---


Off topic, but I found Sonnet useless. It can't do the simplest tasks, like refactoring a method signature consistently across a project or following instructions accurately about what patterns/libraries should be used to solve a problem.


It's crazy, because when Sonnet came out it was heralded as the best thing since sliced bread, and now people are literally saying it's "useless". I wonder if this is our collective expectations increasing or the models getting worse.


Probably both :)

New models come out with inflated expectations, then they are adjusted/nerfed/limited for whatever reason. Our expectations remain at previous levels.

New models come out with once again inflated expectations, but now it's double inflation, because we're still on the previous level of expectations. And so on.

I think it's likely to get worse. Providers are running out of training data, and running bigger and bigger models to more and more people is prohibitively expensive. So they will try to keep the hype up while the gains are either very small or non-existent.


The default prompt cache TTL changed from 1 hour to 5 minutes. Maybe this is what you are experiencing.


I find this 1M context claim bollocks. It's basically crap past 100k.


I like not running into mandatory compaction, but I do actively try to keep the context under that limit anyway. From Anthropic's standpoint, with the new(ish) 5-minute cache timeout, it's a great way to get people to burn tokens on reinitializing the cache without having them occupy TPU time, especially as the context gets larger.
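A rough back-of-envelope sketch of why the 5-minute TTL matters: once the cache expires, the next request pays cache-write rates on the entire accumulated context rather than the much cheaper cache-read rate. The prices below are illustrative placeholders, not Anthropic's actual rates:

```python
# Illustrative arithmetic only; the per-million-token prices are assumed,
# not real. The point is the ratio: a cold (expired-cache) turn re-writes
# the whole context at the write rate instead of reading it at the read rate.
context_tokens = 150_000
cache_read_per_mtok = 0.30    # assumed: cached-read price per million tokens
cache_write_per_mtok = 3.75   # assumed: cache-write price per million tokens

warm_turn = context_tokens / 1_000_000 * cache_read_per_mtok
cold_turn = context_tokens / 1_000_000 * cache_write_per_mtok
print(f"warm turn ${warm_turn:.4f} vs cold turn ${cold_turn:.4f} "
      f"({cold_turn / warm_turn:.1f}x)")
```

Under these assumed prices, every idle gap longer than the TTL multiplies the next turn's context cost by the write/read ratio, which is why short timeouts plus large contexts get expensive fast.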


Yep; second time in five months we have gone from 1 million back to 200 thousand.


Hmm, I just reverted to 2.1.98, and now /model shows default with (1M context) and Opus without (200k). It's totally possible that I just missed the difference between the recommended Opus 1M model and plain Opus when I checked, though.



