Human multi-step workflows tend to have checkpoints where the work is validated before proceeding further, as humans generally aren't 99%+ accurate either.
I'd imagine future agents will include training to design these checks into any output, validating against the checks before proceeding further. They may even include some minor risk assessment beforehand, such as "this aspect is crucial and needs to be 99% correct before proceeding further".
That's what Claude Code does - it constantly stops and asks you whether you want to proceed, including showing you the suggested changes before they're implemented. Helps with avoiding token waste and 'bad' work.
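A minimal sketch of that checkpoint pattern, assuming a hypothetical agent loop (none of these names come from Claude Code itself): each step carries its own agent-designed check plus a criticality flag, and the loop shows the proposed output and waits for confirmation before applying anything marked critical.

```python
# Hypothetical checkpoint loop, not Claude Code's actual implementation.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    description: str
    run: Callable[[], str]        # produces the proposed output
    check: Callable[[str], bool]  # agent-designed validation of that output
    critical: bool = False        # "needs to be 99% correct before proceeding"

def execute(steps: list[Step]) -> None:
    for step in steps:
        output = step.run()
        if not step.check(output):
            print(f"check failed on {step.description!r}, stopping early")
            return
        if step.critical:
            # Checkpoint: show the suggested change before it's implemented.
            print(f"--- proposed ---\n{output}")
            if input("proceed? [y/N] ").strip().lower() != "y":
                print("aborted at checkpoint; no tokens wasted downstream")
                return
        print(f"ok: {step.description}")
```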
The standard way to use Claude Code is with a flat-rate subscription, i.e. one of their standard consumer accounts. It's rate-limited but still generous.
You can also use API tokens, yes, but that’s 5-10x more expensive. So I wouldn’t.
> You can also use API tokens, yes, but that’s 5-10x more expensive. So I wouldn’t.
100% agree, as someone who uses API tokens. I use them only because my work gave me some Anthropic keys and the directive "burn the tokens!" (they want to see us using it and don't give a crap about costs).
This is going to depend on what you're doing with it. I use Claude Code for some stuff multiple times a day, and it is unusual for a session to cost me even $0.05. Even the most expensive thing I did ended up costing like $6, and that was a big, intensive workflow.
The size of the code base you are working in also matters. On an old, large code base, the cost does go up, though still not very high. On a new or relatively small code base, it is not unusual for my requests to cost a tenth of a cent. For what I am doing, paying with an API key is much cheaper than a subscription would be.
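For a rough sense of where the break-even sits, here's the arithmetic, using the per-session figure quoted above and an assumed subscription price:

```python
# Break-even between pay-per-token and a flat subscription.
# The $0.05 typical session cost is from the comments above;
# the $20/month subscription price is an assumption for illustration.
typical_session = 0.05  # USD per session on a small code base
subscription = 20.00    # USD per month, assumed entry-level plan

break_even = subscription / typical_session
print(f"break-even at ~{break_even:.0f} sessions/month (~{break_even / 30:.0f}/day)")
# -> ~400 sessions/month, about 13 a day; below that, the API key wins.
```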
Yes, definitely. I've seen comments from people burning through $100s' worth of tokens per day on a $200-per-month subscription. Anthropic has just been cracking down, tightening the token limits on subscriptions; expect more of that to come, because it's just not sustainable.
Exactly, if the program has less than 100 units or so of logic then it’s going to be a pretty good time so long as you aren’t working with very obscure/private dependencies.
The problems begin when integrating hundreds of units prompted by different people, or when the work being written is both voluminous and secret. The context is too limited even with RAG; one would need to train a model on the secret information itself.
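To make that context limit concrete, a back-of-the-envelope sketch; the ~4-characters-per-token ratio and the 200k-token window are rough assumptions, not measured values:

```python
# Estimate whether a code base even fits in one context window.
import pathlib

def estimate_tokens(root: str, exts=(".py", ".ts", ".go")) -> int:
    chars = sum(
        p.stat().st_size
        for p in pathlib.Path(root).rglob("*")
        if p.is_file() and p.suffix in exts
    )
    return chars // 4  # ~4 characters per token, on average

CONTEXT_WINDOW = 200_000  # assumed context size in tokens
needed = estimate_tokens(".")
print(f"~{needed:,} tokens of source vs {CONTEXT_WINDOW:,} available")
# Hundreds of integrated units blow well past the window,
# hence RAG or training on the private code itself.
```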
So basically non-commercial use is the killer app thus far for Claude Code. I am sure there are some business people who are not happy about this.
Lots of applications will have to be redesigned around that. My guess is that microservices architecture will see a renaissance, since it plays well with LLMs.
Somebody will still need to have the entire context, i.e. the full end-to-end use case and corresponding cross-service call stack. That's the biggest disadvantage of microservices, in my experience, especially if service boundaries are aligned with team boundaries.
On the other hand, if LLMs are doing the actual service development, that's something software engineers could be doing :)