I think Rust is great for agents, for a reason that is rarely mentioned: unit te...

0x3f · 2026-03-02T19:22:48 1772479368

You can add a callback to e.g. Claude to guarantee it does a cargo check and test.

unshavedyak · 2026-03-02T19:32:09 1772479929

Fwiw i used to do this (and with lints) - it was the only way to make Claude consistent in the early days when i first started using it (~August 2025).

For many months now though, Claude has been nearly consistent about calling both test and check/clippy. Perhaps this is due to my global memory file, not sure to be honest.

What i do know, is that i never use those hooks, i have them disabled atm. Why? Because the benefit is almost nonexistent as i mentioned, and the cost is at times, quite high. It means i cannot work on a project piecemeal, aka "only focus on this file, it will not compile and that's okay", and instead forces claude to make complete edits which may be harder to review. Worst of all, i have seen it get into a loop and be unable to exit. Eg a test fails and claude says "that failure is not due to my changes" or w/e, and it just does that.. forever, on loop. Burns 100% of the daily tokens pretty quick if unmonitored.

Fwiw i've not looked to see if there's an alternate way to write hooks. It might be worth having the hook only suggest, rather than forcing claude. Alternatively, maybe i could spawn a subagent to review if stopping claude makes sense.. hmm.
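To make the "suggest rather than force" idea concrete, here is a rough sketch of the decision logic only. Everything in it (`HookOutcome`, `decide`, the retry budget) is hypothetical and is not Claude's hook API; it just shows a check that blocks a few times and then downgrades to a suggestion, so a failure the agent refuses to own cannot loop forever.

```rust
/// What a hypothetical hook wrapper reports back to the agent.
#[derive(Debug, PartialEq)]
pub enum HookOutcome {
    /// Checks passed; nothing to report.
    Pass,
    /// Checks failed; ask the agent to fix them before stopping.
    Block(String),
    /// Checks still fail, but the retry budget is spent: report and let go.
    Suggest(String),
}

/// Decide how to react to a check result after `attempts` earlier blocks.
pub fn decide(check_passed: bool, attempts: u32, max_attempts: u32, log: &str) -> HookOutcome {
    if check_passed {
        HookOutcome::Pass
    } else if attempts < max_attempts {
        HookOutcome::Block(format!("checks failed, please fix:\n{log}"))
    } else {
        // Budget exhausted: stop forcing, so the session cannot spin indefinitely.
        HookOutcome::Suggest(format!("checks still failing after {attempts} attempts:\n{log}"))
    }
}
```

The attempt counter would have to be persisted somewhere between hook invocations; where exactly is left open here.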

0x3f · 2026-03-02T23:28:50 1772494130

I find this doesn't work automatically for me because the projects I'm on have a lot of conditional-compilation feature flags, and it doesn't quite understand how to run cargo check against them properly unless I tell it.
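A small illustration of why feature flags trip this up, assuming a hypothetical `metrics` feature that is not enabled by default: only one of the two functions below exists in any given build, so a plain `cargo check` never type-checks the other one.

```rust
/// Compiled (and type-checked) only when the hypothetical `metrics` feature is on.
#[cfg(feature = "metrics")]
pub fn record(event: &str) -> usize {
    // Stand-in for real instrumentation: report the event name's length.
    event.len()
}

/// Fallback used when `metrics` is off; a plain `cargo check` only sees this one.
#[cfg(not(feature = "metrics"))]
pub fn record(_event: &str) -> usize {
    0
}
```

With a layout like this, `cargo check --features metrics` or `cargo check --all-features` is needed to cover the first function, which is the kind of thing the agent has to be told.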

Maybe for your case you could create a /maybe-check command, and run that in the hook? Then specify the conditions under which a check/test is needed in there.

overfeed · 2026-03-03T06:35:53 1772519753

> if you explicitly ask agents to write/run tests, after a while agents just forget to do that

Add a single task using your project's preferred task-runner that performs all the checks you want the agent to adhere to: linting, test coverage, style checks, tests, etc, and add a rule in AGENTS.md that agents should always run this task after edits, and fix any warnings or errors produced.
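As a sketch of what such a single task might look like for a Rust project, assuming the checks are `cargo fmt`, `cargo clippy` and `cargo test` (the list and the names `steps` and `run_all` are illustrative, not a recommendation for any particular task-runner):

```rust
use std::process::Command;

/// The checks to run, in order. The exact list is an assumption; adjust per project.
pub fn steps() -> Vec<Vec<&'static str>> {
    vec![
        vec!["cargo", "fmt", "--check"],
        vec!["cargo", "clippy", "--all-targets", "--", "-D", "warnings"],
        vec!["cargo", "test"],
    ]
}

/// Run every step, stopping at the first failure and naming the step that failed.
pub fn run_all() -> Result<(), String> {
    for step in steps() {
        let status = Command::new(step[0])
            .args(&step[1..])
            .status()
            .map_err(|e| format!("could not start `{}`: {e}", step.join(" ")))?;
        if !status.success() {
            return Err(format!("`{}` failed", step.join(" ")));
        }
    }
    Ok(())
}
```

Having one entry point means the AGENTS.md rule and the pre-merge check can both point at the same thing.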

Add the same task to your version management's pre-merge checks, in case the agent (or colleague) forgets to check before pushing. This was good practice since before LLMs, but I never was a fan of adding such checks to pre-commit hooks.

jimbokun · 2026-03-02T23:10:31 1772493031

Even LLMs know they should write tests but hate doing it.

wakawaka28 · 2026-03-02T21:08:43 1772485723

Putting unit tests in the same file wastes context and makes the whole thing hard to navigate for humans and machines alike.
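For anyone not familiar with the convention being argued about, this is roughly what in-file unit tests look like in Rust. The function `clamp_percent` is made up for the example.

```rust
/// Clamp a percentage into the 0..=100 range.
pub fn clamp_percent(value: i32) -> i32 {
    value.clamp(0, 100)
}

// Compiled only under `cargo test`; absent from normal builds.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn clamps_out_of_range_values() {
        assert_eq!(clamp_percent(-5), 0);
        assert_eq!(clamp_percent(150), 100);
        assert_eq!(clamp_percent(42), 42);
    }
}
```

The test module sits at the bottom of the same source file, so anything reading the whole file (human or agent) reads the tests too.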

dnautics · 2026-03-02T21:15:19 1772486119

nah, the agents jump around files anyways.

J_Shelby_J · 2026-03-02T21:21:32 1772486492

I’ve been writing as few unit tests as possible and relying on debug asserts instead.
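A guess at what the debug-assert style might look like, using a made-up `insert_sorted` helper: the invariant is checked on every call in debug builds, and `debug_assert!` is compiled out in release builds by default.

```rust
/// Insert `x` into an already-sorted vector, keeping it sorted.
pub fn insert_sorted(v: &mut Vec<i32>, x: i32) {
    // Precondition: the caller hands us a sorted vector (checked in debug builds only).
    debug_assert!(v.windows(2).all(|w| w[0] <= w[1]), "input must be sorted");

    // Index of the first element greater than `x`.
    let pos = v.partition_point(|&e| e <= x);
    v.insert(pos, x);

    // Postcondition: the vector is still sorted after the insert.
    debug_assert!(v.windows(2).all(|w| w[0] <= w[1]), "output must stay sorted");
}
```

The trade-off is that the checks only fire on inputs the program actually sees while running in debug mode, whereas a unit test picks its inputs up front.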

0x3f · 2026-03-02T23:30:05 1772494205

Normally I would put as many invariants in the types as possible, then tests cover the rest. I'm curious how you do this/what you use it for though. Would be cool if you had any examples.
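To show what I mean by invariants in the types, here is a minimal sketch with a hypothetical `Percent` newtype: the range check lives in the constructor, so code that receives a `Percent` never has to re-check it.

```rust
/// A percentage guaranteed to be in 0..=100 once constructed.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Percent(u8);

impl Percent {
    /// The only way to build a `Percent`; rejects out-of-range values.
    pub fn new(value: u8) -> Option<Self> {
        if value <= 100 {
            Some(Self(value))
        } else {
            None
        }
    }

    pub fn get(self) -> u8 {
        self.0
    }
}

/// Cannot underflow: the type already guarantees `done.get() <= 100`.
pub fn remaining(done: Percent) -> Percent {
    Percent(100 - done.get())
}
```

Tests then only need to cover the constructor and whatever logic the types cannot express.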

jimbokun · 2026-03-02T23:11:09 1772493069

It’s about the best possible documentation.

wakawaka28 · 2026-03-02T23:36:37 1772494597

It isn't documentation. It is example code, in the best case. That shit belongs in other files, not in the main file. There is also a reason why literate programming never took off in general. Good luck getting anything done when 80% (conservatively) of the stuff you have to scroll through contributes nothing to the actual execution of the program and might actually be giving you false impressions of how things need to be done.

g947o · 2026-03-03T11:47:51 1772538471

I have yet to see a single Rust file where the test comes before source and takes 80% of the file content.

wakawaka28 · 2026-03-03T14:25:34 1772547934

Probably because all the tests are trivial, and people have the bias to not add all the testing that is needed inline with the code.