An interesting comment I read in another post here is that humans aren't even 99.9% accurate in breathing, as around 1 in 1000 breaths requires coughing or otherwise cleaning the airways.
This is an extraordinary claim, and it would require extraordinary evidence. Meanwhile, anyone who spends a few hours with colleagues in a predominantly typing/data-entry/data-manipulation service (accounting, invoicing, presales, etc.) KNOWS the rate of minor errors is humongous.
The biggest variable in all this, though, is that agents don't have to one-shot everything the way a human does, because no one is going to pay a human to do the work five times over just to make sure the results come out the same each time. At some point it will be trivial for agents to keep re-checking the work and hunting for errors in the process 24/7.
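A minimal sketch of what that redundancy could look like, assuming a hypothetical run_agent callable that stands in for any agent invocation:

    from collections import Counter

    def run_with_consensus(run_agent, task, runs=5):
        # Run the (hypothetical) agent several times and keep the majority
        # answer: exactly the redundancy no one would pay a human for.
        answers = [run_agent(task) for _ in range(runs)]
        answer, count = Counter(answers).most_common(1)[0]
        if count <= runs // 2:
            raise ValueError(f"no majority among {runs} runs: {answers}")
        return answer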
I wouldn't take the claim to mean that humans universally have an attribute called "accuracy" that is uniformly set to the value 99.9%.
The claim is pretty clearly about what humans 'can' achieve vs what LLMs 'do' achieve. One example of a human building a system at 99.9% reliability is therefore enough to support it. That we can compute and prove reliability is really the point.
For example, the function "return 3" counts the Rs in "strawberry" with 100% reliability. The answer never changes, so if it is correct once it will always be correct, because it always returns the same correct answer. An LLM can't do that, and infamously gave inaccurate results on that very problem, not even reaching 80% accuracy.
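To make that concrete, here is the deterministic version in Python; the general helper is just an illustration of the same idea:

    def count_rs_in_strawberry():
        return 3  # "strawberry" has three Rs; verify once by inspection

    def count_letter(word, letter):
        # Deterministic and provably correct for any input, not just one word
        return word.lower().count(letter.lower())

    assert count_rs_in_strawberry() == count_letter("strawberry", "r") == 3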
For the sake of discussion, I'll define reliability to be the product of availability and accuracy and will assume accuracy (the right answer) and availability (able to get any answer) to be independent variables. In my example I held availability at a fixed 100% to illustrate why being able to achieve high accuracy is required for high reliability.
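Spelled out with illustrative numbers:

    availability = 1.00           # held at 100%, as in the example
    accuracy     = 0.999          # probability a returned answer is right
    reliability  = availability * accuracy
    print(reliability)            # 0.999: with availability pinned at 100%,
                                  # reliability is capped by accuracy alone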
So, two points. First, humans can achieve 100% accuracy in the systems they build, because we can prove correctness and build in error checking. Because an LLM cannot do 100%, there is always going to be some problem that exposes the distinction in maximum capability. Difficult as it is, humans can build highly reliable complex systems; the computer is an example, and that all of its hardware interfaces together so well and works so often is remarkable.
Second, reliability compounds. If every step along a pipeline is 99% reliable, then after 20 steps the whole thing succeeds only about 82% of the time (0.99^20 ≈ 0.82), and at 95% per step it succeeds barely a third of the time (0.95^20 ≈ 0.36). For a 20-step system to work above 50%, the steps need to average better than roughly 96.6% (0.5^(1/20)), which in practice means some steps must be effectively at 100%.
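The compounding math, checked in plain Python (assuming independent steps):

    def pipeline_reliability(per_step, steps):
        return per_step ** steps

    print(pipeline_reliability(0.99, 20))  # ~0.818
    print(pipeline_reliability(0.95, 20))  # ~0.358
    print(0.5 ** (1 / 20))                 # ~0.966, the per-step floor for
                                           # a 20-step pipeline to beat 50%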
Reliability means 99.9% of the time when I hand something off to someone else it's what they want.
Availability means I'm at my desk and not at the coffee machine.
Humans very much are 99.9% accurate, and my deliverable even comes with a list of things I'm not confident about.