
> Production systems need 99.9%+ reliability

This is not remotely true. Think of any business process at your company. 99.9% availability would allow only about 1 minute 26 seconds of instability/errors/downtime per day. Surely your human collaborators aren't hitting that SLA; a single coffee break blows through it (per collaborator!).

Business Process Automation via AI doesn't need to be perfect. It simply needs to be sufficiently better than the status quo to pay for itself.
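
For context, the "1 minute 26 seconds" figure is just arithmetic on the availability target. A quick back-of-the-envelope sketch (Python, my own numbers, not from the thread):

    # Allowed downtime for a given availability target.
    targets = [0.999, 0.9999, 0.99999]  # three, four, five nines

    for a in targets:
        per_day = (1 - a) * 24 * 60 * 60    # seconds per day
        per_year = (1 - a) * 365 * 24 * 60  # minutes per year
        print(f"{a:.3%} availability -> {per_day:.1f} s/day, {per_year:.1f} min/year")

At three nines that is 86.4 seconds per day (about 8.8 hours per year); five nines allows barely 5.3 minutes per year.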



It's not just about uptime. If the bridge collapses, people die. Some of us aren't selling ads.


If "the bridge collapses and people die" because the team has a 1min26 "downtime" on a specific day, which is what you are arguing, then you have much bigger problems to solve than the performance of AI agents.


Uptime and reliability are not the same thing. Designing a bridge doesn't require that the engineer be working 99.9% of minutes in a day, but it does require that they be right in 99.9% of the decisions they make.


Another way to think about it is that, if the engineer is right in only 99.9% of decisions, the bridge will have 99.9% uptime.


That's pretty bad for a bridge haha


I think you're conflating reliability and availability.

Reliability means 99.9% of the time when I hand something off to someone else it's what they want.

Availability means I'm at my desk and not at the coffee machine.

Humans very much are 99.9% accurate, and my deliverable even comes with a list of things I'm not confident about


An interesting comment I read in another post here is that humans aren't even 99.9% accurate in breathing, as around 1 in 1000 breaths requires coughing or otherwise clearing the airways.


I would say reliability is availability times accuracy.

(Your point remains largely the same, just more precise with the updated definition replacing 'reliable' with 'accurate'.)
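
A minimal sketch of that definition, assuming the two factors are independent (numbers are mine, for illustration):

    # Reliability as the product of availability and accuracy (assumed independent).
    availability = 0.999  # fraction of the time you can get an answer at all
    accuracy = 0.999      # fraction of answers that are correct
    reliability = availability * accuracy
    print(f"reliability ~= {reliability:.4f}")  # ~= 0.9980

So even at 99.9% on both axes, the combined figure already dips below 99.9%.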


> Humans very much are 99.9% accurate

This is an extraordinary claim, which would require extraordinary evidence to prove. Meanwhile, anyone who spends a few hours with colleagues in a predominantly typing/data entry/data manipulation service (accounting, invoicing, presales, etc.) KNOWS the rate of minor errors is humongous.


Yea exactly.

99.99% is just absurd.

The biggest variable with all this, though, is that agents don't have to one-shot everything like a human: no one is going to pay a human to do the work five times over to make sure the results come out the same each time. At some point it will be trivial for agents to continuously check the work and look for errors in the process, 24/7.
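
To illustrate why rerunning the work can help, here is a sketch of majority voting over repeated runs; the 90% per-run accuracy and the independence of runs are hypothetical assumptions, not measurements:

    from math import comb

    def majority_vote_accuracy(p: float, runs: int) -> float:
        """Probability that more than half of `runs` independent attempts are correct."""
        need = runs // 2 + 1
        return sum(comb(runs, k) * p**k * (1 - p)**(runs - k) for k in range(need, runs + 1))

    print(majority_vote_accuracy(0.90, 1))  # 0.90
    print(majority_vote_accuracy(0.90, 5))  # ~0.991

In practice agent errors are often correlated, so the independence assumption is generous; still, it shows where the "check the work five times" intuition comes from.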


I wouldn't take the claim to mean that humans universally have an attribute called "accuracy" that is uniformly set to the value 99.9%.

The claim is pretty clearly 'can' achieve (humans) vs 'do' achieve (LLM). Therefore one example of a human building a system at 99.9% reliability is sufficient to support the claim. That we can compute and prove reliability is really the point.

For example, the function "return 3" counts the Rs in "strawberry" with 100% reliability. The answer never changes, so if it is correct once it will always be correct, because it is always the same correct answer. An LLM can't do that, and infamously gave inaccurate results to that problem, not even reaching 80% accuracy.

For the sake of discussion, I'll define reliability to be the product of availability and accuracy and will assume accuracy (the right answer) and availability (able to get any answer) to be independent variables. In my example I held availability at a fixed 100% to illustrate why being able to achieve high accuracy is required for high reliability.

So, two points. First, humans can achieve 100% accuracy in the systems they build, because we can prove correctness and do error checking. Because an LLM cannot reach 100%, there will always be some problem that exposes the gap in maximum capability. While difficult, humans can build highly reliable complex systems; the computer itself is an example, and that all the hardware interfaces together so well and works so often is remarkable.

Second, per-step reliability compounds: at 99% per step a 20-step pipeline only succeeds about 82% of the time, and at 90% per step it succeeds roughly 12% of the time, a system that rarely works. For a long pipeline to work well above 50%, it really needs some steps that are effectively at 100%.
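
A quick sketch of that compounding, assuming independent steps (the per-step numbers are mine, for illustration):

    # End-to-end success rate of a pipeline with independent steps.
    def pipeline_success(per_step: float, steps: int) -> float:
        return per_step ** steps

    for p in (0.999, 0.99, 0.90):
        print(f"{p:.1%} per step, 20 steps -> {pipeline_success(p, 20):.1%} end to end")

That prints roughly 98%, 82%, and 12% respectively, which is why near-perfect individual steps matter so much in long pipelines.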


This comment makes the assumption that the software is cloud based and all that matters is uptime.

I used to work on a backup application that ran locally on our clients' machines. We had over 10,000 clients. 99.9% reliability would mean roughly 10 of our customers having a problem at any given time. It's not a question of uptime; in this case it's a question of data integrity. So 99.9% reliability could potentially leave us open to 10 lawsuits, and to about 10 support calls per day.

Now we only had about 10k customers at the time. Imagine if it were millions.
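
The scaling is simple to sketch (my numbers, assuming a flat 0.1% failure rate):

    # Expected number of customers affected at a 0.1% failure rate.
    failure_rate = 1 - 0.999
    for customers in (10_000, 1_000_000):
        print(f"{customers:,} customers -> ~{customers * failure_rate:.0f} affected")

Ten affected customers at 10k becomes a thousand at a million, with the support load and legal exposure scaling along with it.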


Currently I'm thinking about how furious the developers get any time Jenkins has any kind of hiccup, even when the fix is just "re-run the workflow" - and those are just network timeouts! I don't want to imagine the tickets if the CI system started spitting out hallucinations...


This may not be about internal business processes. In e-commerce, 90 seconds can be a lot of lost revenue, and in mission-critical applications such as telecommunications or air traffic control it would be downright disastrous (ever heard of five-nines availability?).


A lot of deterministic systems externalize their edge cases to the user. The software design doesn't fully match the reality of how it gets used. AI can be far more flexible in the face of dynamic and variable requirements.




