The problem is that the training data doesn't contain a lot of "I don't know".

pegasus · 2025-10-31T12:49:45 1761914985

The bigger problem is that the benchmarks and multiple-choice tests these models are trained to optimize for don't distinguish between a wrong answer and "I don't know", which is stupid and surprising. There was a thread here on HN about this recently.
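To make the scoring point concrete, here is a rough sketch (not any real benchmark's grader; the function names and the penalty values are made up for illustration). It assumes a correct answer scores 1, "I don't know" scores 0, and a wrong answer costs `wrong_penalty`. With plain accuracy the penalty is 0, so guessing never scores worse than abstaining.

```python
ABSTAIN_SCORE = 0.0  # assumption: "I don't know" earns nothing and costs nothing


def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering when the answer is right with probability p_correct."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def best_action(p_correct: float, wrong_penalty: float) -> str:
    """Pick whichever of answering or abstaining has the higher expected score."""
    if expected_score(p_correct, wrong_penalty) > ABSTAIN_SCORE:
        return "answer"
    return "abstain"


if __name__ == "__main__":
    # Hypothetical confidence levels; 0.25 is a blind guess on a 4-option question.
    for p in (0.25, 0.5, 0.9):
        print(p, best_action(p, wrong_penalty=0.0), best_action(p, wrong_penalty=1.0))
```

Under accuracy-only grading (`wrong_penalty=0.0`) a blind guess is always the better move. Once wrong answers cost something, abstaining wins whenever confidence is at or below `wrong_penalty / (1 + wrong_penalty)`.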

astrange · 2025-10-31T21:21:47 1761945707

That's not important compared to the post-training RL, which isn't "training data".