
Last year o3 high scored 88% on ARC-AGI-1 at more than $4,000/task. This model, at its X high configuration, scores 90.5% at just $11.64 per task.

General intelligence has gotten ridiculously less expensive. I don't know if it's because of compute and energy abundance, improved efficiency in attention mechanisms, or both, but we have to acknowledge the bigger picture and relative prices.
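Rough arithmetic on the figures quoted above (the o3 number is "more than $4,000", so this is a lower bound on the reduction):

```python
o3_cost_per_task = 4000.00   # reported o3 high cost per ARC-AGI-1 task (USD, lower bound)
new_cost_per_task = 11.64    # this model's quoted cost per task (USD)

reduction = o3_cost_per_task / new_cost_per_task
print(f"~{reduction:.0f}x cheaper per task")  # ~344x
```

So the per-task cost dropped by more than two orders of magnitude in about a year, while the score went up.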





Sure, but the reason I'm confused by the pricing is that the pricing doesn't exist in a vacuum.

Pro barely performs better than Thinking in OpenAI's published numbers, but comes at ~10x the price, with an explicit disclaimer that it's slow, on the order of minutes.

If the published performance numbers are accurate, it seems like it'd be incredibly difficult to justify the premium.

At least on the surface level, it looks like it exists mostly to juice benchmark claims.


It could be using the same early trick as Grok (at least in its earlier versions): spawn 10 agents that work on the problem in parallel, then take a consensus on the answer. That would explain both the price and the latency.

Essentially a newbie trick that works really well but isn't efficient, while still looking like an amazing breakthrough.

(if someone knows the actual implementation I'm curious)
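The parallel-agents-plus-consensus idea described above can be sketched as below. This is a toy illustration, not the actual implementation (which the comment notes is unknown); `solve` is a hypothetical stand-in for one independent model call, and the consensus step is a simple majority vote:

```python
import collections
import concurrent.futures

def solve(task, seed):
    # Hypothetical single-agent call: stands in for one model
    # completion sampled independently (e.g., with its own seed).
    # Here it returns a deterministic toy answer for illustration.
    return f"answer-for-{task}"

def consensus_solve(task, n_agents=10):
    """Run n_agents solvers in parallel and return the majority answer."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_agents) as pool:
        answers = list(pool.map(lambda s: solve(task, s), range(n_agents)))
    # Majority vote: the most common answer across agents wins.
    winner, votes = collections.Counter(answers).most_common(1)[0]
    return winner, votes

best, votes = consensus_solve("arc-task-001", n_agents=10)
```

The cost scales linearly with `n_agents` while accuracy gains flatten quickly, which is why it reads as an inefficient but effective trick.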


The magic number appears to be 12 in the case of GPT 5.2 pro.


