
Yes, I read it and specifically pointed it out (that's why there are 3 hours of interactive logs). Four other runs are pushed now, along with the baseline, so you can see what actual clean-room runs look like for 5.2 xhigh, 5.3-Codex xhigh, 5.4 xhigh, and Opus 4.6 ultrathink: https://github.com/lhl/claudecycles-revisited/blob/main/COMP...

