
What are the main shortcomings of the solutions you tried out?

We believe you need both: automatically creating the evaluation policies from OTEL data (data-first), and bringing in rigorous LLM-judge automation from the other end (intent-first) for the truly open-ended aspects.


