
Inference takes 30+ seconds before it starts outputting even the first token.


How's recall over 1M tokens?



