
I think they are mainly -dev and -schnell. Both models are 12B. -pro is the most powerful and raw, -dev is a guidance-distilled version of it, and -schnell is a step-distilled version (where you can get pretty good results with 2-8 steps).
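For anyone unfamiliar with the term, here is a rough toy sketch of what guidance distillation generally means. With classifier-free guidance, each sampling step runs the model twice (with and without the prompt) and blends the two outputs using a guidance scale. A guidance-distilled student is trained to take the scale as an extra input and reproduce the blended output in a single pass. Everything below is an assumption for illustration: `teacher`, `cfg_predict` and `student` are hypothetical names, the "models" are plain arithmetic on lists, and none of this is the actual FLUX code or training recipe.

```python
# Toy illustration only: the "models" are simple functions, not neural networks.

def teacher(x, prompt):
    """Hypothetical teacher denoiser. prompt=None means the unconditional pass."""
    shift = 0.0 if prompt is None else float(len(prompt))
    return [0.5 * v + shift for v in x]

def cfg_predict(x, prompt, scale):
    """Classifier-free guidance: two teacher passes per step, then a blend."""
    uncond = teacher(x, None)      # pass 1: no prompt
    cond = teacher(x, prompt)      # pass 2: with prompt
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

def student(x, prompt, scale):
    """Hypothetical guidance-distilled student: one pass, scale is an input.

    A real student would be a network trained to match cfg_predict; here we
    write the closed form that this toy teacher happens to imply.
    """
    return [0.5 * v + scale * float(len(prompt)) for v in x]

x = [1.0, 2.0]
guided = cfg_predict(x, "cat", 3.5)   # costs two teacher evaluations
distilled = student(x, "cat", 3.5)    # costs one student evaluation
print(guided, distilled)
```

The point of the sketch is only the cost difference: the guided teacher needs two evaluations per step, while the distilled student needs one. Step distillation (as in -schnell) is a separate idea that reduces the number of steps rather than the cost per step.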


what does guidance distilled mean?

Something about pro must be better than dev or it wouldn't be made API-only, but what exactly? How does guidance distilling affect it, and what quality remains in dev?



