
Yeah, that's a very big caveat; I haven't checked neo 20b yet. I've had a hard time getting the AI21 models to use it, and those are also pretty big, so it's interesting that it sometimes works and sometimes doesn't. (The ordering I've seen is Davinci > Codegen Davinci > Curie > J-6B.) Fine-tunes can also learn to do the inner monologue, which is really cool. I'm not sure how much of that is architecture vs. training parameters.

