
I have some questions/curiosities from a technical implementation perspective that I hope someone more in the know about ML, LLMs, and AI than I am can answer.

Obviously there's a reason for dropping the price of gpt-4o but not gpt-4t. Yes, the new tokenizer has improvements for non-English tokens, but that can't be the bulk of the reason why 4t is more expensive than 4o. Given the multi-modal training set, how is 4o cheaper to train/run than 4t?
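To make the tokenizer point concrete, here's a minimal sketch using the tiktoken package, assuming the commonly cited pairing of cl100k_base with gpt-4-turbo and o200k_base with gpt-4o; the sample sentence is arbitrary and I haven't measured the exact counts:

    # Sketch: compare token counts for the same non-English text under the
    # older and newer encodings (assumes tiktoken is installed).
    import tiktoken

    old = tiktoken.get_encoding("cl100k_base")  # gpt-4-turbo era encoding
    new = tiktoken.get_encoding("o200k_base")   # gpt-4o era encoding

    text = "Merhaba, bugün hava çok güzel."      # arbitrary non-English sample
    print(len(old.encode(text)), len(new.encode(text)))
    # Fewer tokens for the same text means fewer billed tokens per request,
    # but that alone shouldn't account for a ~50% price difference.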

Or is this just a business decision: anyone with an app they aren't immediately updating from 4t to 4o keeps paying a premium, while a cheaper alternative is available for those who ask for it (kind of like a coupon policy)?
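A back-of-the-envelope comparison of what that premium looks like; the per-million-token list prices below are what I recall from the announcement and should be treated as assumptions, not current pricing:

    # Rough monthly cost comparison; prices in $ per 1M tokens are assumed.
    PRICES = {
        "gpt-4-turbo": {"input": 10.00, "output": 30.00},
        "gpt-4o":      {"input": 5.00,  "output": 15.00},
    }

    def monthly_cost(model, input_tokens, output_tokens):
        p = PRICES[model]
        return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

    # e.g. a hypothetical app doing 200M input / 50M output tokens per month
    for m in PRICES:
        print(m, f"${monthly_cost(m, 200_000_000, 50_000_000):,.0f}")
    # Until that app migrates, it keeps paying roughly 2x per token.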



GPT-4o is multi-modal but probably fully dense, like GPT-2, whereas GPT-4t is semi-sparse, like GPT-3. That would imply GPT-4o needs fewer layers to achieve the same number of parameters and the same amount of transformation.
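For a sense of why a "semi-sparse" layer is cheaper to run per token, here's a rough sketch comparing full (dense) attention with GPT-3-style locally banded sparse attention. All sizes are made up for illustration, it ignores parameter count entirely, and it isn't a claim about how either model is actually built:

    # Rough per-layer attention FLOP estimate (illustrative numbers only).
    def dense_attention_flops(seq_len, d_model):
        # QK^T plus attention-weighted V: ~2 * seq_len^2 * d_model each
        return 4 * seq_len**2 * d_model

    def banded_attention_flops(seq_len, d_model, band=256):
        # each position attends only to a local window of `band` positions
        return 4 * seq_len * band * d_model

    n, d = 8192, 12288  # hypothetical context length and hidden size
    print(f"dense : {dense_attention_flops(n, d) / 1e12:.1f} TFLOPs/layer")
    print(f"banded: {banded_attention_flops(n, d) / 1e12:.1f} TFLOPs/layer")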



