
I have some questions/curiosities from a technical implementation perspective that I hope someone more in the know about ML, LLMs, and AI than I am can answer.

Obviously there's a reason for dropping the price of gpt-4o but not gpt-4t. Yes, the new tokenizer has improvements for non-English tokens, but that can't be the bulk of the reason why 4t is more expensive than 4o. Given the multi-modal training set, how is 4o cheaper to train/run than 4t?
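To make the tokenizer point concrete, here's a minimal sketch using the tiktoken package, assuming the commonly cited pairing of cl100k_base with gpt-4-turbo and o200k_base with gpt-4o; the sample sentence is arbitrary and I haven't measured the exact counts:

    # Sketch: compare token counts for the same non-English text under the
    # older and newer encodings (assumes tiktoken is installed).
    import tiktoken

    old = tiktoken.get_encoding("cl100k_base")  # gpt-4-turbo era encoding
    new = tiktoken.get_encoding("o200k_base")   # gpt-4o era encoding

    text = "Merhaba, bugün hava çok güzel."      # arbitrary non-English sample
    print(len(old.encode(text)), len(new.encode(text)))
    # Fewer tokens for the same text means fewer billed tokens per request,
    # but that alone shouldn't account for a ~50% price difference.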

Or is this just a business decision: anyone with an app they aren't immediately updating from 4t to 4o keeps paying a premium, while a cheaper alternative is available for those who ask for it (kind of like a coupon policy)?
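A back-of-the-envelope comparison of what that premium looks like; the per-million-token list prices below are what I recall from the announcement and should be treated as assumptions, not current pricing:

    # Rough monthly cost comparison; prices in $ per 1M tokens are assumed.
    PRICES = {
        "gpt-4-turbo": {"input": 10.00, "output": 30.00},
        "gpt-4o":      {"input": 5.00,  "output": 15.00},
    }

    def monthly_cost(model, input_tokens, output_tokens):
        p = PRICES[model]
        return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

    # e.g. a hypothetical app doing 200M input / 50M output tokens per month
    for m in PRICES:
        print(m, f"${monthly_cost(m, 200_000_000, 50_000_000):,.0f}")
    # Until that app migrates, it keeps paying roughly 2x per token.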



GPT-4o is multi-modal but probably fully dense, like GPT-2, whereas GPT-4t is semi-sparse, like GPT-3. That would imply GPT-4o needs fewer layers to achieve the same number of parameters and the same amount of transformation.
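For a sense of why a "semi-sparse" layer is cheaper to run per token, here's a rough sketch comparing full (dense) attention with GPT-3-style locally banded sparse attention. All sizes are made up for illustration, it ignores parameter count entirely, and it isn't a claim about how either model is actually built:

    # Rough per-layer attention FLOP estimate (illustrative numbers only).
    def dense_attention_flops(seq_len, d_model):
        # QK^T plus attention-weighted V: ~2 * seq_len^2 * d_model each
        return 4 * seq_len**2 * d_model

    def banded_attention_flops(seq_len, d_model, band=256):
        # each position attends only to a local window of `band` positions
        return 4 * seq_len * band * d_model

    n, d = 8192, 12288  # hypothetical context length and hidden size
    print(f"dense : {dense_attention_flops(n, d) / 1e12:.1f} TFLOPs/layer")
    print(f"banded: {banded_attention_flops(n, d) / 1e12:.1f} TFLOPs/layer")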



