Yes, openAI is dumping the market with chat-gpt 3.5. Vulture capital behaviour a...

sacred_numbers · on Sept 12, 2023

Based on my research, GPT-3.5 is likely significantly smaller than 70B parameters, so it would make sense that it's cheaper to run. My guess is that OpenAI significantly overtrained GPT-3.5 to get as small a model as possible and optimize for inference. Also, Nvidia chips are way more efficient at inference than an M1 Max. OpenAI also has the advantage of batching API calls, which leads to better hardware utilization. I don't have definitive proof that they're not dumping, but economies of scale and optimization seem like better explanations to me.
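To put rough numbers on the size and batching argument, here is a back-of-the-envelope sketch. It assumes the common rule of thumb of about 2 FLOPs per parameter per generated token for a dense model, and that a batched decode step reads the weights from memory once for the whole batch. The parameter counts are just the guesses floated in this thread, not confirmed figures, and the function names are made up for illustration.

```python
# Back-of-the-envelope sketch. All parameter counts below are guesses
# from this thread, not confirmed sizes of any OpenAI model.

def flops_per_token(n_params: int) -> int:
    """Approximate forward-pass FLOPs per generated token for a dense
    model, using the ~2 FLOPs per parameter rule of thumb."""
    return 2 * n_params


def weight_bytes_per_token(n_params: int, bytes_per_param: int, batch_size: int) -> float:
    """Weight bytes read from memory per generated token when one decode
    step is shared by `batch_size` requests (weights are read once per step)."""
    return n_params * bytes_per_param / batch_size


# Hypothetical sizes mentioned by various commenters.
GUESSES = {
    "6.7B": 6_700_000_000,
    "70B": 70_000_000_000,
    "175B": 175_000_000_000,
}

if __name__ == "__main__":
    for name, n in GUESSES.items():
        unbatched = weight_bytes_per_token(n, bytes_per_param=2, batch_size=1)
        batched = weight_bytes_per_token(n, bytes_per_param=2, batch_size=32)
        print(f"{name}: {flops_per_token(n):.2e} FLOPs/token, "
              f"{unbatched:.2e} B/token unbatched, {batched:.2e} B/token at batch 32")
```

Under these assumptions, a model ten times smaller needs roughly ten times fewer FLOPs per token, and batching cuts the per-token memory traffic for weights by the batch size, which is the kind of advantage a single-user local setup doesn't get.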

csjh · on Sept 12, 2023

What makes you think 3.5 is significantly smaller than 70B?

hutzlibu · on Sept 12, 2023

I also do not have proof of anything here, but can't it be both?

They have lots of money now and the market lead. They want to keep the lead and some extra electricity and hardware costs are surely worth it for them, if it keeps the competition from getting traction.

haxton · on Sept 12, 2023

gpt3.5 turbo is (most likely) Curie, which is (most likely) 6.7b params. So, yeah, makes perfect sense that it can't compete with a 70b model on cost.

JackRumford · on Sept 14, 2023

These sites say 154B:

https://www.ankursnewsletter.com/p/gpt-4-gpt-3-and-gpt-35-tu...

https://blog.wordbot.io/ai-artificial-intelligence/gpt-3-5-t...

why_only_15 · on Sept 12, 2023

gpt3.5 turbo is a new model, not Curie. As others have stated, it probably uses Mixture of Experts, which lowers inference cost.
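A minimal sketch of why a Mixture of Experts model can be cheaper to run than its total size suggests: each token is routed to only a few experts, so the parameters actually used per token are a fraction of the total. Whether GPT-3.5 turbo uses MoE at all is speculation, and every number in the config below is hypothetical, chosen only to illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEConfig:
    """Toy description of a Mixture of Experts model (illustrative only)."""
    shared_params: int      # attention, embeddings, etc. used by every token
    params_per_expert: int  # size of one expert's feed-forward weights
    n_experts: int          # experts stored in the model
    top_k: int              # experts each token is routed to

    def total_params(self) -> int:
        """Parameters that must be stored."""
        return self.shared_params + self.n_experts * self.params_per_expert

    def active_params(self) -> int:
        """Parameters actually used to process a single token."""
        return self.shared_params + self.top_k * self.params_per_expert


# Hypothetical configuration, NOT a claim about any real model.
HYPOTHETICAL = MoEConfig(
    shared_params=10_000_000_000,
    params_per_expert=20_000_000_000,
    n_experts=8,
    top_k=2,
)

if __name__ == "__main__":
    print("total params: ", HYPOTHETICAL.total_params())
    print("active params:", HYPOTHETICAL.active_params())
```

With these made-up numbers the model stores 170B parameters but touches only 50B per token, so per-token compute looks like a much smaller dense model while memory requirements do not shrink.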

csjh · on Sept 12, 2023

Is there a source on that? I've never seen anyone suggest it's even below 70B.

ronyfadel · on Sept 12, 2023

Even at 6.7b params, it still does a much better job at translation than llama 2 70b.

two_in_one · on Sept 12, 2023

If it's MOE that may explain why it's faster and better...

yumraj · on Sept 12, 2023

sarthaksrinivas · on Sept 12, 2023

Mixture of Experts Model - https://en.wikipedia.org/wiki/Mixture_of_experts

jiggawatts · on Sept 12, 2023

I thought it was fairly well established that GPT 3.5 has something like 130B parameters and that GPT 4 is on the order of 600-1,000B.

avion23 · on Sept 13, 2023

I remember:

- gpt-3.5 175b params

- gpt-4 1800b params

PUSH_AX · on Sept 12, 2023

You think they are caching? Even though one of the parameters is temperature? That's a can of worms, and it should be reflected in the pricing if true. Don't get me started if they are charging per token for cached responses.

I just don't see it.

why_only_15 · on Sept 12, 2023

You can keep around the KV cache from previous generations, which lowers the cost of prompts significantly.
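Roughly, the idea looks like the toy sketch below. The keys and values for a prompt depend only on the input tokens, not on sampling settings like temperature, so when a new request starts with tokens that were already processed, only the new suffix needs fresh computation. The `PrefixKVCache` class is hypothetical: it tracks token counts only, stores no real tensors, and isn't a description of how OpenAI actually serves requests.

```python
from typing import List, Sequence


def common_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of leading tokens shared by two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class PrefixKVCache:
    """Toy prefix cache: remembers token sequences already processed and
    reports how many tokens of a new prompt still need K/V computation."""

    def __init__(self) -> None:
        self._seen: List[List[int]] = []

    def tokens_to_compute(self, tokens: Sequence[int]) -> int:
        # Reuse the longest prefix that matches any earlier sequence.
        reused = max((common_prefix_len(tokens, s) for s in self._seen), default=0)
        self._seen.append(list(tokens))
        return len(tokens) - reused


if __name__ == "__main__":
    cache = PrefixKVCache()
    conversation = [1, 2, 3, 4]                 # first request: nothing cached yet
    print(cache.tokens_to_compute(conversation))
    follow_up = conversation + [5, 6]           # same history plus a new turn
    print(cache.tokens_to_compute(follow_up))
```

In a chat setting each request repeats the whole conversation so far, so a cache like this would leave only the latest turn to process. The sampled output can still differ between requests, since temperature only affects how the next token is picked, not the cached keys and values.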

read_if_gay_ · on Sept 12, 2023

turbo is likely nowhere near 70b.