Rumor has it that GPT-4.5 is an order of magnitude larger: around 12 trillion parameters total (compared to GPT-4's 1.2 trillion). It's almost certainly MoE as well, just a scaled-up version, which would explain the cost. OpenAI also said this is what they originally developed as "Omni", the model that was supposed to succeed GPT-4 but fell short of expectations. So they renamed it 4.5 and shoehorned it in to stay in the news amid all the competitor releases.
Appreciate the corrections, but I'm still a bit puzzled. Which part is wrong: 4.5 having 12 trillion parameters, its originally being intended as Orion (not Omni), or its being the expected successor to GPT-4? And do you have any related reading that speaks to any of this?
That sounds right except for the total parameter count. 110B per expert at 16 experts puts you just shy of 1.8T. Are you suggesting there are ca. 30B shared params between the experts?
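The arithmetic here can be sketched out. All the figures below (16 experts, 110B per expert, ~30B shared) are the rumored numbers from this thread, not anything confirmed:

```python
# Back-of-the-envelope MoE parameter count using the rumored figures.
# None of these numbers are confirmed; they're the thread's assumptions.

def moe_total_params(num_experts: int, params_per_expert: float,
                     shared_params: float = 0.0) -> float:
    """Total checkpoint size of a mixture-of-experts model:
    every expert's weights exist on disk, plus any shared
    (attention/embedding) weights used on every token."""
    return num_experts * params_per_expert + shared_params

experts_only = moe_total_params(16, 110e9)        # 16 x 110B = 1.76T
with_shared = moe_total_params(16, 110e9, 30e9)   # + ~30B shared = 1.79T

print(f"{experts_only / 1e12:.2f}T experts only")
print(f"{with_shared / 1e12:.2f}T with ~30B shared")
```

So 16 experts alone land at 1.76T, and roughly 30B of shared parameters would bring the total to about 1.79T, i.e. "just shy of 1.8T" either way.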