OpenAI is valued at $90 billion and all they do is make GPT; Twitter was valued at $40 billion, and it was essentially a vanity side project by a cowboy CEO. Presuming the benchmarks and the general "it's about the level of 3.5" consensus are accurate, it's inefficient, but not incredibly inefficient imho
They weren't even worth $44B when Elon took the keys; he specifically tried to back out of the deal because $44B was an insane peak-'21 asset-bubble price. In truth they were probably worth more like $10-15B at that moment. And now that a bunch of advertisers have left because of you-know-who, it's probably about $10B
Twitter was valued at around $30 billion when Musk tried getting out of buying it (then the market cap went up once it became clear he would be forced to pay full price)
Since it is MoE, a quantized version could run on cheaper hardware with just consumer networking in between, instead of needing Epyc/Xeon levels of PCIe lanes, NVLink, or InfiniBand-type networking. It could even run with people pooling smaller systems over slow internet links.
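To make the "cheaper hardware" intuition concrete, here's a back-of-envelope sketch of weight storage vs. per-token reads at different quantization levels. The figures use Grok-1's reported sizes (~314B total parameters, ~86B active per token from 2 of 8 experts); treat those numbers as assumptions, not spec.

```python
# Back-of-envelope memory estimate for a quantized MoE model.
# Parameter counts below are Grok-1's *reported* figures (assumption).
TOTAL_PARAMS = 314e9   # all experts combined
ACTIVE_PARAMS = 86e9   # experts actually used per token (2 of 8)

def gib(params: float, bits_per_param: int) -> float:
    """Weight storage in GiB at a given quantization bit width."""
    return params * bits_per_param / 8 / 2**30

for bits in (16, 8, 4):
    print(f"{bits}-bit: store ~{gib(TOTAL_PARAMS, bits):.0f} GiB total, "
          f"touch ~{gib(ACTIVE_PARAMS, bits):.0f} GiB per token")
```

The key point is the gap between the two columns: only the active-expert slice has to move per token, which is why slower interconnects between pooled boxes become plausible.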
Nothing about using data in "real time" requires the model's parameter count to be this large, and a model this size is likely quite inefficient for their "non-woke" instructional use case.
Agreed. We have been building our real-time GPT flows for news & social as part of Louie.AI, think monitoring & investigations... long-term, continuous training will become amazing, but for the next couple of years, most of our users would prefer GPT-4 or Groq over what's here, paired with much smarter RAG. More strongly, the interesting part is how the RAG is done. Qdrant is cool, but it's just a DB with a simple vector index, so nothing in Grok's release is tech we find relevant to our engine.
E.g., there is a lot of noise in social data, and worse, misinfo/spam/etc, so we spend a lot of energy on adversarial data integration. Likewise, queries are often neurosymbolic, like over a date range or with inclusion/exclusion criteria. Pulling the top 20 most similar tweets to a query and running them through a slow, dumb, & manipulated LLM would be a bad experience. We have been pulling in ideas from agents, knowledge graphs, digital forensics & SNA, code synthesis, GNNs, etc for our roadmap, which feels quite different from what is being shown here.
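The "neurosymbolic" query shape described above can be sketched as symbolic filters (date range, exclusion terms) applied before vector ranking. This is a toy illustration with hypothetical names, not Louie.AI's actual pipeline; real systems would push these filters into the vector DB rather than into Python.

```python
# Toy sketch: symbolic filters first, then rank survivors by similarity.
# Doc, search, and all field names here are illustrative (assumptions).
from dataclasses import dataclass
from datetime import date
import math

@dataclass
class Doc:
    text: str
    embedding: list[float]
    posted: date

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def search(docs, query_vec, since, until, exclude_terms, k=20):
    # Symbolic stage: hard date-range and exclusion-term constraints.
    candidates = [
        d for d in docs
        if since <= d.posted <= until
        and not any(t in d.text.lower() for t in exclude_terms)
    ]
    # Neural stage: rank only the survivors by embedding similarity.
    candidates.sort(key=lambda d: cosine(d.embedding, query_vec), reverse=True)
    return candidates[:k]
```

The point of the ordering is that spam or out-of-range items never reach the LLM at all, instead of hoping the top-k similarity cutoff happens to exclude them.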
We do have pure LLM work, but it's more about fine-tuning smaller or smarter models, and we find that to be a tiny % of what people care about. Ex: spam classifications flowing into our RAG/KG pipelines or small-model training matter more to us than flowing into big-model training. Long-term, I do expect growing emphasis on the big models we use, but that is a more nuanced discussion.
(We have been piloting with gov types and are preparing for the next cohorts, in case that's useful to anyone working on real problems.)