> the research team I work with has about 1/20th the compute that a Google resea...

jahewson · on Aug 4, 2024

It’s hard to be disciplined about a black box though. That’s one reason why we’re all speeding off at a thousand miles per hour on transformers - the architecture works, why try other things?

jumpCastle · on Aug 4, 2024

Attention was invented because Bengio's lab had to be disciplined about a black box (Google had more compute).

exe34 · on Aug 4, 2024

whoever comes up with the next trick could win big.

lallysingh · on Aug 4, 2024

Whoever chooses the next winning lottery numbers could win big...

exe34 · on Aug 5, 2024

that's kinda my point, we don't know what the next winning move will be.

jack_pp · on Aug 4, 2024

Could, or Meta makes you irrelevant by reproducing your trick and giving it away for free.

kranke155 · on Aug 4, 2024

Yann LeCun certainly seems to be one of the more interesting people in the space. He’s a notable skeptic of LLM intelligence and incredibly smart.

exe34 · on Aug 4, 2024

To be fair I don't think anybody saw the boom in LLMs coming from the initial Attention paper. At the time it was one of many ideas that sounded like they had potential.