> the research team I work with has about 1/20th the compute that a Google researcher has. This gives them a massive advantage
The search space could easily be so large that a less disciplined approach might yield fewer useful outcomes even with the compute advantage. Being forced to be focused and disciplined might actually be a big advantage to you.
It’s hard to be disciplined about a black box though. That’s one reason why we’re all speeding off at a thousand miles per hour on transformers - the architecture works, why try other things?
To be fair I don't think anybody saw the boom in LLMs coming from the initial Attention paper. At the time it was one of many ideas that sounded like they had potential.
The search space could easily be so large that a less disciplined approach might yield fewer useful outcomes even with the compute advantage. Being forced to be focused and disciplined might actually be a big advantage to you.