I completely agree — open-source models and custom deployments just can't compete on cost and efficiency here. The only exception is if open-source models can get much smaller and faster than they are now while maintaining their current quality. That would make private deployments and custom fine-tuning far more likely.
Or FOSS models remain the same size and speed, but the hardware for running them, especially locally, steadily improves until the AI is "good enough" for a large enough segment of the market.
Probably some combination of all the above! I think 1 and 2 are interlinked though — the cheaper they can be, the more they build that moat. They might be eating the cost on these APIs too, but unlike the Uber/Lyft war, it'll be way stickier.
I actually expect open-source models will be small _but larger than they are today_, because phones and laptops will get dedicated chips and software for running, e.g., the best open-source (weights?) model.
So eventually you could be running decent-sized models locally (iOS could even provide an API with fine-tuning, etc.).
We've been working on making data teams more productive with Aqueduct for over a year, and we're really excited to share what we've been building.
There's a large (and growing!) number of programmers in the world who understand data and can solve business problems, but who don't want to spend their time wrangling low-level infrastructure just to get their work into the cloud. The existing MLOps tools that claim to solve this problem have been built by and for software teams, and they're incredibly complicated.
With Aqueduct, we've built a tool that's designed for data teams and abstracts away the underlying infrastructure. Aqueduct has a simple Python API that allows you to define a workflow as a composition of Python functions. Those workflows can be easily connected to data sources and can be run anywhere from your laptop to a Kubernetes cluster in the cloud. Once a workflow's running, Aqueduct has lightweight hooks to compute metrics and run tests over your pipelines to ensure they're producing high-quality results.
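To make the idea concrete, here's a minimal sketch of the pattern described above: a workflow built as a composition of plain Python functions, with a lightweight metric and check attached to the output. Note this is an illustrative toy, not Aqueduct's actual API — all function names here (`extract`, `transform`, `mean_value`, `run_workflow`) are hypothetical stand-ins.

```python
def extract():
    # Stand-in for reading rows from a connected data source.
    return [{"id": 1, "value": 10.0}, {"id": 2, "value": 12.5}]

def transform(rows):
    # One pipeline step: double each value.
    return [{**r, "value": r["value"] * 2} for r in rows]

def mean_value(rows):
    # A "metric" computed over the pipeline's output.
    return sum(r["value"] for r in rows) / len(rows)

def run_workflow():
    # The workflow is just a composition of the functions above.
    rows = transform(extract())
    metric = mean_value(rows)
    # A lightweight check: fail loudly if the output looks wrong.
    assert metric > 0, "mean value should be positive"
    return rows, metric

rows, metric = run_workflow()
print(metric)  # 22.5
```

In a real deployment the same composition would be handed to an orchestrator that runs it on your laptop or a Kubernetes cluster; the point is that the author only writes Python functions.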
To learn more about what we're building, check out our GitHub repo or join our community Slack: