That drives up valuations in the short term and results in a very easy time to raise capital (aka Gold Rush). However what follows is a period where the market will turn the screws really hard on companies to show they have a viable and economically justifiable business or they’ll get washed out real quick.
We’re not in a bubble like dot com but it will get very uncomfortable for companies that don’t have a clear path to self-sustaining economics.
Maybe not exactly the same bubble, but it's hard not to make parallels.
Pets.com, one of the most comical cases of the 2000s, was worth $400M at peak.
When Chewy.com went public, they dwarfed that number with a new record of $13B.
The analogy was so obvious, that CNBC wrote an article on that [1].
After that $CHWY managed to go as high as $118/share, translating into a market cap of roughly $50 Billion in December 2021. Now they are trading at about half that price.
Isn’t the fact that they got cut in half (and other examples, Peloton for instance) evidence that this isn’t a bubble? There are companies that are exploding in value, yes, but also companies just getting crushed left and right. If you’re growing, your value is very high, but miss your targets and boom, bye bye froth.
Having been in or associated with many database startups, I would not invest in one. You need to evangelize a product on which the life of someone’s project or company depends. In such a situation, you want boring technology that just works. It's hard to imagine a new feature, or performance win, that justifies the vastly increased risk of obtaining it.
I'm not sure I agree with this - there have been many postgres features over the past ~10 years that got me excited, some of the highlights that immediately spring to mind:
- JSON / jsonb
- improved partitioning support
- identity columns
I agree that I wouldn't want to trust a new entrant to the space unless it was subject to rigorous testing, eg https://jepsen.io/
Postgres has the advantage that it's still open-source, can be run locally and self-hosted (and has a healthy ecosystem of managed providers) and won't go away anytime soon. Not all of these are true for these new DB startups.
> You need to evangelize a product on which the life of someone’s project or company depends.
I've actually seen the results of picking a database product whose vendor then ceased to exist. It wasn't pretty: at one point the product stopped being supported, so there were no more updates or fixes, it wasn't available in the repositories for new OS distros, and eventually even the documentation went offline. Having to support a system that integrated with it was an unpleasant experience, all the way up to it eventually being replaced with something else.
Therefore, it probably makes a lot of sense to base something as critical as your data storage layer on proven technologies that have demonstrated that they'll probably be supported in one form or another for the following years or even decades, unless you have a good reason for choosing something else.
Those reasons might deal with particular workloads or requirements, e.g. clustering solutions for PostgreSQL/MySQL/MariaDB/..., geospatial extensions, solutions to integrate with it through REST interfaces or even GraphQL or something like that, with a stable and proven piece of software still at the core of it all.
In case anyone is wondering, the product in question was Clusterpoint, which you can read a bit more about at https://en.wikipedia.org/wiki/Clusterpoint (a NoSQL database that actually predates MongoDB by a few years, as far as I know). Of course, now it seems like even their homepage is offline.
I think that open source database startups (or even ones that do data management, like Elasticsearch) have a couple of big problems (other than finding traction, obviously). One is that if you release your source code, it’s very easy for someone else to just copy the straightforward business model of selling it as SaaS.
The second one would be keeping your promises, or even competing with entrenched databases.
Even when you have gotten this far, one way you can exit is to get a single enterprise customer that wants to use it so badly they must acquire you.
The biggest lie I have seen across databases is actual time travel for your data. Or in other words given a date and time in the past it gives you what the data was then.
On the business model side of things, I think the most important thing to think about is the threat of the cloud providers. In the same way that MSFT was able to take most of the profits from the 90s/00s PC business, Apple was able to take most of the profits from the App Store, and Amazon takes most of the profits from the Amazon marketplace, the cloud providers are realistically going to take the vast majority of the profits from running your DB in the cloud. Elasticsearch has exactly this problem. If they're lucky, Amazon will buy them. If they're unlucky, Amazon will just steal their tech and take the profits (some would argue this already happened). They're by and large living in someone else's ecosystem.
> The biggest lie I have seen across databases is actual time travel for your data. Or in other words given a date and time in the past it gives you what the data was then.
I worked at a company where we built this for our internal HBase clusters. Given any timestamp (or even better a commit ID) we could restore the cluster to that point in time. This was done through a combination of backing up the store files + write ahead logs. As I recall we had a retention limit for how much time we could do this for - but they were high enough for all of the "oh crap" moments.
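For anyone who hasn't seen this done, here is a minimal sketch of the general idea (base snapshot plus write-ahead log replay), not the actual HBase tooling we used. `ToyStore` and all of its methods are made-up names for illustration. It assumes timestamps are strictly increasing, ignores deletes, and keeps the whole log forever instead of applying a retention limit.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical in-memory key-value store with point-in-time restore."""
    data: dict = field(default_factory=dict)
    wal: list = field(default_factory=list)        # (ts, key, value), in ts order
    snapshots: list = field(default_factory=list)  # (ts, copy of data), in ts order

    def put(self, ts, key, value):
        # Log the write first, then apply it (the "write-ahead" part).
        self.wal.append((ts, key, value))
        self.data[key] = value

    def snapshot(self, ts):
        # Stand-in for backing up the store files: a full copy of the state.
        self.snapshots.append((ts, dict(self.data)))

    def restore(self, target_ts):
        """Return the state as of target_ts without touching live data."""
        snap_times = [ts for ts, _ in self.snapshots]
        i = bisect_right(snap_times, target_ts)
        if i:
            # Newest snapshot taken at or before the target time.
            base_ts, base = self.snapshots[i - 1]
            state = dict(base)
        else:
            # No usable snapshot: replay the log from the very beginning.
            base_ts, state = float("-inf"), {}
        # Replay only the log entries between the snapshot and the target.
        for ts, key, value in self.wal:
            if base_ts < ts <= target_ts:
                state[key] = value
        return state
```

The trade-off is the usual one: more frequent snapshots mean less log to replay on restore, but more space spent on snapshots.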
That seems like it would burn a lot of space on store file snapshots to replay the relevant WAL atop.
Did they integrate with an underlying filesystem's snapshotting feature to make the store file snapshots differentials, or was that also implemented in the database? Or just threw space at the problem?
That's the reason they have created new licenses that poison the well for companies trying to offer SaaS services on top of them, Mongo and Elastic being two examples, or just gone with the GPL, as Neo4j did.
A big problem is that getting funding outside of the USA is not that easy, and here in Colombia this kind of project is not considered "profitable" by investors.
But I'm still convinced there is a lot of untapped potential in this area!
The first is RethinkDB [1], which was a darling for a while but couldn't seem to find a business model. I really wonder how much money it could've raised if it had come later. There were claims made that the cloud killed RethinkDB. This past year seems to fly in the face of that theory.
The second is: is this just database "success" (as measured by funding rounds and valuations) or is this just the general case with all startups? I honestly don't know.
Still, I wasn't aware of all these players and it was a good post so thanks for that.
Thanks a lot! I have never worked with RethinkDB specifically, but I can assure you it's only one of many names in the DBMS graveyard. Most have probably died because of an inability to monetize open-source software.
For DBMS startups that don't offer order-of-magnitude improvements, the problem is convincing anyone to replace a fundamental piece of infrastructure. For disruptive startups the problem is even bigger. As soon as you go open-source, all of your competitive technological advantage is gone.
This is the first I’m hearing of Unum. Their benchmarks are impressive. I’ve often wondered how much we are held back in performance by all the layers of abstraction on data storage (db, os, filesystem, disk layout) and it seems they’re able to significantly beat the incumbents, partly by collapsing down the layers.
I’m not sure what BLAS has to do with a key-value store.
> it seems they’re able to significantly beat the incumbents, partly by collapsing down the layers.
Hahaha, thanks for the kind words! Not sure if I could describe our methodology better)) We originally come from and hope to continue working on AGI technologies (Artificial General Intelligence).
Writing a key-value store and, subsequently, a DBMS came out of necessity, as other solutions seemed too bulky, slow, outdated and expensive. So we invested time into building UnumDB for internal use, but after the first benchmarks it was clear that this must be available to a broader audience. We will polish it in the next couple of months and make it deployable in public clouds, just like other DBMS brands.
Now DBMS is our central focus, but with essential R&D out of the way, we will rebalance our priorities and continue working on BLAS and sparse neural nets.
Anyone know about RavenDB? Came across them years ago. Seems neat, but I've never heard of anyone using them. They've also been a for-profit company for a while, but I'm not sure if they're venture-backed or self-sustaining.
Yeah, a lot of .NET shops backed Raven due to its integration with LINQ at the time. It was actually very powerful and feature rich. Haven't touched Raven in 8 years, although I have checked out their time series functionality.
It was on the longer list of smaller DBMS brands, but I was afraid to make the volume incomprehensible. Maybe we will write a more focused version, covering just smaller startups in 2022.