This thread is full of people complaining that they just want the ability to set...

tptacek · on Aug 24, 2024

Nobody's gaslighting you. It's not impossible to build this, though it is much more difficult than it seems (cloud billing is a large-scale eventually-consistent distributed system, and if you've done any distributed systems work the issue with plugging a system like that directly into a control loop should be obvious). It's just expensive to build, and disproportionately serves the interests of customers who aren't running real apps.

At $25k/mo spend, you can talk to many cloud providers (certainly including us) to work out an "I can't get invoiced for more than $25k" solution which will not involve having your app turned off abruptly when the 2,500,001st cent gets spent in October. What you'll notice in this thread is that people generally want billing caps for accounts they plan to spend, like, $10/mo on. And we can build cap systems for those people --- but they'll involve turning parts of the platform off for them.

I'm really pleading with people with strong opinions about hard caps to do the exercise of working out how these billing systems work. There are a huge number of apps running here, across a huge fleet of physicals, running in almost 40 regions around the world. Each of those apps has several different kinds of resources that meter at different granularities and incur different costs. Speaking as a witness to the creation of a new billing system just a month or two ago: it is kind of a miracle that these things work at all. Do the thought experiment, read some Call Me Maybe posts, and then tell me it's obvious that this feature should be straightforward to build.
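One way to run the thought experiment is a toy simulation of a cap enforcer that only sees usage after a reporting delay. This is a minimal sketch under invented assumptions, not a model of any real provider's billing system: the function name `simulate_overshoot`, the single constant spend rate, and the fixed `lag_ticks` delay are all hypothetical stand-ins for an eventually-consistent metering pipeline.

```python
def simulate_overshoot(cap_cents, spend_per_tick_cents, lag_ticks, max_ticks=10_000):
    """Return the true spend at the moment a lagged enforcer trips the cap.

    Hypothetical model: the enforcer only observes the running total as it
    stood `lag_ticks` ago, standing in for delayed usage reports.
    """
    true_spend = 0
    history = []  # true running total after each tick
    for tick in range(max_ticks):
        true_spend += spend_per_tick_cents
        history.append(true_spend)
        # The enforcer's view is the total from `lag_ticks` ticks in the past.
        visible_index = tick - lag_ticks
        observed = history[visible_index] if visible_index >= 0 else 0
        if observed >= cap_cents:
            return true_spend  # the cap fires here, possibly well past the limit
    return true_spend


if __name__ == "__main__":
    cap = 1000
    for lag in (0, 5, 50):
        final = simulate_overshoot(cap, spend_per_tick_cents=100, lag_ticks=lag)
        print(f"lag={lag:>2} ticks: spent {final} cents, overshoot {final - cap}")
```

In this toy model the overshoot is simply the lag multiplied by the spend rate. A real system has many resources metered at different granularities, each with its own delay, which is where the difficulty described above comes from.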

veggieroll · on Aug 25, 2024

It seems I've committed the cardinal sin of failing to specify my units. So to clarify, it's $25k per year.

But my mistake aside, I appreciate your reply. I saw in another comment thread that you have a blog post in the pipeline on building the new billing system. So I'm really looking forward to reading that. I've enjoyed reading other billing/payments content. And I'm sure your post will also be insightful and highly detailed.

I think it's so so fascinating that this feature is consistently solved at the contract/legal and customer support "layer" of the stack. That's really unpleasant to me, because it is a lot harder for me to wrap my head around the specifics of how different edge cases will play out.

Like as a programmer, I've built up all these skills on reading technical documentation, understanding systems, their limits, and complex interactions. But instead of using that muscle memory, I have to try to talk with a human and deal with the seemingly intentional vagueness of the legal system.

It feels to me a lot like the story from Mitchell Hashimoto about dealing with the bank for his startup, where he was dodging calls from his account executive and generally behaving in a way that the bank is not used to from enterprise clients. [0]

I'm ready to admit my behavior is anti-social and irrational here. But it is what comes naturally to me.

This is meandering now. But, I just want to sneak in a bit more info on my use cases, since you also mentioned that people are wearing you down and you just might implement this if forced to.

I build and run internal tools (think CRUD & reporting/analytics) for a small department in an extremely large enterprise. Our stuff is on the order of 99.9% available. So not particularly great but not terrible. But others are extremely bad. For example, one vendor has over 36 hours of scheduled downtime per month. And that system is way more critical to the business than mine.
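For a sense of scale, the two availability figures can be put on the same footing with a little arithmetic. This is a rough sketch that assumes a 30-day month; the helper names are made up for illustration.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month, 720 hours


def downtime_hours(availability_pct, period_hours=HOURS_PER_MONTH):
    """Hours of downtime implied by an availability percentage."""
    return period_hours * (1 - availability_pct / 100)


def availability_pct(downtime_hrs, period_hours=HOURS_PER_MONTH):
    """Availability percentage implied by a number of downtime hours."""
    return 100 * (1 - downtime_hrs / period_hours)


if __name__ == "__main__":
    print(f"99.9% available: {downtime_hours(99.9) * 60:.1f} minutes down per month")
    print(f"36 hours down:   {availability_pct(36):.1f}% available")
```

Under that assumption, 99.9% works out to roughly 43 minutes of downtime per month, while 36 hours of scheduled downtime corresponds to about 95% availability.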

So the standard that my coworkers in the department have come to expect is very low. If the tool is down, they just continue with their day doing some other task.

Many of the systems I manage are also purely background jobs. And no one would even notice if they were down for 12-24 hours.

Lastly, we have external backups for everything (on a different provider) and every system's deployment is automated from creating the VMs, networks, and block storage all the way through to installing system dependencies, the app, and data.

So, if a system were to magically get deleted some day, I'd get paged and have it back up in about an hour. And this is totally fine for our business.

On the other side though, there will be dire consequences for my career if we go over $25k annual spend. Even if the bill arrives and we have to contact support, it will give my management a heart attack and they will absolutely remember come review time.

Given this environment, I'd really appreciate the ability to protect myself against misconfiguration or leaked keys possibly getting me fired. The data will be fine. And systems can be restored quickly. But the damage to my reputation, compensation, and future job can't be restored quickly.

[0] https://mitchellh.com/writing/my-startup-banking-story

> Many of the systems I manage are also purely background jobs. And no one would even notice if they were down for 12-24 hours.