
This thread is full of people complaining that they just want the ability to set a hard billing cap. And yet, providers continue to insist that no serious customer wants this and/or it's impossible to implement.

What would it take for providers to listen to real customers here?

I have $25k in cloud spend that we absolutely cannot go one single cent over due to the politics of internal budgeting. That's my reality. If you want my $25k, I need to ensure that I don't spend more than this amount.

As is, my solution is to use old-school pre-rented, long-term contracted commitment VM hosting providers. This is really the only way to guarantee that you are paying an exact amount and no more.

But, I really would like to use a more scalable system that didn't require pre-provisioning. And, I wish people would believe the customer when they say something and not continue to gaslight us.

Providers say it's impossible, but I don't see how it would be so hard. Here's my sketch of how it could work:

The main component is a system that monitors billing events and watches for the slope of the bill to ensure that there is enough runway to stay under the cap. Optionally, they could implement rate limits on resource creation to ensure that a sudden surge doesn't outrun the monitoring.

You also need notifications for when the projected spend exceeds the cap. Optionally, you could implement a soft cap, where no new resources can be created.

And finally, you need the hard cap where things start to get deleted. If they're feeling generous, the provider could implement a grace period where VMs, lambdas, etc. are shut off, blob storage is not accessible, and so on, so that the account holder has some time to fund the account and/or fix whatever is causing the overage.
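A minimal sketch of the monitoring logic described above, in Python. Everything here is hypothetical: the names (`BillingSnapshot`, `evaluate`, `CapState`), the 90% soft-cap threshold, and above all the assumption that an accurate, current spend figure and burn rate are available to the monitor.

```python
from dataclasses import dataclass
from enum import Enum


class CapState(Enum):
    OK = "ok"
    NOTIFY = "notify"      # projected spend exceeds the cap: warn the account holder
    SOFT_CAP = "soft_cap"  # no new resources may be created
    HARD_CAP = "hard_cap"  # resources are shut off, and later deleted


@dataclass
class BillingSnapshot:
    spent: float       # dollars billed so far in the period
    burn_rate: float   # dollars per hour, i.e. the slope of the bill
    hours_left: float  # hours remaining in the billing period


def projected_spend(s: BillingSnapshot) -> float:
    """Linear projection of the bill to the end of the period."""
    return s.spent + s.burn_rate * s.hours_left


def runway_hours(s: BillingSnapshot, cap: float) -> float:
    """Hours until the cap is reached at the current burn rate."""
    if s.burn_rate <= 0:
        return float("inf")
    return max(cap - s.spent, 0.0) / s.burn_rate


def evaluate(s: BillingSnapshot, cap: float, soft_fraction: float = 0.9) -> CapState:
    """Map a snapshot to an enforcement state (soft_fraction is an assumed threshold)."""
    if s.spent >= cap:
        return CapState.HARD_CAP
    if s.spent >= soft_fraction * cap:
        return CapState.SOFT_CAP
    if projected_spend(s) > cap:
        return CapState.NOTIFY
    return CapState.OK


# Example: $20k spent against a $25k cap, burning $10/hour with 720 hours left.
snapshot = BillingSnapshot(spent=20_000.0, burn_rate=10.0, hours_left=720.0)
print(evaluate(snapshot, cap=25_000.0), runway_hours(snapshot, cap=25_000.0))
```

In this example the account is still under the soft cap, but the projection overshoots, so the state is `NOTIFY` with 500 hours of runway.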

That set of features is totally within the competency of a cloud provider. Knowing how much things cost, billing for them, and turning them on and off is their main business. And I can't believe that they expect us to believe that it's impossible to do that tracking.

This is how I do system migrations. There are escalating warnings until one day the service is shut off for an hour and turned back on. That wakes up any laggards that missed the dozen or so communications over a three-to-six month period. Finally, after a few more days, the service is shut down for good and then data is deleted a week after that. Though I almost always keep a copy in cold storage. But that isn't necessary as a provider with a limited relationship to the client.



Nobody's gaslighting you. It's not impossible to build this, though it is much more difficult than it seems (cloud billing is a large-scale eventually-consistent distributed system, and if you've done any distributed systems work the issue with plugging a system like that directly into a control loop should be obvious). It's just expensive to build, and disproportionately serves the interests of customers who aren't running real apps.

At $25k/mo spend, you can talk to many cloud providers (certainly including us) to work out an "I can't get invoiced for more than $25k" solution which will not involve having your app turned off abruptly when the 2,500,001st cent gets spent in October. What you'll notice in this thread is that people generally want billing caps for accounts they plan to spend, like, $10/mo on. And we can build cap systems for those people --- but they'll involve turning parts of the platform off for them.

I'm really pleading with people with strong opinions about hard caps to do the exercise of working out how these billing systems work. There are a huge number of apps running here, across a huge fleet of physicals, running in almost 40 regions around the world. Each of those apps has several different kinds of resources that meter at different granularities and incur different costs. Speaking as a witness to the creation of a new billing system just a month or two ago: it is kind of a miracle that these things work at all. Do the thought experiment, read some Call Me Maybe posts, and then tell me it's obvious that this feature should be straightforward to build.
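To make the control-loop concern above concrete, here is a back-of-the-envelope sketch in Python. The function names and all of the numbers (burn rate, metering lag, shutdown time) are invented for illustration, not measurements from any real platform. The point is only that when usage records arrive late, enforcement has to trigger well before the observed bill reaches the cap.

```python
def worst_case_overshoot(burn_rate_per_hour: float,
                         metering_lag_hours: float,
                         shutdown_hours: float) -> float:
    """Dollars that can accrue after the true bill hits the cap but before
    the lagging meter shows it and enforcement finishes turning things off."""
    return burn_rate_per_hour * (metering_lag_hours + shutdown_hours)


def enforcement_threshold(cap: float,
                          burn_rate_per_hour: float,
                          metering_lag_hours: float,
                          shutdown_hours: float) -> float:
    """Observed spend at which enforcement must start to guarantee the cap."""
    overshoot = worst_case_overshoot(
        burn_rate_per_hour, metering_lag_hours, shutdown_hours
    )
    return cap - overshoot


# Assumed scenario: a leaked key burns $200/hour, usage records show up
# 6 hours late, and shutting everything off takes another half hour.
print(worst_case_overshoot(200.0, 6.0, 0.5))             # 1300.0
print(enforcement_threshold(25_000.0, 200.0, 6.0, 0.5))  # 23700.0
```

Note that the guarantee only holds if the burn rate itself is bounded, and each resource type would need its own lag and granularity figures.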


It seems I've committed the cardinal sin of failing to specify my units. So to clarify, it's $25k per year.

But my mistake aside, I appreciate your reply. I saw in another comment thread that you added that you have a blog post in the pipeline on building the new billing system. So I'm really looking forward to reading that. I've enjoyed reading other billing/payments content. And I'm sure your post will also be insightful and highly detailed.

I think it's so so fascinating that this feature is consistently solved at the contract/legal and customer support "layer" of the stack. That's really unpleasant to me, because it is a lot harder for me to wrap my head around the specifics of how different edge cases will play out.

Like as a programmer, I've built up all these skills on reading technical documentation, understanding systems, their limits, and complex interactions. But instead of using that muscle memory, I have to try to talk with a human and deal with the seemingly intentional vagueness of the legal system.

It feels to me a lot like the story from Mitchell Hashimoto about dealing with the bank for his startup, where he was dodging calls from his account executive and generally behaving in a way that the bank is not used to from enterprise clients. [0]

I'm ready to admit my behavior is anti-social and irrational here. But, it is what comes naturally to me.

This is meandering now. But, I just want to sneak in a bit more info on my use cases, since you also mentioned that people are wearing you down and you just might implement this if forced to.

I build and run internal tools (think CRUD & reporting/analytics) for a small department in an extremely large enterprise. Our stuff is on the order of 99.9% available. So not particularly great but not terrible. But, others are extremely bad. For example, one vendor has over 36 hours of scheduled downtime per month. And that system is way more critical to the business than mine.
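For scale, a quick Python conversion between availability percentages and downtime, assuming a 30-day month (the helper names are just for illustration):

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month, 720 hours


def availability(downtime_hours: float, period_hours: float = HOURS_PER_MONTH) -> float:
    """Fraction of the period the system is up."""
    return 1 - downtime_hours / period_hours


def allowed_downtime_minutes(avail: float, period_hours: float = HOURS_PER_MONTH) -> float:
    """Downtime budget, in minutes, for a given availability fraction."""
    return (1 - avail) * period_hours * 60


# 99.9% available leaves roughly 43 minutes of downtime per month.
print(round(allowed_downtime_minutes(0.999), 1))

# 36 hours of downtime per month works out to about 95% availability.
print(round(availability(36) * 100, 1))
```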

So the standard that my coworkers in the department have come to expect is very low. If the tool is down, they just continue with their day doing some other task.

Many of the systems I manage are also purely background jobs. And no one would even notice if they were down for 12-24 hours.

Lastly, we have external backups for everything (on a different provider) and every system's deployment is automated from creating the VMs, networks, and block storage all the way through to installing system dependencies, the app, and data.

So, if a system were to magically get deleted some day, I'd get paged and have it back up in about an hour. And this is totally fine for our business.

On the other side though, there will be dire consequences for my career if we go over $25k annual spend. Even if the bill arrives and we have to contact support, it will give my management a heart attack and they will absolutely remember come review time.

Given this environment, I'd really appreciate the ability to protect myself against misconfiguration or leaked keys causing me to get possibly fired. The data will be fine. And systems can be restored quickly. But the damage to my reputation, compensation, and future job can't be restored quickly.

[0] https://mitchellh.com/writing/my-startup-banking-story




