At that scale, a complete outage is unlikely. I have services which haven't gone...

chrisandchris · on March 12, 2021

Said OVH. Then their datacenter burned to the ground.

Said Oracle. Then their DNS was misconfigured and their whole cloud went offline for 2 hours [1].

Shit happens, always, at all scales.

[1] https://ocistatus.oraclecloud.com/incidents/qjxllgkywysj

Edit: typo

andrewaylett · on March 12, 2021

OVH didn't suffer a complete outage. If you were relying on that single DC, then you're probably not sufficiently large for this to apply to you.

But perhaps my point wasn't made clearly enough: a claim of "100% uptime" at the service level isn't particularly _useful_ when our users still only see a 99.9% success rate.

chaz6 · on March 12, 2021

I think the weak point is their domain name. I think cloud providers should have a second domain, with a different registrar and managed completely independently, so that if one is subject to a problem (hijacking, DNS outage, etc.) clients can fall back to the alternative domain.
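
For illustration, the client side of that fallback might look roughly like the sketch below. This is only a minimal sketch under assumptions: the hostnames are hypothetical placeholders on reserved example domains, `fetch_with_fallback` and `default_fetch` are names made up here, and no provider is known to offer exactly this.

```python
import urllib.request
from typing import Callable, Sequence


def default_fetch(url: str, timeout: float = 5.0) -> bytes:
    """Fetch a URL with the standard library and return the raw body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_fallback(
    hosts: Sequence[str],
    path: str,
    fetch: Callable[[str], bytes] = default_fetch,
) -> bytes:
    """Try each host in order and return the first successful response.

    DNS failures, refused connections, timeouts and urllib's URLError are
    all subclasses of OSError, so one except clause covers the cases
    where a whole domain is unreachable.
    """
    errors = []
    for host in hosts:
        url = f"https://{host}{path}"
        try:
            return fetch(url)
        except OSError as exc:
            errors.append((host, exc))  # remember why this host failed
    raise ConnectionError(f"all hosts failed: {errors}")


# Hypothetical usage: the second domain would sit with a different
# registrar and DNS provider than the first.
# body = fetch_with_fallback(
#     ["api.provider.example", "api.provider-backup.example"],
#     "/v1/status",
# )
```

The hosts are tried strictly in order, so the alternative domain is only contacted after the primary has failed. A real client would probably also want to remember which domain last worked rather than paying the timeout on every request.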

sp8 · on March 12, 2021

That literally happened, they blogged about it recently. https://www.backblaze.com/blog/recent-outages-why-we-acceler...