I've always resisted the change to GHA, preferring instead tried-and-tested Jenkins and Docker workflows. Sure, multi-agent GHA is nice, but when it's down, what do you do? We don't even work in offices to have swordfights in anymore.
I agree here; looking at it now, it seems I would need three pairs of hands to count how many times Actions has gone down each year. It feels like Actions itself is not ready for production use.
The problem is who will maintain and test it. Without DR (disaster recovery) test days it's worth nothing, because something will have changed and not been propagated.
Use the script in your regular CI/CD pipeline. Just add arguments for things that vary between GitHub runners and local machines, like secrets and paths.
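Roughly like this, as a sketch in Python (the script name, flags and rsync target are made up for illustration; the point is just that anything differing between a runner and a laptop comes in as an argument or environment variable):

    # deploy.py - hypothetical sketch, not a drop-in script
    import argparse, os, subprocess, sys

    def main():
        p = argparse.ArgumentParser(description="Deploy from CI or from a laptop")
        # Everything that differs between a GitHub runner and a local machine
        # is passed in explicitly, never hardcoded.
        p.add_argument("--artifact-dir", default=os.environ.get("ARTIFACT_DIR", "dist"))
        p.add_argument("--ssh-key", default=os.environ.get("DEPLOY_SSH_KEY_PATH"))
        p.add_argument("--target", default=os.environ.get("DEPLOY_TARGET", "deploy@prod.example.com"))
        args = p.parse_args()

        if not args.ssh_key:
            sys.exit("no SSH key: pass --ssh-key or set DEPLOY_SSH_KEY_PATH")

        # The same rsync call runs whether Actions or a human invokes it.
        subprocess.run(
            ["rsync", "-az", "-e", f"ssh -i {args.ssh_key}",
             f"{args.artifact_dir}/", f"{args.target}:/srv/app/"],
            check=True,
        )

    if __name__ == "__main__":
        main()

In the workflow the values come from secrets and runner paths; on a laptop they come from wherever you keep them.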
It's certainly a risk to rely on any proprietary SaaS or cloud, but I don't really get the reasoning behind doing CI/CD as if it can never be down. There are a lot of risks in keeping credentials on hot-spare alternative systems you rarely need, and the number of times I've really had to roll back a deployment quickly is something I can count on one hand. Isn't it better to focus on quality so you don't need to deploy on a moment's notice?
You can keep the credentials in a password manager. Presumably the sysadmin already uses the same corporate PM for his regular credentials, so this doesn't increase risk much.
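For the manual-deploy case that can be as simple as shelling out to the manager's CLI at deploy time, so nothing secret sits on the spare machine's disk. A sketch using pass as an example (the entry name is made up; any password manager with a CLI works the same way):

    # hypothetical helper: fetch a deploy token from `pass` at deploy time
    import subprocess

    def deploy_token() -> str:
        out = subprocess.run(
            ["pass", "show", "work/deploy/prod-token"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()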
We're a small org, but I'd be pretty annoyed if we planned a deployment for a given day and ended up having to wait for GitHub to come back up. Deployments can be disruptive and we have to coordinate with other teams.
Keep in mind how likely it is that you'll be able to maintain better uptime when self-hosting, and what your expected MTTR is. GitHub's numbers may not be that bad.
In all my time, all internal downtime combined has been less than all external downtime combined, and the worst internal outage was far shorter than the worst external one.
This shit tends to just work, unless you fiddle with it. And if you do fiddle with a self-hosted version, you can do it after-hours, not in the middle of the day.
You got it exactly right. Every external service I use has times when it's down. Every internal service has zero downtime outside of maintenance.
It's why I continue to favor running everything in-house (preferably on-prem too).
I've read this trope so many times, but there are actually some things to consider on the other side as well.
In reality, even Gmail has had more outages than my self-hosted email, for example.
The other thing is that with self-hosted services and a somewhat sane setup, most outages come from updates. I wouldn't be surprised if that's the case for GitHub Actions too, but updates, especially for internal services that aren't publicly accessible, are usually plannable.
Self-hosting often allows for very simple setups, and simple things don't break as easily as something that is huge and constantly being changed, due to growth, trying to have more features than the competition, etc.
When self-hosting, you could theoretically have just an SSH server with the repos on it and no other services running. That setup will likely have higher uptime than GitHub. You can add git's web interface or cgit, which still likely leaves you with higher uptime. And depending on your needs you can add a simple CI. Compared to SSH, cgit and a web server, the CI is probably the least stable of these, but something like Buildbot is also pretty stable. And if you don't tinker with it all the time, things will stay that way.
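To give an idea of how small the CI part can stay, here is a rough Buildbot master.cfg sketch; the repo URL, worker credentials and the test command are placeholders, and a real setup still needs a worker process pointed at the master:

    # master.cfg - minimal Buildbot sketch, placeholders throughout
    from buildbot.plugins import changes, schedulers, steps, util, worker

    c = BuildmasterConfig = {}
    c['workers'] = [worker.Worker("local-worker", "worker-password")]
    c['protocols'] = {'pb': {'port': 9989}}

    # Poll the same SSH-hosted repo the developers push to.
    repo = 'ssh://git@ci.internal/srv/git/myproject.git'
    c['change_source'] = [changes.GitPoller(repourl=repo,
                                            branches=['main'], pollInterval=120)]

    c['schedulers'] = [schedulers.SingleBranchScheduler(
        name="main", change_filter=util.ChangeFilter(branch='main'),
        builderNames=["tests"])]

    # One builder: check out the repo and run the test suite.
    factory = util.BuildFactory([
        steps.Git(repourl=repo, mode='incremental'),
        steps.ShellCommand(command=["make", "test"]),
    ])
    c['builders'] = [util.BuilderConfig(
        name="tests", workernames=["local-worker"], factory=factory)]

    c['www'] = {'port': 8010}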
Of course, if you want or need all the features of GitHub and its competitors you can get that too, but depending on what you do you might not need them, simply because self-hosted solutions tend to be a lot less complex. At the very least they drop all the parts that exist only to make the product a commercial SaaS.
In other words: if all of GitHub were doing the hosting of that self-hosted solution for just one customer, you, then I'd say GitHub or others would likely do a better job. But for a person or a non-huge company self-hosting for themselves, there are just so many counter-examples. Maybe GitHub maintains better uptime, maybe not.
Also, plain MTTR is not really what you are looking for, because when you can't (easily) get a critical update out, or something ready for a presentation, because GitHub just shipped a change that really hurts you, you can do literally nothing other than hope, or maybe build a workaround.
If you have a critical update to push out, you likely won't be updating your own CI infrastructure at the same time.
Of course you really should have a plan B for pushing out updates, regardless of the setup. Sadly, that's not the case in many companies. Outages at GitHub or other services are treated as an act of a higher power, while a self-hosted solution being down is usually viewed as a failure.
And since we are on biases already, I want to emphasize the tinkering part. A lot of us are developers and infrastructure engineers (SRE, DevOps, sysadmins, etc.). We are paid to build and tinker with stuff, but the reality is that we use tons of software that effectively never causes issues, be it OpenSSH, nginx, Apache or others. When was the last time even a major upgrade of any of these caused problems? Finished releases of software that is built to do one thing, and doesn't try to do everything, usually work extremely well, often well enough that even a complete novice running it on the side will achieve higher uptime.
I think the same is true for simple CI-pipelines.
This is not meant to convince anyone not to use managed solutions or GitHub, but to help avoid wrong assumptions, because the comparisons we make in our heads aren't the comparisons we actually want to make. Self-hosting CI pipelines doesn't mean running GitHub.
We don't think we could run a better restaurant just because we cook our own food, yet cooking one's own food might be a more reliable way of not going hungry than assuming the restaurant is always open.
But whatever you do, be sure to keep a backup solution in your freezer. ;)
It is for most startups, but not for a megacorp, which has the money anyway. I've never gotten close to hitting the CI minutes cap in companies with fewer than 100 engineers.
The cap is 2,000 minutes. Will 50 engineers, for example, really only need 40 minutes of build time each in an entire month? That's maybe 5-10 builds, depending on your build duration, so just one or two a week. You might use all of that up in a single pull request.
In a really productive month, my two-person startup had to buy more minutes.
I could have sworn it was more, but fair enough! I do heavily optimize CI to reduce minutes though, with easy access to local tests (and git hooks to enforce these) and more thorough tests run nightly on merged branches.
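The hook part can be as small as a pre-push hook that runs the fast suite; a sketch (the pytest invocation is a placeholder for whatever the fast local tests are), saved as .git/hooks/pre-push and made executable:

    #!/usr/bin/env python3
    # .git/hooks/pre-push - refuse to push if the fast local tests fail
    # (the test command is a placeholder; use whatever your project runs locally)
    import subprocess, sys

    result = subprocess.run(["pytest", "-q", "-x", "tests/fast"])
    if result.returncode != 0:
        print("pre-push: fast tests failed, push aborted", file=sys.stderr)
        sys.exit(1)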
GitLab, Redox and ARM are all on custom GitLab deployments. They were probably intended as examples of companies that self-host, just with confusing wording.
I found this so hard to get working. Actually, I didn't even get it working; I think my build had some dependencies that weren't in act's images, or something.
Always have an established and regularly tested workflow for manual deployments that doesn't rely on CI jobs. You'll need it sooner or later.