I have been running a DevOps agency for the last 8 years, and while each cloud basically offers the same things at this point, the two things that always trip me up are networking and IAM.
Some things I noticed as I have done work on AWS, Azure and Google in terms of IAM:
- Azure seems to have so many different types of IAM permissions that it is hard to get your head around each one, as they seem to be imported from Azure, Active Directory, etc.
- Google differentiates between service accounts and user accounts which takes a bit of getting used to as each is different and the specific service policies that need to be granted are much harder to figure out than AWS.
- AWS now has three different IAM configurations: IAM, AWS IAM Identity Center, and roles. The complication is that AWS was not built with the Google nomenclature of projects in mind, so it is a weird add-on that causes all sorts of weird issues.
In terms of networking:
- AWS is for me the simplest to grok, but I have also been doing it for the longest, so there may be a bias here. Everything is tied to a VPC. It also seems that AWS provides the lowest-level primitives for networking, versus the other providers, which tend to abstract away quite a bit.
- Google's VPC (i.e., network) is global across all regions, which is nice for data locality as you can use the same VPC and put subnets across regions.
- Azure is similar to AWS but does seem to have a lot of hidden features that you need to read the docs to enable, especially around AKS+video streams.
IAM is the biggest miss; all of the cloud providers suck at it. I think Google's is the best, but it really isn't a great experience. This seems like something so critical that it should be rock solid and extremely clear, but too often I see things get into weird situations where it's hard to predict the exact access rules.
I’ve done lots with AWS and really only ever used GCP to configure Google SSO. I was really surprised by how much button clicking is required in GCP vs. AWS. In AWS, you create the root account, provision a service account, and then all AWS resources are managed through Terraform. In GCP, you have to verify a domain via CNAME records, etc., in order to create a root account, and then manipulate the organization policy to provision the service account. While you can create the IAP brand within Terraform (as long as you use the root account and not the service account), you can only externalize the brand by clicking buttons in GCP. Laughably, there is an open issue/ticket from more than a decade ago requesting a programmatic way to externalize a brand.
Really good answer in terms of how they "feel" to use.
Just one note, since there's a design decision Google and AWS made differently that feels nice but makes availability more precarious:
> "Google's VPC (i.e network) is global across all regions which is nice for data locality as you can use the same VPC and put subnets across regions."
It's also not uncommon to see your entire global footprint go down when there's a network plane issue.
AWS — for the longest time — was fanatical about keeping services uncoupled across regions, leading to far fewer "global" outages.
Sadly, many customers complained, wanting services to be cross region, instead of having to replicate environments across regions. Fifteen years in, AWS is accommodating, allowing you to build services that span a couple regions and go down if either region is down.
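To put rough numbers on why "goes down if either region is down" hurts, here is a minimal sketch of the arithmetic. It assumes regional failures are independent and uses a made-up per-region availability of 0.999; both are illustrative assumptions, not figures from any provider, and the function names are hypothetical.

```python
def coupled_availability(per_region: float, regions: int) -> float:
    """Service needs EVERY region up, so the availabilities multiply."""
    return per_region ** regions


def redundant_availability(per_region: float, regions: int) -> float:
    """Service needs ANY ONE region up, so the unavailabilities multiply."""
    return 1 - (1 - per_region) ** regions


# Assumed per-region availability, for illustration only.
a = 0.999

# Assumes independent regional failures, which real outages do not always honor.
print(f"single region:              {a:.6f}")
print(f"coupled across 2 regions:   {coupled_availability(a, 2):.6f}")
print(f"redundant across 2 regions: {redundant_availability(a, 2):.6f}")
```

Under those assumptions the coupled design is less available than a single region, while the redundant design is better than either region alone. Correlated failures (a shared control plane, for example) shrink the benefit of redundancy, which is the concern with global network planes.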
If uptime is critical to you, in AWS leverage at least 3 AZs in each of at least 2 regions, and be sure you're using region-only services or a cross-region service that's really single region with a consistency scheme. You'll stay online through most regional issues.
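The "3 AZs in each of at least 2 regions" rule of thumb is easy to check mechanically. A minimal sketch, assuming you can export your deployment as (region, AZ) pairs; the function name, constants, and sample placements are hypothetical and not tied to any AWS API:

```python
from collections import defaultdict

MIN_REGIONS = 2          # rule of thumb from the comment above
MIN_AZS_PER_REGION = 3   # rule of thumb from the comment above


def meets_layout_rule(placements):
    """Return True if at least MIN_REGIONS regions each span MIN_AZS_PER_REGION AZs.

    placements: iterable of (region, az) pairs where workloads actually run.
    """
    azs_by_region = defaultdict(set)
    for region, az in placements:
        azs_by_region[region].add(az)
    qualifying = [
        region
        for region, azs in azs_by_region.items()
        if len(azs) >= MIN_AZS_PER_REGION
    ]
    return len(qualifying) >= MIN_REGIONS


# Illustrative placements only.
good = [
    ("us-east-1", "us-east-1a"), ("us-east-1", "us-east-1b"), ("us-east-1", "us-east-1c"),
    ("us-west-2", "us-west-2a"), ("us-west-2", "us-west-2b"), ("us-west-2", "us-west-2c"),
]
weak = good[:4]  # second region only covers one AZ

print(meets_layout_rule(good))
print(meets_layout_rule(weak))
```

This only checks placement. It says nothing about whether the services you depend on are truly regional, which is the other half of the advice.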
Also note that the three define "region" quite differently. The AWS definition generally includes a variety of availability and resilience constraints, such as at least 2 AZs with enough physical separation to survive local physical outages. Looking closely and comparing across them, you'll find AWS's resilience story is more stringent; the other two are somewhat more oriented to putting a pin on the map and calling things regions that may be more like single POPs (points of presence).
All that said, it's becoming "less true" in both directions, as large customers complain when any two CSPs don't work similarly. The "voice of the customer" is asking for feature parity rather than exploiting the differentiation.
From our point of view, they're still differentiated enough that a firm should consider using each for what it's best at, say AWS for lego blocks, Azure for business integration, and Google for scale-out analytics feeding ML/AI. Again, each is trying to shore up what the others already have in their DNA, but it's harder to copy something when it hasn't been your in-house bread and butter or you didn't invent it.