I think that many people look at Microservices from a purely technical standpoin...

ex_amazon_sde · on July 7, 2020

Ex Amazon SDE here. Amazon uses a service based architecture where one team (e.g. 4 to 8 engineers) maintains one service.

The main benefit is organizational and lies in having a small and very measurable interface between services.

Crucially, the team the develops the service runs it and provides 24/7 on-call for it. This also include taking decisions on how much to spend on hardware VS optimization as the team is responsible for the overall performance.

More importantly, the team is now responsible for service outages.

This is very different from having a team write many microservices. It's especially bad when a team writes a bunch of microservices (the so-called distributed monolith) and somebody else has to run them.

Sadly the article does not focus on this aspect.

As a side note, once developers realize they get paged out of bed by their own code... it's amusing to see how the cool framework of the month is not cool anymore.

WJW · on July 7, 2020

At my previous company this was introduced as well, but the incentives were very warped. There was no advantage to actually being all that stable, since then you missed out on "firefighting glory". I never saw praise or promotions for maintaining high uptime. I'd be very interested to learn if Amazon managed to overcome this and if so, how.

ex_amazon_sde · on July 7, 2020

I cannot talk for the whole company but I don't think I witnessed a culture of "firefighting glory", rather the opposite.

There is a formal process to investigate and correct errors after a non-trivial incident that requires collecting evidence and then discuss what happen, when, why and who did what with the whole team.

Then people ask the question "why was this not prevented or forseen?" and keep going backward until you exit the technical realm and look into people's choices.

The root causes could be, for example, that it was a management decision to prioritize something else over availability concerns... and the engineers are off the hook.

Sometimes one engineer was overworked and tired and did an error in good faith that is too difficult to prevent and avoid, and that's also acceptable.

Sometimes that the whole team ignored an availability issue in the product architecture, and that's bad.

And so on... Needless to say the outcome can impact people's performance review.

fulafel · on July 7, 2020

Also AWS services are much more standalone products than an components of an inhouse/LOB system consisting of custom built microservices.

ex_amazon_sde · on July 7, 2020

No, the large majority of services in the SOA are unrelated to customer-facing AWS services and they are inhouse/LOB/custom.

pragmaticdev · on July 7, 2020

It's difficult to put all of the nuances in a blog post. As a developer, I'm a firm believer in "run what you brung".

ex_amazon_sde · on July 7, 2020

The blog post is very long and predicted at 18 minutes of reading, yet the word "deploy" appears only once. Plus there is no mention on who has to run it.

gbrits · on July 7, 2020

Comment as bookmark

pjmlp · on July 7, 2020

My first microservices were written using Sun RPC, the company whose motto was "The network is the computer".

Since then we have seen plenty of microservices solutions come and go, what stays is the complexity of dealing with distributed programming.

Anyone that isn't able to write modular code using the language features is going to write network spaghetti code, with the added benefit to debug network packets and connection issues on top of everything else.

extra_rice · on July 7, 2020

> Anyone that isn't able to write modular code using the language features is going to write network spaghetti code, with the added benefit to debug network packets and connection issues on top of everything else.

This is the most important bit. To me microservices is just another way to arrange your system into module, not much different in concept to modules and packages. Regardless of whether you’re dealing with monolithic or distributed system, if your abstractions are poor, your work and your system will suffer.

tluyben2 · on July 7, 2020

> Regardless of whether you’re dealing with monolithic or distributed system

Agreeing somewhat, but not totally ; monoliths are much easier to test and stabilise. All bigger microservices projects I know (personally, not what I read from Uber etc) have tremendous overheads compared to their monolithic counterparts. Some of them are really nicely done, but there still is a lot of overhead in support (usually the services are written by different people (and possible different languages/environments, for instance .net core, .net framework (for Windows specific services) + Typescript/node), so other people need quite a large ramp up to get into them), networking, monitoring, etc. As far as I have seen, it needs a larger team.

And it is very different from having you modular built monolith in your IDE for a spin, than doing a little (local/dev) testrun of 25 microservices + the one you are working on. I can feel the benefits but I have yet to encounter them in real life; so far I have seen (smaller; ~100m/year rev) companies going back to monolith and most of the stories I read are from very large companies with massive development+deployment teams where you actually can have 3+ people per microservice.

If you are working with 3-4 people managing 20+ (changing, forming a complex system), I don't really seen it happen (but would love to see practical examples of that) while monoliths of the same complexity/functionality (literally; I worked on a few 'just because scaling!' rebuilds from monoliths->MS(->back)) have no issues at all with that.

extra_rice · on July 7, 2020

I’m not really arguing which one is easier to run, maintain, or anything; I was just making a point about abstractions and system modularisation.

Hypothetically, I should be able to take a well-designed monolith and “deconstruct” it into a distributed system by deploying modules individually and replacing function calls with network calls. Going the opposite direction, I should be able to do the reverse with a well-structured distributed system.

This pattern of modularisation exists across the software stack; it goes as far as the digital logic level with lumped-element abstraction. In general, we trade-off some benefits with a more complex messaging mechanism, akin to threads sharing the same memory space VS processes communicating through sockets or files. In general these are all technical details that help us accomplish what we need our system to do, but pinning the right abstractions helps us actually reason about our system.

tluyben2 · on July 7, 2020

> Anyone that isn't able to write modular code using the language features is going to write network spaghetti code,

Languages (libraries) should support this more because people are doing it. I don't want to be bothered by this complexity etc for most of the project; I will architect it properly but the language features should help me in this case.

Some new languages/envs are exploring this, like [0] and Erlang ofcourse is an example of naturally working (or having to work) like that.

In many other environments we work it's mostly annoying; very possible but a chore instead of the language/environment helping me. We roll out quite large deployments anyway; given the choice I rather would see a nicely architected monolith.

[0] https://www.unisonweb.org/

pjmlp · on July 7, 2020

Apparently Ada and Modula-2 aren't fashionable, given how long ago languages and libraries support this.

I could pick more recent examples, just wanted to make a point how old the concept goes.

ak39 · on July 7, 2020

Good explanation. But I have two questions:

1) So which "team" then takes ownership of the entire katamari ball of sprawling microservices and understands the interconnectnedness and nuances of overlaps and competition between each microservice?

2) If the concept of technical debt at yesterday's monolithic architectures is biting us now, what does the future technical debt of microservices hold?

asim · on July 7, 2020

1. No one has ownership over all the services, that's shared ownership. A team might own the platform and be responsible for up time, but individual teams own their services. Somewhere higher in the engineering org there's explicit ownership of end to end product and its usually tied with that product team or again the platform team.

2. As you scale, sometimes its easier to create a new service than it is to go modify that old service. When a company scales fast or employees churn its either rewrite that service or leave it where it is and build something new. This can lead to something that looks a lot more organic than computational. I have only seen the likes of Google handle deprecation well, to the dislike of most people external but survival of the fittest is harsh at best.

tluyben2 · on July 7, 2020

in both 1 & 2, I only see money pits... This sounds like so much overhead... Par for the course for enterprises, but I see too many startups now deploying 50+ microservices and suffering with the overhead (hosting, maintenance, employee-count/churn/training, monitoring etc).

krab · on July 7, 2020

> in both 1 & 2, I only see money pits... This sounds like so much overhead...

That's the point. The overhead is high but the other options to employ extra developers bring even more overhead. And the company wants extra developers because the faster (or more parallel) development still has a good ROI.

If you want low overhead, a small team is the way to go.

hinkley · on July 7, 2020

But what happens is you end up with a ton of them, fulfilling the same prophecy you were trying to avoid, or you keep trying to cram more “volume” behind the same surface area, until you eventually tear a hole in the fabric of space time and/or your business.

The way to succeed is to do fewer things but make them count. Except that looks awful on your annual review, so the real way to succeed is to Defect. Be seen doing as much as you can and get out before they figure out you consumed all of the oxygen in the room.

sriku · on July 7, 2020

I think John Ousterhout's recommendation of 'thin' interfaces and 'thick' implementations is a good heuristic that's worked well in my experience. This acts as a force that resists service proliferation that hides behind "modularity".

https://web.stanford.edu/~ouster/cgi-bin/cs190-spring16/lect...

hinkley · on July 7, 2020

What I'm saying is that this works great for a while. But if you want to keep it working, four years in, you still have to say No a lot.

asim · on July 7, 2020

Good points and it feels like this is what most people miss. They take microservices from a purely technical perspective and then rant about its complexity. You don't need a large scale distributed systems platform when you're a small team or have no scaling requirements. Microservices is largely driven through that organisational need of teams having to ship products/features independently and its just easier when the communication is via a service API vs humans or explicitly when walking over each others code.

ak39 · on July 7, 2020

If communication friction and overhead is the primary thrust for microservices (where you obviate said overhead by keeping to small teams), how do you address and manage the need for communication between teams?

Obviating communication costs within teams is one thing, but what of the overhead and costs of communication for the whole system? Eg. how do you avoid the cost of deploying a new team that ends up deploying a duplicate microservice?

jmchuster · on July 7, 2020

I would say that, if you're at a point where teams are deploying duplicate services with no idea of each other, you're already so large with so many teams, that you might desire that as the correct outcome. You now have hundreds of teams running in parallel, and there's no point in coordinating with everyone, just coordinate with the ten that you are adjacent to. So, yes, the company is not maximally efficient, but it can take advantage of parallelism, and that's ultimately much more important for speed of progress. I could see an argument that it'd take months to get all the teams to all meet and agree and coordinate on what this shared service should be, when you could have built two variants simultaneously in a couple of weeks.

asim · on July 7, 2020

These things are rarely binary. In that there's no one size fits all solution. It comes down to the culture of the company and how they communicate and then how the technology evolved based on their needs. So if you have an engineering org which practices high bandwidth communication there might be a drive to keep that up by coordinating a meeting once a week by nominating one person from the team to go provide updates and learn about what else is happening.

You may also explicitly establish an architecture team that has oversight and tries to ensure everyone's aligned on direction. But mostly if you at least have some shared place where you explicitly communicate the creation of new services or design development then it allows you to avoid some of the issues you're talking about. The other thing is having a way to discover APIs and services so you can go check if something exists before building it. In many cases teams do still build a variant themselves because they need something slightly different and thats ok if they're willing to support it.

jon-wood · on July 7, 2020

Communication between teams is handled by team leads talking to each other - where I’m at we have standups each day within teams, and then every other day those are immediately followed by team leads having another standup where the focus is on what teams are doing, rather than what individuals are doing.

It’s not perfect, but it’s a lot more efficient than the previous state of every individual developer needing awareness of what every other individual developer is working on at any given moment.

jensneuse · on July 7, 2020

You're making a point that Microservices help teams to solve the problem of increasing communication needs for larger teams. While your statement makes a lot of sense I'd like to ask why this problem cannot be solved with packages or classes within a monolith? A team can own a Microservice and expose an api to others the same way they could expose an api on a class or package, no? Microservices are a way of deploying small parts of software. It's not about communication across teams. It can help with communication but this is not only true for Microservices.

jmchuster · on July 7, 2020

I believe it could be done via classes or packages, but only if the providers have the discipline to design it like a rest service, and the users then have the discipline to only use it like a rest service. Each package needs a small, consistent interface, with no way to interact except through that interface, with that same style of consistency across all libraries in the same way. The code must be isolated at runtime, so it can't steal resources from other pieces of code in the monolith or cause anything to crash. It can only respond to requests along the interface, either successfully or unsucessfully. So, theoretically possible, but in practice difficult once you add humans to the mix.

livesinhel · on July 8, 2020

You can use ArchUnit[1] or similar frameworks to impose the constraints.

[1]: https://www.archunit.org/

soonnow · on July 7, 2020

So you are in principle of course correct. But keep in mind that Microservices can be deployed independently of each other. With Monoliths there is a single deployment that causes a lot of synchronization issues. Every team needs to deploy in lockstep.

This is all theoretical to some degree, because I have seen teams deploy Microservices in banks during the deployment windows and I have had people explclaim they have ONE Microservice.

So you can definitely do it, but Microservices have other advantages, as in how easy can you then scale the components, how resource hungry are individual services and so on.

BlackFly · on July 7, 2020

> Every team needs to deploy in lockstep.

Team A can finish their part of joint effort before Team B and get deployed on Monday, then Team B finishes their effort and gets deployed on Tuesday: this looks the same in both microservices or monoliths. You can have blue green deployments with monoliths with zero downtime.

Why does it look the same? Because you have to be careful with how you modify the interfaces in any case. If team A is requiring a new field from Team B to be not null immediately, then the Monday deployment breaks in either case. If team A uses a feature toggle or just does the old thing when the new field is null, that has nothing to do with microservices vs monoliths. If you are using a monorepo in both cases the synchronization story is the same: master needs to build and run without error.

The real distinction is what the Monday and Tuesday deployments look like in either case. With monoliths, the Monday deployment and the Tuesday deployment look identical: the whole monolith goes up and down. With Microservices, team A can be responsible for their deployment on Monday and it can be completely distinct from what team B does on Tuesday. If team A is team B, then the gain was a minor allowance in the deployments being different which might be a net loss.

Actually, when team A and team B are in lock step, then the monolith allows changes to be deployed that would otherwise be considered breaking if deployed in a microservice architecture. However, a lot of monoliths will still be clustered and might talk to each other across the network, killing this advantage.

politelemon · on July 7, 2020

The reason that I've seen (therefore anecdotal) is even if you attempt to cater for handoffs with packages/libraries/classes/APIs, eventually a point comes where some of these don't exist in a vacuum. Or rather, they exist in a vacuum until they don't.

For example a change in one library might cause an unexpected degradation in an unrelated application. The communication at that point is strained as there are now multiple teams firefighting and playing the blame game. Or a slight change in one library caused deployments to fail for everyone. In a perfect world with perfect engineering practices this wouldn't happen but as many have pointed out, development is a reflection of the business, and the business is not perfect.

Having mini-services, so even just taking a monolith and separating it out into manageable chunks, reduces the communication problem immensely. Doesn't do away with it completely, but goes a long long way!

pjmlp · on July 7, 2020

As if the blame game doesn't take place with microservices as well.

It is specially "funny" if the microservices are developed by several contractors.

tluyben2 · on July 7, 2020

And they often are! Because that's mainly one of the advantages... You can simply outsource a spec of the endpoints and that's it => flop over a ourservice.swagger.yml and get a working system back. Possibly in a language/framework you team has no experience with, but it was cheap.

Recent fun; a bunch of microservices written by the same contractor were killing the shared db (the 'state' of all microservices); so while most were running fine, some were hitting the db hard and costing the company quite a lot of money until it was fixed. No-one felt it was their responsibility to test & fix because it was an external company and they 'delivered' according to spec. There are many ways to do this better, but it happens quite a lot (also with monoliths ofcourse but there I find it faster/easier to diagnose + fix (depending on the size; I guess there is a point somewhere in LoC where microservices start to shine).

oftenwrong · on July 7, 2020

Yes. For example, a service starts sending bad data due to a bug. Now other services have acted on it, stored it, etc. It becomes a problem for many people.

jayd16 · on July 7, 2020

This assumes the teams are using compatible frameworks, build systems, or even languages. A web API is often the easiest common denominator.

Then you get into deployment orchestration. Not every org can handle pure CI/CD. Even if they could they'd have to all agree to it.

pragmaticdev · on July 7, 2020

It's much more difficult to evolve with technology or run experiments and you'd better make sure you pick the right frameworks and tech stacks upfront.

oftenwrong · on July 7, 2020

It is possible to keep teams "on islands" with a modular monolith. Each team builds a module. Logging, monitoring, and other shared infrastructure is there for all teams.

soonnow · on July 7, 2020

Until you come to the deployment phase where all teams need to communicate and deploy in sync. This can be a significant factor for slow down.

Also Microservices don't sync on the datastore, these are (in theory) not shared.

So yes Microservices can be emulated in Monolithic architectures, as can Monolithic Microservices be built. Usually you end up with the worst of both worlds. The complexity of Microservices with the speed of Monoliths.

soonnow · on July 8, 2020

Here is an article where I look at this issue in more detail https://thenewstack.io/microservices-vs-monoliths-an-operati...

mytailorisrich · on July 7, 2020

Organisation and communication between teams is improved by having good APIs.

You can have that with or without microservices. You can screw this up with or without microservices.

The main factors for going for microservices are technical but, as always, must be understood.

sandGorgon · on July 7, 2020

you can do islands of communications in a monolith as well, with the added advantage of significantly lower deployment complexity as well lower incidental complexity

collyw · on July 7, 2020

I view a working microservices application as a monolith with unreliable network connections between components. So overall adding a lot more complexity for the promise of lowers costs and scale-ability.

One advantage that I do see is the ability to deploy components individually, but I see more downsides than benefits overall. I hope that consensus is becoming "start with a (modular, well designed) monolith and scale out the parts as needed".

tluyben2 · on July 7, 2020

And (much) lower deployment costs, lower staffing costs, and lower development costs... In my experience anyway.

pragmaticdev · on July 7, 2020

This is only true for trivially small products. The pipeline complexity for multi-team monoliths can consume an entire dev team to maintain.