I co-founded startup 6 months ago, since day 1 we use micro-services. For us the...

cookiecaper · on March 26, 2017

>they could build a small parts of the system communicating via http/RabbitMQ.

<Cue horrified twitching>

Now, it's totally possible that you're using RMQ completely correctly, but as someone who has seen multiple teams fundamentally misunderstand both the purpose and function of an AMQP server and lose critical data because of it, any mention of RMQ as a primary part of the application's communication mechanism ruffles my feathers.

Sometimes I wonder if the RMQ team is aware of how many people end up using it grossly improperly. It seems they'd put some bigger warning labels on it if they were.

mafro · on March 27, 2017

I've not seen this kind of misuse. Could you be specific about the mistakes you've seen?

cookiecaper · on March 27, 2017

They inject it into the normal data processing/RPC workflow. Instead of writing to a database or some other permanent storage, they just have the application write directly to a RMQ queue, and wait for a worker to pick it up and store it somewhere.

AMQP is asynchronous and the queues can get choked up, so sometimes messages will be delayed for hours, and if the queues get too large, RMQ will begin to evict messages and/or crash due to insufficient resources.

RMQ will throw all the data in your queue away on restart unless you explicitly ask it not to by setting durable mode (don't forget to do this or all your data is down the hole when you're misusing it this way). Even in durable mode, the queue does not provide the type of safety guarantees would be expected of a real storage solution, which RMQ does not pretend to be, but somehow people still believe it is.

Because RabbitMQ does not provide strong data safety, crash, or resilience guarantees, it should never be the system-of-record for important data.

Furthermore, because the nature of AMQP is to dispose messages as soon as they're picked up (and yes, I know you can use acknowledgements to try to hack around this, but it's not something to trust for the only place where your data is recorded), it's very easy to accidentally black hole messages while everything appears to be working. This can lead to an insidious type of data loss where some records just appear to be mysteriously missing and are very hard to trace.

While RMQ states these things in its documentation and obviously is not intended to be the system of record, people still abuse it in this way. Redis was abused similarly and responded by growing a full featureset to turn it into a reliable in-memory storage solution. It doesn't seem like that's a good track for RMQ to take, so they should stick some giant red warning labels all over their page, clearly embarrassing those who make these dangerous choices.

mafro · on March 28, 2017

Thanks for elaborating. Certainly assuming that your data is safe in RMQ is a mistake!

mylons · on March 26, 2017

I don't think using micro-services from day 1 is a wise use of resources. What if one of your engineers, who probably owns an entire service, quits? And it's in Haskell because she felt like it?

vojant · on March 26, 2017

Well we have a list of approved stacks and yes this may became a problem one day but we try to prepare for that.

One of the things we do on the top of using micro-services is using docker for everything (dev, build and run on production), it's easier to pick up a project if the whole environment is already setup inside docker image. Like I wrote in my first post, It's very hard to find e.g. 10 good engineers knowing python in a month time (and we are startup, can not pay twice more than everyone else). So far this whole strategy work out, but I am not deluded and I do not pretends that starting with micro-services is a good decision for everyone in every use-case.

staticassertion · on March 26, 2017

This is a totally nutty scenario. Microservices are not "do whatever you want!" - they give you the freedom to choose your stack. It is obviously still a business decision to choose Haskell and you've got other problems if developers are building things in random languages that they feel like using without a larger discussion.

The issue here isn't Haskell it's ownership and process.

_d8fd · on March 26, 2017

Nutty or not, it happens. I was a Ruby dev and was asked to work on a Scala app. I could contribute to Ruby stuff quite nicely, and could hardly figure out how to compile the Scala app. Trade-offs...

staticassertion · on March 26, 2017

I'm sure it happens - my point is that it isn't relevant to microservices. Microservices allow multiple technology stacks, if dont properly, but it doesn't force it on anyone. If your developers are pushing code using a different programming language with no oversight there's an organizational issue.

That said, it's awesome that microservices let you use different tech stacks to solve different problems.

_d8fd · on March 26, 2017

Were you experienced with micro services before starting your latest thing? When starting a project, you're going to have constraints: money, time, available talent, management's blessing, etc. I'd guess those constraints are probably a driving factor in dictating how all of the project's needs are wired together.

If the technical founder was a Python/Flask/micro-service/Angular/MySQL dev, that's probably what they'd be using to knock out code to build an MVP. If the founder was a Microsoft-MVC/C#/Postgres/Ember/monolith dev, I'd be super surprised if the MVP was a Python/Flask/micro-service/Angular/MySQL app :)

mylons · on March 26, 2017

Given how they described their org, it seems like a valid scenario. Extreme examples are tools for illustrating a point.

Micro services from day one is a premature optimization. Just saying that doesn't necessarily paint a picture of why it might be.

boomlinde · on March 27, 2017

IMO microservices from day one aren't necessarily a premature optimization, or an optimization at all. It is sometimes just the natural way to model a solution.

For example at my last job, we developed several services that constantly generated reports for our clients to run. Instead of embedding the functionality to move these files to other machines in each service, we developed a separate service that monitored a directory to do only that. This meant that the reporting services were more open ended: clients could decide how to handle the files but were still left with a very convenient option. It also meant that we could hand off new versions of the transfer service on its own for customers to install without interrupting reporting services, and only having to deal with the documentation of the transfer service itself.

In terms of scalability or performance, it added absolutely nothing, but it simplified deployment, documentation and development from day 1.

vojant · on March 26, 2017

Oh I didn't mention all the reason's why we use micro-services from the day one. We have very sophisticated use-case. We call-in to meetings using hangout, go2meeting etc to record them (we do speech recognition and many other thing with these recordings). In our case to concurrently call-in to many meetings and do processing in real-time it wasn't really premature optimisation.

cookiecaper · on March 27, 2017

I agree, but I don't think microservices are properly classed as an optimization of any sort, premature or not. Microservices arise because a company can't communicate/manage itself internally.

This does not mean that you must have one giant 50MB executable to run your whole company, but it probably does mean most companies shouldn't have 60 200-line microservices.

boomlinde · on March 27, 2017

I think that microservices may be an actual optimization when the application flow has several clearly separable tasks that have varying requirements and you need to divide the load over several machines. For example, one task may be mostly I/O heavy, another will use a lot of RAM and a third may mostly be CPU bound. When you distribute the load over multiple servers, microservices can make it easier to tailor each server to the needs of the services it runs. The I/O bound workload doesn't need 100GB RAM and the CPU bound workload may not need several gigabit interfaces.

That said, I haven't personally worked with a microservice-based architecture where this ever became a useful optimization. Often it is exactly as you say: a technological workaround for an organizational problem.

staticassertion · on March 26, 2017

Microservices do not feel like a premature optimization. It does not feel premature to make architectural decisions around scaling based on the entirely realistic proposition that you will have more customers in the future than you do today. Architecture is exactly the area you want to get right since it's a pain to optimize when you have a weak architecture.

100k · on March 26, 2017

I heard second hand that this really happened at Living Social. Engineers would write services in whatever they felt like (sorry, I mean, "the best tool for the job"), then get bored and leave.

Heck, this was even a fad for a while: polyglot programming.

staticassertion · on March 26, 2017

Polyglot programming isn't "a fad" - it's something that microservices enable. That does not mean that developers make technical decisions in isolation.

If I were to introduce Haskell to my company there would have to be at least one other person interested in it and at least a few people who would be interested in learning it. I would never commit code using a new technology without discussing that with my manager.

emperorcezar · on March 27, 2017

Polyglot isn't a fad. That's like saying, as a carpenter, using more than a hammer is a fad. As a professional, you are supposed to have more than a single tool. That doesn't mean you use all of them on every job, but you should have them.

vojant · on March 26, 2017

Exactly, every decision is approved by me at the moment and we keep list of approved stacks.

Also to be honest I am not that afraid if we keep our micro-services really MICRO the worst-case scenario would be rewrite single micro-service - still better than struggling to hire dev team by 3-4 months.

verletx64 · on March 26, 2017

Surely the CTO factors this into signing off on any decision on what the engineers use to build the service. You can have an ecosystem comprised of different languages absolutely, but that doesn't become "Billing Automation is built in Idris because Brian wanted to try it out".

hiram112 · on March 27, 2017

We used Microservices at my last job, and initially, everyone used the same Maven archetype to create a basic Java application of similar structure: property files, environment variables, filters, etc. It wasn't as sexy as, say, Scala or Node or Haskell. OTOH, there were absolutely no developers who didn't already know Java or could learn our framework in a few months - at least enough to follow and be productive.

This came in handy because everyone could easily jump from one service to another and figure things out pretty easily since, not only was it all in the same language, but also the configurations and bootstrap classes were the same. Once someone figured out something clever, it was easily added to the base classes for everyone else to inherit.

Eventually some of the newer hires got bored of Java and wanted to use things like Node, Python, etc.

At the time I left, it was a pure clusterfuck. It was impossible to write once and easily change the whole filter stack (e.g. an auth filter) since we were now up against at least 5 different languages / frameworks. Developers couldn't easily jump from project to project as needed, either.

Anyway, most of our Microservices were just glorified DB -> JSON Crud apps. Microservices themselves were probably not needed for our customer size - we would have been fine with a 2008-style multi-war project on JBoss.

And yes, even Java, is pretty damn good at spitting out CRUD / JSON data to front-ends. Absolutely no need for the complexity introduced by the half dozen other frameworks.

mpfundstein · on March 26, 2017

Cool story. Our startup (8 engineers) is built on microservices as well. Right now, (after 1 year live, 4 hours downtime until then) we have 43 microservices running. We have a rabbitmq broker for fire-and-forget communications. We use Consul for Service Discovery. Jenkins is used for automatic testing and deployment. I have to say that I love our setup. While there are some things that are a bit more complicated, the increased efficiency is worth it all. On some days we deploy 10 or more times to production.

I have to say. The startup is very well funded and we have a dedicated SysOps guy who helps us with DevOps. He does all the nitty gritty Ansible stuff.

I think the best thing about micro-services is that it enforces service boundaries around aggregates. You just can't leak responsibility if you have to traverse the network. This enforces loose coupling.

Its also much easier to do code review if changes are localized to a small codebase of maybe max 500 lines.

ebiester · on March 26, 2017

Since you seem to have it figured out... I'm a monolith guy* in a microservices world. Friday, I had a question asked of me that in our old system was a simple query and I could answer in 5 minutes. This question, however, was split across three separate microservices in the new system, and the information had never been captured in a convenient way in hadoop. (Running two separate queries in two separate systems and writing a program to collate the results takes significantly more developer time. While not insurmountable, it's no longer a trivial task.)

Did you run into these problems? What did you build to solve them?

mpfundstein · on March 26, 2017

We have a statistics service for our ML guys. If it needs aggregated data from data stored in different services, the services must implement an API and the statistics layer can then make use of it. Usually one developer will do all of this. We don't have strict ownership.

What we also do is, we duplicate a lot of data into a separate event log. This log is used as a readonly store for analytics. This has the nice additional property that we have an event-log. The micro-services don't write directly to the log, but they can publish events to the broker and a dedicated event-service is used to append the data to the correct log.

Yes. Sometimes all of this is significant overhead, because everything has to be tested as well. But mostly when you start a service, the first thing is to provide all the necessary CRUD routines. Which is 99% of the time a few lines of code as every service is mostly build around at most one aggregate.

mooreds · on March 26, 2017

Plus one for the real world experience!

Would love to hear from you in a year and see plusses or minuses you have found with a bit more experience.

vojant · on March 26, 2017

One day I plan to share every detail of our journey, but at the moment we are too busy building our product. So far I am very happy with the decision to go with micro-service oriented architecture, but of course we had some issues with that - but hey there is no perfect solution!