Microservices — architecture nihilism in minimalism's clothes
November 2, 2020 8:07 PM

Some recent backtracking from what we have been calling “Microservices” has sparked anew the debate around that software architecture pattern. It turns out that for increasingly more software people, having a backend with (sometimes several) hundreds of services wasn’t that great an idea after all.
posted by geoff. (42 comments total) 42 users marked this as a favorite
 
Loving this article so far

having a ready answer when the thinking gets tough is a soothing lie that just moves complexity about. There is no substitute to the effortful application of cognitive power to a problem.
posted by armoir from antproof case at 8:18 PM on November 2, 2020 [18 favorites]


having a ready answer when the thinking gets tough is a soothing lie that just moves complexity about. There is no substitute to the effortful application of cognitive power to a problem.

Yeah, I think that line did it for me too. He also touched on the plugin approach I personally like. I've always thought it was much more difficult to build software for other developers, and the best software architectures I've encountered make it really easy for other developers to extend and update quickly and with confidence that they aren't breaking a larger system. youtube-dl is a great example of a simple, well-thought-out Python project that was able to effortlessly host hundreds of plugins. It was really obvious where to find the plugins, and where to update or add a new one.

I'm also a big fan of event sourcing which I guess is sort of a micro-service architecture? It has built in granular logging. Usually it is tightly coupled with a message bus of some sort. I've seen systems I built 10 years ago still running fine with it.

I don't know, I sort of like any model that mimics the robustness and chaos of the Internet. If people are able to create and deploy software quickly even if it isn't the way you intended and they're able to do so with a high degree of confidence, that's a sign of a good software architecture.
posted by geoff. at 8:39 PM on November 2, 2020 [3 favorites]


As someone who has been writing network services since 2004, this whole microservice fad cannot die quickly enough. I am sick of getting into arguments over why splitting each ML model to its own microservice is a bad idea when it's the composite that provides function in the business domain. I am 100% behind modular, testable software, but if I am writing a DSP engine I don't need every sin and cos call behind an HTTP call. Besides, if you write your modules correctly, turning them into a service is trivial.
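
To illustrate that last point, here is a minimal sketch, assuming a hypothetical `mix` function standing in for a well-written module. The names `mix`, `handle_json` and `MixHandler` are invented for illustration and use only Python's standard library; the point is that the service layer is a thin adapter around code that also works in-process.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def mix(samples, gain):
    """Plain in-process function: scale every sample by a gain factor."""
    return [s * gain for s in samples]


def handle_json(body: bytes) -> bytes:
    """Thin adapter: decode a JSON request, call the module, encode the reply."""
    request = json.loads(body)
    result = mix(request["samples"], request["gain"])
    return json.dumps({"samples": result}).encode("utf-8")


class MixHandler(BaseHTTPRequestHandler):
    """Hypothetical HTTP wrapper, only needed if the module must be remote."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        reply = handle_json(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(reply)


if __name__ == "__main__":
    # In-process call: no network hop, ordinary stack traces.
    print(mix([1.0, 2.0], 0.5))
    # The same logic through the adapter a service would use.
    print(handle_json(b'{"samples": [1.0, 2.0], "gain": 0.5}'))
    # To actually serve it (assumed host and port):
    # HTTPServer(("127.0.0.1", 8000), MixHandler).serve_forever()
```

The module stays callable directly; the HTTP wrapper can be added or dropped without touching it.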

The article makes great points, and while others have quoted some good ones, the real meat to me is this:

weakening dependencies between different parts of our systems is a “shift-right” loan with high interest;

Honestly, your architecture should reflect your business domain, and almost serve as close to a DSL as can be.
posted by The Power Nap at 9:18 PM on November 2, 2020 [10 favorites]


Here at work we seem to have solved the problem with "macro" services: services that need dozens of gigabytes per instance to run, multiplied out at scale. You cannot physically cram them all into a single address space. And thus macroservices live on.
posted by pwnguin at 9:24 PM on November 2, 2020


Micro-service architecture was developed with one giant caveat that almost nobody seems to have noticed: Do not migrate your system out into micro-services unless and until a monolithic app cannot support your scale. As Power Nap has alluded to, if your service cannot perform a function in a single business domain without calling deeper into a nest of micro-services, you have gone too granular. You give up SO much when you move to micro-services. You give up usable stack traces (don't get me started on the work-arounds). You give up most of your ability to attach a debugger and trace through some code. You give up logging a simple series of operations unless you perform more wizardry. And at the end, most of the decoupling that you achieve is illusory. Testing services in isolation becomes impossible, and the yet more workarounds needed to achieve it never materialize.

Building services around Event Sourcing, on the other hand, has a lot of potential that flies under the radar. When you start weaning your architecture off of synchronous calls between services you can start to see some practical benefits from de-coupling. Your requesters and responders can move into different modules. Your upstream services can become truly blind to the operation of your downstream ones.

You can re-play a series of events:
This can be used for testing. This can be used to add new functionality and apply it to your existing data. This can be used for migration. This can be used to scale and re-partition your instances. You can run a service (or a small set) in isolation and feed it a stream of events and validate correctness without needing to be running every single service that it might interact with.
Best of all, though, it can reduce complexity and allow developers to focus on one layer of your app at a time.
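
A minimal sketch of replay, assuming events are kept in an ordered, append-only log. The event names, the `Event` class and both views are hypothetical; the second view shows new functionality being applied to data that already exists.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str          # hypothetical event name
    user: str
    value: str = ""


def apply_email(state: dict, event: Event) -> dict:
    """Fold one event into the 'current email per user' view."""
    new_state = dict(state)  # never mutate the previous state
    if event.kind in ("user_signed_up", "user_changed_email_address"):
        new_state[event.user] = event.value
    return new_state


def apply_change_count(state: Counter, event: Event) -> Counter:
    """A view added later: how often each user changed their address."""
    new_state = Counter(state)
    if event.kind == "user_changed_email_address":
        new_state[event.user] += 1
    return new_state


def replay(events, apply, initial):
    """Rebuild a view by applying every stored event in order."""
    state = initial
    for event in events:
        state = apply(state, event)
    return state


LOG = [
    Event("user_signed_up", "ann", "ann@example.com"),
    Event("user_signed_up", "bob", "bob@example.com"),
    Event("user_changed_email_address", "ann", "ann@work.example"),
]

if __name__ == "__main__":
    print(replay(LOG, apply_email, {}))
    print(replay(LOG, apply_change_count, Counter()))
```

Feeding a fixed log into `replay` is also the isolation test: no other service needs to be running to check that a view comes out right.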

The problem with working with Event Sourcing is that it requires a major shift in thinking for developers and it takes committed technical leads to work with their teams to effect that shift.
posted by WaylandSmith at 11:49 PM on November 2, 2020 [6 favorites]


lay a useful event sourcing resource on me.
posted by j_curiouser at 11:57 PM on November 2, 2020 [2 favorites]


There is no substitute to the effortful application of cognitive power to a problem

despite the combined efforts of methodology marketers everywhere to pretend for the last several decades that this isn't so.
posted by flabdablet at 1:10 AM on November 3, 2020 [3 favorites]


If people are able to create and deploy software quickly even if it isn't the way you intended and they're able to do so with a high degree of confidence, that's a sign of a good software architecture.

I'd qualify that as "well-informed confidence".

The trouble with using churning chaos as the model that underlies all the rest of the architecture is that it occasionally breeds emergent eldritch horrors about which nobody can be well-informed.
posted by flabdablet at 1:18 AM on November 3, 2020 [2 favorites]


My recollection of the first time I came across the expression "microservice" was when it was used to describe something that was small enough to throw away and rewrite in an afternoon if you got it wrong. That was it, that was the whole point. It was small enough that you could visually inspect the code for correctness; testing it should be trivial because you can just run it on some input and see that it produces the correct output (and yes, you can automate it, but you probably don't need to because it's so small); maintenance is "oh no I broke the arbitrary 100 line limit now I need two services"; and there's a limit to how many build-time dependencies you can aggregate in a script that size so builds are just that much less of a hassle.

Things have changed.
posted by regularfry at 3:01 AM on November 3, 2020 [4 favorites]


>lay a useful event sourcing resource on me.
Martin Fowler: Event Sourcing

There are a bunch of smart ideas that go with Event Sourcing that come together into more-than-the-sum-of-their-parts:
*Services deal with a sequence of events and you make each component specialised to do work in response to a specific event -- that's as granular as your microservices need to be
*Work begins when components read the next event in the queue or restart reading from the queue from the highwater mark -- system state has to keep a track of how far along the event queue your system has progressed, which is helped when you also leave some capacity to keep old events for a week/fortnight in case you need to reprocess them (and your industry might need you to capture some state for longer-term compliance reasons: think in terms of the sequence of events as history itself)
*Restarting a crashed component or app or disaster recovery of a system is just resuming consumption of events from the queues
*Components only consume data that matches schemas and fail bad event data to error queues (for which we need to manage migrations between compatible schemas where they overlap)
*When components are done, they only publish data that matches a schema -- and if you're breaking that promise, it's easy to see which components are failing and to unwind the impact of their bad data output: replace the defective component and resume consuming the queues from before the bad data was created
*Immutable data: components create atomic changes by reading current-highwater-state and only writing when concrete actions are complete. This also eliminates race conditions, side effects, locking or dependency inversion and adds a bunch of performance gains
*Idempotency of changes (gold star achievement if you can architect this): running the same changes multiple times does not create multiple outputs -- one input causes only one output so that replaying (a correct input) over and again wastes time but doesn't affect subsequent event flow
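
Two of the points above, the high-water mark and idempotency, can be sketched together. This is an in-memory illustration under stated assumptions: the class name, the event fields and the in-process log are all hypothetical, and a real system would persist the checkpoint and the applied ids alongside the state.

```python
class BalanceConsumer:
    """Consumes an append-only event log and tracks per-account totals."""

    def __init__(self):
        self.high_water = 0        # offset of the next unread event
        self.applied_ids = set()   # ids already applied, for idempotency
        self.balances = {}

    def handle(self, event):
        """Apply one event; a repeated id is ignored and returns False."""
        if event["id"] in self.applied_ids:
            return False
        account = event["account"]
        self.balances[account] = self.balances.get(account, 0) + event["amount"]
        self.applied_ids.add(event["id"])
        return True

    def run(self, log):
        """Resume from the high-water mark and consume to the end of the log."""
        for offset in range(self.high_water, len(log)):
            self.handle(log[offset])
            self.high_water = offset + 1


def demo():
    log = [
        {"id": 1, "account": "a", "amount": 50},
        {"id": 2, "account": "a", "amount": -20},
        {"id": 3, "account": "b", "amount": 10},
    ]
    consumer = BalanceConsumer()
    consumer.run(log[:2])      # process the first two events
    consumer.high_water = 0    # simulate a restart that lost the checkpoint
    consumer.run(log)          # events 1 and 2 are redelivered, 3 is new
    return consumer


if __name__ == "__main__":
    print(demo().balances)
```

Redelivery after the simulated restart costs time but leaves the balances unchanged, which is the property the last bullet describes.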

Paying attention to all of these together will help manage distributed systems. It will make it easy to arrive at eventual consistency and to reason about what's wrong when your system isn't consistent.
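
The schema point in the list above (consume only data that matches a schema, fail the rest to an error queue) might look roughly like this. The schema, the field names and the list-based "queues" are assumptions for illustration; a real system would more likely use a schema registry or a validation library.

```python
SCHEMA = {"id": int, "account": str, "amount": int}  # hypothetical schema


def matches_schema(event, schema=SCHEMA):
    """True if event is a dict with every required field of the right type."""
    return isinstance(event, dict) and all(
        isinstance(event.get(field), expected)
        for field, expected in schema.items()
    )


def route(events, schema=SCHEMA):
    """Split incoming events into accepted ones and an error queue."""
    accepted, error_queue = [], []
    for event in events:
        target = accepted if matches_schema(event, schema) else error_queue
        target.append(event)
    return accepted, error_queue


if __name__ == "__main__":
    incoming = [
        {"id": 1, "account": "a", "amount": 50},
        {"id": 2, "account": "a"},                    # missing amount
        {"id": "3", "account": "b", "amount": 10},    # id has the wrong type
    ]
    good, bad = route(incoming)
    print(len(good), "accepted,", len(bad), "sent to the error queue")
```

Keeping the rejected events, rather than dropping them, is what makes it possible to fix the producer and reprocess later.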

Regarding the full article: I like that it cited Conway's Law and the 'reverse Conway maneuver' was new to me. Just saying hashtag-microservices is blind to the nuance hidden by doing it well and, like so much of the Dunning-Kruger space in computing, people are ignorant of the need to bake in this stuff early for maximal gain and to learn from other people who've done it well.
posted by k3ninho at 3:45 AM on November 3, 2020 [5 favorites]


As I recall someone saying on the internet, and I have repeated a lot in a professional capacity - microservices are like a monolith but now you have distributed partial deployments and distributed error tracing...

better be worth the hassle, but alas much of the biz is folks with "joe's shoe store.com" size problems and they want to use facebook.com size solutions
posted by thedaniel at 4:37 AM on November 3, 2020 [8 favorites]


At work I always joke that microservices were the result of some CIO hearing that there was a Single Point Of Failure in a system, and demanding that there be made Multiple Points Of Failure, instead!
posted by rum-soaked space hobo at 5:11 AM on November 3, 2020 [17 favorites]


the 'reverse Conway maneuver' was new to me.

Team Topologies, published by the folks who brought you The Phoenix Project, is a somewhat dry but fascinating read on the entire concept.

Summed up, if your software reflects your team communication structure, then you should approach your team structure with care and intentionality. And for all that is holy don't let HR drive step 0 of your software architecture by handing out team assignments.
posted by Nonsteroidal Anti-Inflammatory Drug at 5:20 AM on November 3, 2020 [6 favorites]


Summed up, if your software reflects your team communication structure, then you should approach your team structure with care and intentionality.

Your software components should be trusted, and therefore allowed to run efficiently and generally undisturbed; they should never be made to spend time repeatedly communicating things that they don't need to share just to make someone feel important.
posted by Cardinal Fang at 6:02 AM on November 3, 2020 [7 favorites]


Software is hairy. Choose your style, sure, but the hair has to be combed.
posted by skippyhacker at 6:29 AM on November 3, 2020 [5 favorites]


Granted, I rarely have to deal with microservices as a design paradigm in my current role, but I thought one of the key advantages for microservices was to be able to create elastic scaling of various components in a potential way of creating greater efficiencies rather than scale up large monolithic instances that might or might not be able to grow or contract dynamically, or that might require larger, more expensive instances rather than small container-based or serverless microservices. That being said, I think the requirements for the web-scale companies are significantly different from what exists in most enterprise environments.
posted by vuron at 6:47 AM on November 3, 2020 [1 favorite]


better be worth the hassle, but alas much of the biz is folks with "joe's shoe store.com" size problems and they want to use facebook.com size solutions
The irony is that actual Facebook.com is 100% dedicated to using a monolith and pretty happy about it.
posted by migurski at 7:24 AM on November 3, 2020 [6 favorites]


The irony is that actual Facebook.com is 100% dedicated to using a monolith and pretty happy about it.

Monolithic architecture is great if you've got the cash for it. The cost differential for large-enterprise apps is like 10X.
posted by The_Vegetables at 7:35 AM on November 3, 2020


Monolithic architecture is great if you've got the cash for it. The cost differential for large-enterprise apps is like 10X.
It sounds like you're claiming that monolithic apps cost 10✕ more to create than microservice-based apps, but that doesn't jibe with my intuition or experience. Any citations you can share?
posted by ArmandoAkimbo at 8:01 AM on November 3, 2020 [2 favorites]


I thought one of the key advantages for microservices was to be able to create elastic scaling of various components in a potential way of creating greater efficiencies rather than scale up large monolithic instances

It Depends(tm). If you've got your traffic routing sorted out, you can do that sort of elastic scaling with a monolith too. You might want to say "these instances are for the homepage, those instances with their own scaling rules will serve the profile pages." Whether that makes sense depends to a large degree on what the constant factors are, but a reasonable first guess is that unexecuted code has no runtime cost. But that's a lot less shiny than saying We're Going To Rewrite Everything And Definitely Not Get It Wrong This Time.
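
A rough sketch of that routing idea, assuming the same monolith build runs in every pool and only the traffic differs. The pool names, path prefixes and scaling numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    name: str
    min_instances: int
    max_instances: int
    target_rps_per_instance: int

    def desired_instances(self, current_rps: int) -> int:
        """Instances needed for the load, clamped to the pool's limits."""
        needed = -(-current_rps // self.target_rps_per_instance)  # ceiling division
        return max(self.min_instances, min(self.max_instances, needed))


# Every pool runs the same monolith; the router decides which pool sees which paths.
ROUTES = [
    ("/profile", Pool("profile", min_instances=2, max_instances=20, target_rps_per_instance=100)),
    ("/", Pool("homepage", min_instances=4, max_instances=50, target_rps_per_instance=250)),
]


def pool_for(path: str) -> Pool:
    """Return the first pool whose prefix matches; '/' acts as the catch-all."""
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    raise LookupError(f"no pool for {path!r}")


if __name__ == "__main__":
    pool = pool_for("/profile/42")
    print(pool.name, pool.desired_instances(950))
```

Each pool scales on its own rules even though nothing was split at the code level.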
posted by regularfry at 8:32 AM on November 3, 2020 [2 favorites]


It sounds like you're claiming that monolithic apps cost 10✕ more to create than microservice-based apps, but that doesn't jibe with my intuition or experience. Any citations you can share?

No, the cost of development is pretty similar either way, but the cost of hardware to support enterprise-sized monolith apps is dramatically more expensive. Distributed computing, with smaller more discrete functions (ie: not monolithic) is much cheaper and more scalable. And if you count that data centers are basically obsolete as soon as they are built, why back yourself into that corner?
posted by The_Vegetables at 9:56 AM on November 3, 2020


> Your software components should be trusted

At some point along the journey from garage startup to major Internet cloud company that runs customer-supplied code, where you want your customers to trust you, that stops being true. Don't buy into the FUD that "security consultants" are selling to get you to buy their brand of snake oil, but also don't fall into the trap of thinking "it could never happen here".
posted by fragmede at 10:05 AM on November 3, 2020


The best thing about being a retired software engineer is never having to listen to people pitch the latest software fad. I missed the worst of the micro services bandwagon but TBH part of why I left is I saw this disaster coming and it literally gave me crippling anxiety. My brain screamed, “this will end badly,” but there were no receptive ears for that message.
posted by sjswitzer at 11:46 AM on November 3, 2020 [7 favorites]


When someone starts talking about “compensating transactions,” you shut that shit right down.
posted by sjswitzer at 11:48 AM on November 3, 2020 [6 favorites]


When someone starts talking about “compensating transactions,” you shut that shit right down.

I favorited that one SO hard.
posted by bcd at 11:52 AM on November 3, 2020 [1 favorite]


Another dynamic at play is the desire of each team to have a thing to call its own. But the more you get away from fiefdoms and letting architecture mirror the org structure, the better off you will be.
posted by sjswitzer at 12:02 PM on November 3, 2020 [1 favorite]


lay a useful event sourcing resource on me.

I found this talk by Bobby Calderwood about event-sourcing inspiring enough that we implemented a similar system at my then-place of work. He is using Clojure and Datomic but I think he describes the general architecture and rationale quite clearly apart from the language / technology specifics.
posted by whir at 1:41 PM on November 3, 2020 [1 favorite]


Also, to counterbalance the largely sanguine view on event-sourcing in the above talk, this blog post points out some of its downsides and seems realistic to me as someone who has traveled down that path.
posted by whir at 2:47 PM on November 3, 2020 [1 favorite]


So how has no one posted this extremely accurate microservices video yet???

SO UNTIL OMEGASTAR GETS THEIR FUCKING SHIT TOGETHER WE'RE BLOCKED!

The irony is that actual Facebook.com is 100% dedicated to using a monolith and pretty happy about it.

I don't think that's true, unless you mean using a monorepo; microservices and a monorepo are not mutually exclusive, as Google demonstrates. I'm sure FB has lots of independent services like video transcoding or some other offloadable async processes.

"Extreme scenario of deployment-side splits only" is basically how the big G operates and honestly I think it's a good model. Coordinating builds across multiple repos? ugh no way, is there really anyone smart enough to figure out how to make that work?

But I'm not sure how microservices is even a debate anymore (and this article is not just about the notion of microservices, it's about mapping repos to deployment units). In my mind "microservices" describes having multiple small deployment units and everybody does that right? We don't have object brokers anymore thankfully.

But I mean I guess it depends on your business. I suppose there are companies out there where their product is a single deployable. Like any small SaaS company could probably work that way. Sure. But if you have one deployable, don't use multiple repos. I mean, ugh, why?
posted by GuyZero at 5:05 PM on November 3, 2020 [7 favorites]


The irony is that actual Facebook.com is 100% dedicated to using a monolith and pretty happy about it.

This isn't true, except in two narrow senses which I think both have their own flaws.

If you are conflating monorepo and monolith, then yes, sort of, except the point of the article is about not doing that, and in any case Facebook has a half-dozen repos it uses. If you mean the front end is a monolith, then I suppose, but focusing just on one part of the system to make a rhetorical point here seems a little unfair - the front end depends on hundreds of other services which are independently deployed and managed.

having a ready answer when the thinking gets tough is a soothing lie that just moves complexity about. There is no substitute to the effortful application of cognitive power to a problem.

This is true in some trivial sense, but note that the article is essentially a discussion of techniques and technique selection, of which microservices are one. Saying "techniques are no substitute for thought" is fine, but they're not actually in opposition, any more than you're going to find a carpenter using their mind to hammer nails. Expert craftspeople are experts because they know a wide variety of techniques and they know when to apply them.
posted by inkyz at 6:35 PM on November 3, 2020 [3 favorites]


When all you have is microservices, everything looks like wet particleboard.
posted by flabdablet at 9:43 PM on November 3, 2020


When someone starts talking about “compensating transactions,” you shut that shit right down.

Can you elaborate on that? What's the pitfall here?
posted by lovelyzoo at 4:06 AM on November 4, 2020


I have to say thanks, GuyZero, that video is fantastic!
posted by spbmp at 6:43 AM on November 4, 2020


As one of the comments on the video says, it's not comedy, it's a documentary. I have had conversations that are so so much like that one. Thankfully I usually get to be the product manager-ish person.
posted by GuyZero at 9:44 AM on November 4, 2020


I'm a mobile dev, not a backender, so take that into account.

But the thing is that Ms (Microservices) have advantages. In terms of decoupling stuff, in terms of updating parts whilst running, in terms of security.

I do not want to derail this architectural topic in any way but I have found the posts the Star Citizen/RSI backend/technical staff made a while back fascinating in this regard.

And it has made me think about using those patterns in mobile architecture, too, for a while now. Which is funny, because mobile is going that direction with all the decoupled modules.

A good architecture IMO is the Hexagon/Ports and Adapters pattern: decoupling of concerns using defined interfaces.
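
For anyone who hasn't met the pattern, a minimal sketch follows. The `Notifier` port, both adapters and `SignupService` are invented names for illustration; the core logic depends only on the interface, so adapters can be swapped without touching it.

```python
from typing import Protocol


class Notifier(Protocol):
    """Port: the only thing the core knows about sending messages."""

    def send(self, recipient: str, message: str) -> None: ...


class ConsoleNotifier:
    """Adapter that prints; stands in for an email or push gateway."""

    def send(self, recipient: str, message: str) -> None:
        print(f"to {recipient}: {message}")


class InMemoryNotifier:
    """Adapter for tests: records messages instead of delivering them."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, recipient: str, message: str) -> None:
        self.sent.append((recipient, message))


class SignupService:
    """Core logic: depends on the port, never on a concrete adapter."""

    def __init__(self, notifier: Notifier) -> None:
        self.notifier = notifier

    def sign_up(self, email: str) -> str:
        normalized = email.strip().lower()
        self.notifier.send(normalized, "Welcome!")
        return normalized


if __name__ == "__main__":
    SignupService(ConsoleNotifier()).sign_up("Ann@Example.com")
```

Whether the adapter behind the port lives in the same process or in another container is then a deployment decision rather than a code change.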

I'm not too sure (mobile dev, not backend!), but that's kinda what all those Docker containers and Kubernetes etc allow you to do, in a fashion where things are hotswappable.

Of course, I have seen it go off the rails, too: I've seen cases where it seems almost every bloody method got containerised!

So, yeah, maybe Microservices ad absurdum are bad. Services, however? Seem like a good idea to me!
posted by MacD at 4:36 PM on November 4, 2020


I will go ahead and be That Idiot: What's a microservice?

Or is this a case of, "if you're not a programmer/web dev person, you'd get so lost trying to understand the terminology that the actual arguments would be meaningless?"
posted by ErisLordFreedom at 5:31 PM on November 4, 2020 [1 favorite]


Or is this a case of, "if you're not a programmer/web dev person, you'd get so lost trying to understand the terminology that the actual arguments would be meaningless?"

Think of Facebook and all you can do on it: post on your timeline, browse photos, message people, etc. Now think of how to build things. If you're not a programmer it might not make sense but think of it very loosely like building a company. You can have one person make the product, do the billing, do the shipping and do the marketing or you could break it up. I don't care about the product I just care about shipping it, or I don't care about how it is built or how it is shipped I just care about what it does so I can market it.

Same idea, a micro service means, again very broadly, you can have a service that just does everything related to timeline and one that just does messenger. Or in Microsoft Office you don't have one application that tries to do everything but you have Word, Excel and Powerpoint.

This works really well for certain problem sets and web development in particular. If the messenger application needs a lot more resources you can say it needs to scale and you can scale that independently of the timeline application. You can add 1,000 more engineers and are generally satisfied it won't impact timeline or add another data center and it won't impact timeline.

It corresponds to the Unix philosophy of do one thing and do one thing well.

Now what has happened is that you get into a meta-argument about the nature of microservice and when to break it up. I think what others are arguing is that if you design software correctly you don't need to really think about breaking it up at all, it should naturally evolve into a microservice if developed properly. I haven't experienced this personally but it seems a lot of people are developing microservices for the sake of microservices and creating a system of things that instead of being independently robust are confusingly interconnected and fragile while seeming robust. Worst of both worlds.
posted by geoff. at 6:18 PM on November 4, 2020 [4 favorites]


I'd love to see a worked example of Event Sourcing, because from where I sit it looks crazypants, and folks have actually worked with it.

First, I like workflow systems, I like publish/subscribe queues. I'm taking the Martin Fowler piece as not just putting a name on that but talking about something different, and finer-grained, from his "every change to the state of an application is captured in an event object."

Not every memory mutation surely, so let's say I take this to the level of RPCs. Basically this means 1) I'm logging full RPC request payloads to durable storage, and 2) I'm functionally broadcasting RPCs to all listeners, and, if I get the model, an RPC has a void return; instead I listen for a broadcast response from my expected counterparty.

First off, in my world, RPC request payload bandwidth is very high. Just getting it to local storage can bog things down. (Often NIC read is faster than SSD write.) Getting that to durable replicated storage suitable for auditing and probably you want globally consistent time ordering, yikes. And after that, the cost to keep it -- the business value of my work items doesn't justify SSD(/disk)-years to store the requests involved. OK, I guess these are not driving factors where this approach is being considered.

Okay, if you can afford it, then as software architecture: I can see that if you have a compelling need for the replay and/or broadcast behavior, then you would build around that, but it looks like a hard way to live. If you don't need replay/broadcast, you just want logging, logging is great, but just log your RPC wire payloads?

Replay: can be valuable. But to pay for this you are taking on an unbounded version compatibility window, yeah? Services talking RPC can be different versions, but usually you can enforce that they're not wildly different. Stored data is forever. If you want to replay old events, you've got to keep support for old events. Future-friendly design of an RPC interface is already hard. Invented rule of thumb I'll pretend Fred Brooks said: future-friendly design of a value to be stored forever is 10x harder.
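
One common way to contain that compatibility window is to "upcast" stored events to the current shape at read time, so handlers only ever see the newest version. This is a sketch of that technique, not something from the Fowler piece; the version numbers and field names are hypothetical.

```python
CURRENT_VERSION = 3


def v1_to_v2(event):
    """v1 stored one 'email' string; v2 stores a list under 'emails'."""
    data = dict(event["data"])              # copy, so the stored event is untouched
    data["emails"] = [data.pop("email")]
    return {**event, "version": 2, "data": data}


def v2_to_v3(event):
    """v3 adds a 'locale' field; old events get an explicit default."""
    data = {**event["data"], "locale": "unknown"}
    return {**event, "version": 3, "data": data}


UPCASTERS = {1: v1_to_v2, 2: v2_to_v3}


def upcast(event):
    """Apply upcasters one step at a time until the event is current."""
    while event["version"] < CURRENT_VERSION:
        event = UPCASTERS[event["version"]](event)
    return event


if __name__ == "__main__":
    old = {"kind": "user_signed_up", "version": 1, "data": {"email": "ann@example.com"}}
    print(upcast(old))
```

The cost described above does not go away: every schema change adds an upcaster that has to be kept for as long as the old events are.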

Broadcast: I just don't know. Haven't tried it. The integration problem *seems* hard. Either you interact in a 'nominal acquaintance' style where you don't talk to parties you haven't identified as known, or in a 'structural acquaintance' style where you'll listen to anybody who talks the right way. With the former the benefit of it all is unclear to me. With the latter... I dunno, integration testing with known peers too hard for me already.
posted by away for regrooving at 12:51 AM on November 5, 2020 [1 favorite]


If you are conflating monorepo and monolith, then yes, sort of, except the point of the article is about not doing that, and in any case Facebook has a half-dozen repos it uses. If you mean the front end is a monolith, then I suppose, but focusing just on one part of the system to make a rhetorical point here seems a little unfair - the front end depends on hundreds of other services which are independently deployed and managed.
“Microservices” has a distinct meaning from services generally that you're confusing. Any system with complex requirements will have differently-shaped pieces for subsets of tasks: a web tier for serving end-user requests quickly, some kind of job queue for asynchronous or scheduled things, hosts with particular hardware for the vector & matrix operations needed for AI/ML, etc. Everyone does this. Microservices are something very different: splitting services early and often across a network and traversing those splits in the synchronous path with HTTP sub-requests are some defining characteristics. Facebook does not do this.
posted by migurski at 7:41 AM on November 5, 2020


I'd love to see a worked example of Event Sourcing, because from where I sit it looks crazypants, and folks have actually worked with it.

I know of HFT and other trading desks using Event Sourcing from ~2010 because they inherently included auditing. So I don't know what you do specifically but they were definitely in use with high volume, low latency environments. I imagine in such scenarios space isn't a concern.

Usually events aren't kept forever but for only a couple of months. I have seen events kept forever that lead to really cool use cases. If you could replay everything a user does in app and want to add analytics it is sometimes easier to just do that.

As far as crazypants goes, I guess it is like micro-services in that it is really powerful and can be abused. Like I think of "curl" as a general example of a great micro-service type program. If it were split up into curl-ftp, curl-http, curl-https that'd be abuse of the micro-service architecture. So:

user_signed_up
user_changed_email_address
user_reset_password

Might all be valid, in fact if you did it right, really user setting an email address for the first time and a user changing one is kinda the same function so you only need to write that functionality once. Similar to a method call. But if I wanted to hook in to user_signed_up and send an email, and later send them a text message or whatever I just need to listen for that event. I don't care if they sign up from a kiosk or website. Again, while event sourcing doesn't mean a message bus, usually the case is some sort of messaging infrastructure around it. Though a website or Javascript is pretty event driven with onClick and async calls.
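
The "just listen for that event" idea, sketched with a tiny in-process bus. The `EventBus` class and the handlers are hypothetical, and delivery here is synchronous for simplicity; a real setup would usually put a message broker in between.

```python
from collections import defaultdict


class EventBus:
    """Minimal publish/subscribe: handlers register per event name."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, kind, handler):
        self._handlers[kind].append(handler)

    def publish(self, kind, payload):
        """Deliver the payload to every handler registered for this event."""
        for handler in self._handlers[kind]:
            handler(payload)


emails_sent = []
texts_sent = []


def send_welcome_email(payload):
    emails_sent.append(payload["email"])


def send_welcome_text(payload):
    # Added later without touching the publisher or the email handler.
    texts_sent.append(payload["phone"])


bus = EventBus()
bus.subscribe("user_signed_up", send_welcome_email)
bus.subscribe("user_signed_up", send_welcome_text)

# The publisher does not care whether this came from a kiosk or a website.
bus.publish("user_signed_up", {"email": "ann@example.com", "phone": "555-0100"})
bus.publish("user_reset_password", {"email": "ann@example.com"})  # no listeners yet

if __name__ == "__main__":
    print(emails_sent, texts_sent)
```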

It does force you to think async for everything but that's not necessarily a bad thing and actually a good paradigm to think in.

In short the issues you raised I've never seen. In my mind even extremely high volume things like:

user_deducted_money
user_added_money

Would need something resembling event sourcing or auditing around that.
posted by geoff. at 12:21 PM on November 5, 2020


@geoff. Thanks for the breakout. If my naive calculation works, Nasdaq runs 20M - 30M trades/day, so a few hundred trades/sec (really so low?) and even big HFT players must be doing like 10 trades/sec. Of course, the analysis to find favorable moves must be higher-qps, but I can't begin to estimate from here. If it's that kind of volume, then yeah, no worries about saturating an SSD write bandwidth.

In an audit-intensive environment -- where maybe you want to know there's no way you forgot a log write, the execution is the log -- I can definitely see the appeal. And if you have a high money-per-computation ratio so you can throw money at a fast durable consistent storage system.

Interesting, traders' audit records only go back a few months? No idea how long a window the SEC has to come calling, but I know of SOX records getting kept a long time, and I assume trading has to be squarely in SOX scope.
posted by away for regrooving at 11:12 PM on November 5, 2020


(Er, the Nasdaq trading day isn't 24 hours long so throw another constant factor on the pile.)
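
The back-of-envelope numbers can be checked in a few lines. The 20M to 30M trades/day range is the one quoted above; the 6.5-hour figure is an assumption based on the regular 9:30 to 16:00 session and ignores extended hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60           # 86,400
SECONDS_PER_SESSION = int(6.5 * 3600)    # 23,400, assumed regular session


def per_second(trades_per_day: int, seconds: int) -> float:
    """Average trades per second if spread evenly over the given window."""
    return trades_per_day / seconds


if __name__ == "__main__":
    for trades in (20_000_000, 30_000_000):
        naive = per_second(trades, SECONDS_PER_DAY)
        session = per_second(trades, SECONDS_PER_SESSION)
        print(f"{trades:,} trades/day: {naive:.0f}/s over 24h, {session:.0f}/s over the session")
```

That works out to roughly 230 to 350 per second averaged over 24 hours, or about 850 to 1,300 per second over the shorter session: a constant factor of under 4, and still an average that says nothing about peaks.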
posted by away for regrooving at 11:14 PM on November 5, 2020




This thread has been archived and is closed to new comments