For experienced DevOps people some of the choices will be painful
June 15, 2022 2:44 AM

It can't be any simpler than this: you have some bash script that you want to be executed from time to time. In my case, I have some background processes I need to kill. Nomad is a well-known workload orchestrator. I have decided to automate my homelab cluster using it. I will through this blog post try to walk you through some discoveries I made on the way during the previous couple of months.
posted by geoff. (31 comments total) 12 users marked this as a favorite
 
Nothing makes me feel more "how do you do, fellow developers?" than reading about DevOps tooling. I can barely distinguish earnest, helpful articles like this one from satire.
posted by jedicus at 8:35 AM on June 15, 2022 [8 favorites]


To be clear, I'm Steve Buscemi's character in that scenario.
posted by jedicus at 9:02 AM on June 15, 2022 [3 favorites]


I sometimes wonder how many people don't quite realize that all these obscure terms, like cron or rsync or bindiff or curl or jq and so forth, aren't just weird descriptive shibboleths that outline the general shape of a problem you might have but actual, existing, proven tools that exist to solve those specific problems.

I mean, this has to be satire, but to the author's credit I actually laughed. "It can't be any simpler than this: you have some bash script that you want to be executed from time to time."

What could be simpler than using nomad to run a cron job, you ask? My friend, you're going to lose your damn mind when you find out about ... cron.

"Obviously, the Tailscale agent is installed and fully operational at this point, but I used Ansible to set up that part before I even tried running any Nomad job on the machine."


I mean, sure, obviously. For installing a drawing program. Who wouldn't want that?
posted by mhoye at 9:18 AM on June 15, 2022 [7 favorites]


That reminds me of the old classic about SOAP.

Microservices and complex containerized deployments are the only way forward when you need them, but good lord--everyone deploys everything like this nowadays, and if I ask them why they chose a microservice architecture for something that could've been deployed as a handful of larger services in autoscaling EC2 instances, I get blank looks and expect them to answer, "But microservices go to eleven."
posted by Ickster at 9:21 AM on June 15, 2022 [5 favorites]


I mean, this has to be satire, but to the author's credit I actually laughed.

I noped out at the very start and didn't get far enough to recognize it as satire. But I'm also 114% burned out on what the kids are doing with DevOps these days. In my day we just called it "automated deployment" and we wrote all our glue in Perl and wore onions on our belts. For certain values of onion, anyway.
posted by fedward at 9:46 AM on June 15, 2022 [8 favorites]


I noped out at the very start and didn't get far enough to recognize it as satire.

So much of the devops toolbox is right on the razor's edge of engineering self-parody anyway now that it's hard to ever know for sure.

I actually blame Unix and Apache for a lot of this. Moxie Marlinspike keeps saying that nobody wants to run their own server, but that's just not true. People run their own servers all the time, all day every day. Phones, cars, thermostats, televisions, baby monitors: people run the hell out of their own servers.

What people don't want to do is "Deal with Unix/Apache", because dealing with Unix and Apache is such a shitshow that Amazon can make one of the biggest businesses in the world selling you CPU cycles and packet transmission at a 10,000% markup, and companies like Heroku can sell you services on top of that at another 1,000% markup. Everyone says they want to scale and that this is the way they need to scale, but what people really want is to not have to deal with Unix or Apache.

And ... look, it's hard to blame them.
posted by mhoye at 10:18 AM on June 15, 2022 [8 favorites]


This is my world. I became a DevOps lead/architect/director/whatever when folks realized that I understood both Dev and Ops worlds and could translate the languages and needs between the two. (This is where being a long term UNIX geek comes in handy)

But yes, so much of DevOps stuff is "let's take all of this stuff and put it in a web gui and interconnect via other protocols". Handy in a lot of ways, but good lord does it make it hard to understand what's happening at the core.
posted by drewbage1847 at 10:26 AM on June 15, 2022 [3 favorites]


The emergence of Development Operations as a distinct discipline that incorporates some of the lessons learned from "traditional" software engineering while keeping developers as far away from production deployments as possible is a good and important and necessary development. This does not mean we have to immediately adopt whatever new hotness the Amazons and Hashicorps are throwing at us today until we understand how it makes things better.

Where things sometimes go wrong is when decision-makers look at "makes things better" from either a developer-only or DevOps-only perspective. Sometimes the additional complexity of the tooling helps and sometimes it's overkill for the use case, which not only adds a bunch of unnecessary work for the DevOps engineers but can also needlessly complicate the design of the software itself.

For example, I genuinely believe there are use cases and workloads where the complexity of a cloud-hosted Nomad or Kubernetes setup is justified, and I've had conversations with engineers I trust who are working on those kinds of projects. But so many companies and teams, including my own, are working on projects where simply using the cloud provider's own services for orchestration and monitoring is perfectly acceptable, and a lot better when you consider the costs involved in having staff who are capable of keeping a sophisticated cluster setup running.

Now, at some point, does a project in the latter category end up becoming successful and requiring more flexibility? Sure, but if you're doing your job designing the components of the system, you can adapt to a change in the production architecture without having to rewrite everything from the ground up. Premature optimization is just as big of a problem on the DevOps side as it is for developers.
posted by tonycpsu at 10:26 AM on June 15, 2022 [1 favorite]


Oh and never forget that for a number of management types DevOps sounds like "ooh, we can have Devs do Ops work and save money by eliminating positions" which is absolutely barmy. The two fields have radically different priorities - devs make new things that break and ops make things not break
posted by drewbage1847 at 10:29 AM on June 15, 2022 [3 favorites]


"devs make new things that break and ops make things not break"

Speaking of satire.
posted by 3.2.3 at 11:01 AM on June 15, 2022 [2 favorites]


I didn't mean it as satire. Where I came up as a dev it was in advanced projects and worked hand in hand with network and broadcast operations engineers. They thought the devs were always going to deliver product that was going to break communications, broadcast black, get them paged out, etc. The dev teams always thought the ops gang was holding them back and preventing progress.
posted by drewbage1847 at 11:13 AM on June 15, 2022 [3 favorites]


The two fields have radically different priorities - devs make new things that break and ops make things not break

Well, that was the original core insight of the field, and it had very little to do with the tech stacks involved and everything to do with aligning incentives. It turns out people take a very different approach to writing software when they are going to be the person getting paged at Oh My God O’clock when it falls over.

In fact, to a certain view that’s what a lot of these services and stacks are for: playing Pager Hot Potato, to make sure you’re not holding it when it goes off.
posted by mhoye at 11:43 AM on June 15, 2022 [3 favorites]


I likewise noped out pretty early on due to pain avoidance. I was warned but assumed something about that warning that wasn't true.

DevOps, as a culture and practice, seems to be focusing on either building integration or building abstraction. I'm all for a little abstraction when it actually enables improvements to people's lives but as someone well along a journey to Ancient Greybeard status I question the value of what often looks like abstraction for abstraction's sake. There is a lot of that going on in software and infrastructure.

Plugging one thing into another, though? That's automation-enabling work and frequently results in measurable improvement to someone somewhere.
posted by majick at 12:11 PM on June 15, 2022 [1 favorite]


mhoye: ....playing Pager Hot Potato, to make sure you’re not holding it when it goes off.

System administration is all about moving resource bottlenecks...preferably right into someone else's hands.
posted by wenestvedt at 12:36 PM on June 15, 2022 [2 favorites]


This article was satire in 2015, it's satire in 2022, and I can verify that it's aged very well.
posted by morspin at 1:47 PM on June 15, 2022


Yeah I'm in the biz and the complexity, designed to make things more resilient, often seems to create a fragility of its own that is hard to justify when you aren't at a very large scale.
posted by i_am_joe's_spleen at 2:51 PM on June 15, 2022 [1 favorite]


Like, when you need it, you need it, but that's not nearly as often as some people seem to think.
posted by i_am_joe's_spleen at 2:52 PM on June 15, 2022


Yeah, preparing for an easy migration to the next scale is often better planning than preparing to scale per se.
posted by mhoye at 3:11 PM on June 15, 2022


You don't even need cron if you know where it's at.

What bafflegab do I need to spout to get a highly paid admin job to not know the contents of the Unix and Linux System Administration Handbook?
posted by snuffleupagus at 3:35 PM on June 15, 2022 [1 favorite]


You could probably start with the free and pretty comprehensive AWS courses Amazon provides, and then start subscribing to the updates from different high profile, low level utilities like the venerable SQLite, up and coming stalwarts like jq and new contenders like tomnomnom so you can ape the low level perf terminology. After that you should be off to the races.

One of the nice things about this field is that you can absolutely always tell when the people who paid attention in the compiler, functional programming and theory courses show up - they’re the people who solve problems by causing them to stop existing rather than twiddling the knobs on S2 - and they can draw the straight line from how you’re ordering the bits in the data structure down here through why your DB is struggling with some modest load to why your EC2 instance is falling over twice a week and costs 20x what you’ve budgeted, and most of the terminologies you need to be able to have a conversation with them are in there somewhere.
posted by mhoye at 4:10 PM on June 15, 2022 [2 favorites]


Sorry, that should have read “tomnomnom’s gron”.
posted by mhoye at 4:46 PM on June 15, 2022


I've read Spring Boot stack traces top to bottom and so I don't fear a few dozen layers of questionable devops abstraction anymore -- and not because I know it can't hurt me, but because the receptors in my brain that would register fear and loathing have been burned out by Spring.

Now my workself dwells in the emotional void that lies beyond numb.

This is kind of like why Nickelback was no big deal to me. After Matchbox 20, rock and roll was doomed anyway.
posted by Sauce Trough at 6:12 PM on June 15, 2022 [4 favorites]


My generous read is that they were trying to learn the tooling to learn the tooling and really used it for silly use cases. Sure, cron is old and everywhere, but when learning tooling, you sometimes do things that aren't that useful just to learn the mechanics of the toolchain.
posted by advicepig at 5:47 AM on June 16, 2022


...no really, this is just a reskinning of Mornington Crescent, right?

RIGHT?
posted by hearthpig at 5:54 AM on June 16, 2022 [2 favorites]


Sorry, that should have read “tomnomnom’s gron”.

that's ok, I always skip the Bombadil parts too
posted by snuffleupagus at 6:58 AM on June 16, 2022 [3 favorites]


Let me relate a, um, related story From Elsewhere to you:

13:43 [friend] That reminds me of something I just finished this Monday. Our shop is migrating from on-prem Bitbucket Server to the cloudy version of GitHub Enterprise, and en route we're opening the gates to folks using GitHub Actions in addition to our moribund on-prem Jenkins dustbowl-era build farm.
13:45 [friend] Challenge: GitHub Actions are this_is_fine.jpeg from a feature perspective, but when you start using it as an orchestration tool they kind of stomp off into a rake-filled field of their own design
13:46 [friend] Long story short, sometimes you want to add extra permissions to a GHA job for it to act on repos outside of the one which launched the job, and to do that the least awful way is to create a narrowly scoped GitHub App with the necessary permissions attached
13:47 [friend] But to _use_ that permissions scheme, which is GitHub's recommended approach since about 2019, you have to... pull down a reusable GitHub Action that is only supported by an ex-employee who last touched it about a year ago
13:47 [friend] Which: OK, that's not necessarily _terrible_
13:48 [friend] But then you look at the code and discover that it's pulling down un-version-pinned TypeScript and NPM dependencies and you just don't want to deal with the supply chain risk of having all your ephemeral but highly permissioned access tokens exfiltrated the next time a package maintainer lets their domain lapse and loses control of their email address
13:49 [friend] Turns out rewriting 30+kb of TypeScript into two curl calls bookended by openssl and jq doesn't just give you a zero-external-dependencies JWT solution with about ten lines of code, it also turns a 45-second-plus-network-auth-delay step into a 90ms-plus-network-auth-delay step
13:50 [friend] who knew that resolving dependencies with network calls at startup time carried a time cost, hmm, hmmmmm
13:56 [friend] the real cherry on top of this story: the split between the dev repo and the test repo because of those two teams not getting along? They get along fine these days. All the bad blood was between people who are no longer with the company.
Anyway, I am regrettably obligated to inform you, but as you are no doubt already aware: people.
posted by mhoye at 7:06 AM on June 16, 2022 [6 favorites]
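
For readers wondering what a "zero-external-dependencies JWT solution" for a GitHub App might look like, here is a minimal sketch. It is not the friend's script, which the story does not show; it is written in Python rather than curl, openssl and jq, and it assumes the openssl command-line tool is on the PATH. The names build_app_jwt, openssl_rs256 and b64url are made up for this illustration. The claim values follow GitHub's documented rules for App JWTs (RS256, an issued-at time slightly in the past, an expiry no more than ten minutes out).

```python
import base64
import json
import subprocess
import time
from typing import Callable, Optional


def b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def openssl_rs256(key_path: str) -> Callable[[bytes], bytes]:
    """Return a signer that shells out to `openssl dgst -sha256 -sign`.

    Assumes key_path points at the App's RSA private key in PEM form.
    """
    def sign(message: bytes) -> bytes:
        result = subprocess.run(
            ["openssl", "dgst", "-sha256", "-sign", key_path],
            input=message,
            capture_output=True,
            check=True,
        )
        return result.stdout
    return sign


def build_app_jwt(
    app_id: str,
    sign: Callable[[bytes], bytes],
    now: Optional[int] = None,
) -> str:
    """Build a short-lived RS256 JWT identifying a GitHub App (hypothetical helper)."""
    now = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT"}
    payload = {
        "iat": now - 60,   # backdated a minute to tolerate clock drift
        "exp": now + 540,  # stays under the ten-minute ceiling
        "iss": app_id,
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = sign(signing_input.encode("ascii"))
    return f"{signing_input}.{b64url(signature)}"
```

Something like build_app_jwt("12345", openssl_rs256("app-key.pem")) would then yield a token to send as an "Authorization: Bearer" header when POSTing to /app/installations/INSTALLATION_ID/access_tokens, which is where the narrowly scoped installation token comes from. The app ID and key path here are placeholders. Passing the signer in as a function keeps the claim-building logic checkable without a real key.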


a reusable GitHub Action that is only supported by an ex-employee who last touched it about a year ago

At one company our deployments were an unholy mix of Jenkins and Chef. The Jenkins stuff was basically the same in every instance and Chef took care of all the "the same, but in production" needs, to match the way the business worked and understood its compliance requirements. This setup meant that developers had clean hands on deployments to production, but it did have a certain fragility. At one point during our move to AWS the devops people actually had to get developer help rewriting the Chef recipes and I remember the whole thing being a bit of a "these aren't the droids you're looking for" compliance handwave. The developers were hands-off on deployments once they'd helped write the automations, see.

That company was also stuck with mercurial for way too long because of a compliance handwave in the other direction, but that's a different story.

But then you look at the code and discover that it's pulling down un-version-pinned TypeScript and NPM dependencies

I've got a friend here who's a big cheerleader for SBOM and I imagine all the little aneurisms he must have every single day when he encounters such deployments.
posted by fedward at 8:59 AM on June 16, 2022 [1 favorite]


MetaFilter: a rake-filled field of their own design.
posted by wenestvedt at 10:22 AM on June 16, 2022 [2 favorites]


Metafilter: a rake-filled field of their own design.

Well, in the case of GitHub Actions, that's a fair assessment. Everyone thinks they're making a function call, nobody realizes they're taking on a dependency.
posted by mhoye at 12:46 PM on June 16, 2022 [1 favorite]


"rewriting 30+kb of TypeScript into two curl calls"

I am realizing just now that this kind of activity is not unique to my Aging Bull in a China Shop style (cf. comment about needless abstraction). It is invigorating.
posted by majick at 3:38 PM on June 17, 2022 [1 favorite]


A lot of modern DevOps kind of reminds me of recipes from late 1940s/1950s cookbooks.

People had access to lots of foods that their parents didn't have - and copious quantities of the foods! Plus they had all these appliances that their parents didn't have! Jello, refrigeration, OMG OMG OMG! All these things separately are so great, so timesaving, so wonderful! And they put them all together in all sorts of different combinations! And while occasionally a genuinely good dish would result, often enough they'd just end up with half a banana poking out of the middle of a pineapple ring, with whipped cream around the base and a cherry on top. (And that would be the palatable version, because there were recipes that tried to treat this as a savory dish rather than sweet. Think "hotdog" instead of "banana" but everything else was the same.)
posted by Tailkinker to-Ennien at 3:22 PM on June 22, 2022 [1 favorite]


This thread has been archived and is closed to new comments