DevOps: A History
Nell Shamrell-Harrington, a Software Development Engineer at Chef, gave a wonderful talk on the history of DevOps at a recent conference, and was kind enough to share her deck. Some of the highlights:
While DevOps has recently become just another marketing buzzword in many circles, it had a much more meaningful and humble origin. Nell points out that scale was at the very root of what led to DevOps, and provides a great 18th- and 20th-century analogy. In 1785, Honoré Blanc introduced interchangeable gun parts. Previously, guns had been made pretty much entirely by hand, making each one a unique creation. Even if a bunch were made along the same lines, guns were like trees – markedly similar, but different. Now, they could all be the same. Parts became modular entities. A couple (heh) of years later, in 1908, Ford introduced his famous Model T – which became known for being built on a mechanized production line. Suddenly, Nell points out, workers were interchangeable, too. This interchangeability brought great scale to these industries, enabling more final units to be produced than had ever before been possible. The industries had scaled. And they had, to a degree, commoditized, enabling (in Ford’s words) “…the best commodity [to be] produced in sufficient quantity and at the least cost….” Part of this industrialization was reducing the costs of these items, while enabling more of them to be made.
This industrialization didn’t cause a net loss of jobs. Instead, more people with greater skills were required to create the mechanized industrial lines, the part-milling machines, and so on. So the overall skill level – and pay – of those in the industry went up a bit.
Nell quotes a statistic from “The Visible Ops Handbook” that I’d not caught before, and it’s a good one: “High performers” in the industry have a ratio of around 100 servers to 1 admin; average companies run between 15:1 and 25:1. That means even a high-average company, at 25:1, needs four times the labor of a high performer to run the same number of servers. That’s a lot – and it’s where DevOps had its genesis. The mechanized automobile production line, and standardized gun parts, were responses to a problem of scale. Ford could produce dozens of times as many cars, with the same labor force, as his hand-built competitors. He produced more, at a lower cost. DevOps is essentially the mechanized production line of IT Operations – although it’s not just about automation. Standardization, to a degree, is important, as are other factors.
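To put numbers on that ratio, here's a quick back-of-the-envelope sketch. The 1,000-server fleet is a figure I made up purely for illustration, and `admins_needed` is just a hypothetical helper name; only the ratios themselves come from the statistic quoted above.

```python
import math

def admins_needed(servers: int, servers_per_admin: int) -> int:
    """Admins required to cover a fleet, rounding up to whole people."""
    return math.ceil(servers / servers_per_admin)

FLEET_SIZE = 1000  # hypothetical fleet size, chosen only for illustration

# Server-to-admin ratios as quoted from "The Visible Ops Handbook"
RATIOS = {"high performer": 100, "high-average": 25, "low-average": 15}

for label, ratio in RATIOS.items():
    print(f"{label:>14} ({ratio}:1): {admins_needed(FLEET_SIZE, ratio)} admins")
```

At that made-up fleet size, the high performer gets by with 10 admins, while the 25:1 shop needs 40 to run the same number of servers.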
Standardization is something IT Ops has more than a passing familiarity with. We’re famous, for example, for trying to lock down our client configurations so we only have to learn to support the smallest number of scenarios possible. Ford did that, too: he produced a limited number of models, and painted ’em all black. Reducing variety means reducing variables, which means increasing scale more easily and cheaply. But in IT, I personally feel we’ve misinterpreted that approach a bit. I don’t think it means you need to standardize on Linux, or Windows, or one version of those, per se. Instead, I think it means a twofold approach. First, eliminate variables as much as possible. I don’t mean standardize variables, per se – I mean remove them as a factor. Take containers, for example. Their whole point is to abstract away much of the underlying operating system, so the OS matters less. You haven’t necessarily standardized on an OS, you’ve simply ceased to care about it, to a point. Second, for those models (configurations) you do offer, standardize those, and do so in a way that prevents them from drifting from that standard. That’s what Chef, Puppet, DSC, and all those other configuration-management technologies are all about.
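Here's a toy sketch of that "don't let it drift" idea. To be clear, this is not how Chef, Puppet, or DSC are actually implemented, and none of these names come from their APIs: `find_drift`, `converge`, and the settings in the dictionaries are all hypothetical. It only illustrates the general pattern of declaring a desired state, comparing it with what's really on the machine, and putting back whatever has wandered off.

```python
def find_drift(desired: dict, actual: dict) -> dict:
    """Return {setting: (actual_value, desired_value)} for each managed setting that differs."""
    return {
        key: (actual.get(key), want)
        for key, want in desired.items()
        if actual.get(key) != want
    }

def converge(desired: dict, actual: dict) -> dict:
    """Return a copy of the actual state with every managed setting reset to its desired value."""
    corrected = dict(actual)
    corrected.update(desired)  # settings we don't manage are left untouched
    return corrected

# Hypothetical desired configuration for one "model" of server
desired = {"ntp_server": "time.example.com", "ssh_root_login": "no"}
# Hypothetical state found on a machine someone tweaked by hand
actual = {"ntp_server": "time.example.com", "ssh_root_login": "yes", "hostname": "web01"}

print(find_drift(desired, actual))   # the one setting that drifted
fixed = converge(desired, actual)
print(find_drift(desired, fixed))    # nothing left to fix
```

Running `converge` a second time changes nothing, which is the property that makes it safe to apply this kind of check over and over on a schedule.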
Nell then throws around some cool Japanese words and concepts, starting with Lean. It’s a term originally applied to Japan’s automobile manufacturing industry. The age-old problem with mass production is that you can produce a huge quantity of a few things, but consumers tend to want more variety. In the IT world, it’s the same thing: we can standardize, but that reduces the variety our business may well benefit from. Lean was Japan’s answer to mass-producing variety. There’s a great article on this that you should totally read, but here’s the synopsis: Japan figured out how to produce low volumes of product at mass-production efficiencies. Or, in IT terms, how to have the benefits of standardization along with a lot of variety.
Nell’s next word is jidoka, one of the two pillars of the Toyota Production System (just-in-time is the other). Jidoka highlights problems in a system, because when a problem occurs, everything stops. So you tend to build processes that encourage quality from the outset, which helps eliminate the root cause of defects. It’s “automation, with human intelligence.” Imagine a loom, weaving cloth, that could stop magically if a thread broke, rather than churning out piles of defective cloth. Now imagine that Sakichi Toyoda, founder of Toyota, invented such a loom – because he did. One operator could now control many such looms, without the need to manually oversee them all… and now you can probably start to see the IT applicability. They went from one operator per 2-3 looms to one operator per 100 looms… 2:1 to 100:1. Better scale.
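If it helps to see the loom in code, here's a minimal sketch of the stop-on-defect behavior. Everything in it is made up for illustration (`LineStopped`, `weave`, and the list of thread checks), so treat it as one way to picture the idea rather than anything from Nell's deck.

```python
class LineStopped(Exception):
    """Raised when a defect is detected, halting all further production."""

def weave(threads_ok):
    """Weave one row per thread check; stop the whole line at the first broken thread."""
    rows = []
    for position, ok in enumerate(threads_ok):
        if not ok:
            # Stop here rather than churning out defective cloth
            raise LineStopped(f"thread broke at row {position}; {len(rows)} good rows made")
        rows.append(position)
    return rows

try:
    weave([True, True, False, True])
except LineStopped as stop:
    print(f"Stopped: {stop}")
```

The fourth row never gets woven. The operator only has to show up when the line stops, which is what lets one person watch a lot of looms.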
Nell’s deck has a bit of a tangent here, and it’s a good one: don’t raise alerts that don’t require an immediate human action. If the systems can deal with it, let them, and trust them; if they can’t, call in a human. But you can’t just have an operational environment spewing alerts and informationals all the time, because it requires too much human attention with too little human action.
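That rule is simple enough to sketch as a routing decision. The `Event` fields, the `route` function, and the three outcomes below are hypothetical names of my own, not something from any particular monitoring product; the point is only that paging a human is reserved for events that need a human right now.

```python
from dataclasses import dataclass

@dataclass
class Event:
    name: str
    self_healing: bool      # can the system deal with this on its own?
    needs_human_now: bool   # does it require an immediate human action?

def route(event: Event) -> str:
    """Decide what to do with an event: fix it, page someone, or just record it."""
    if event.self_healing:
        return "auto-remediate"   # let the system handle it, and trust it
    if event.needs_human_now:
        return "page"             # the only path that interrupts a person
    return "log"                  # keep it for later review, but don't alert

# Hypothetical events, for illustration only
events = [
    Event("worker process restarted", self_healing=True, needs_human_now=False),
    Event("primary database unreachable", self_healing=False, needs_human_now=True),
    Event("nightly job ran slightly long", self_healing=False, needs_human_now=False),
]

for event in events:
    print(f"{event.name}: {route(event)}")
```

Only one of those three events would wake anyone up.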
Nell’s last Japanese word: Kaizen, the concept that efficiency, processes, and so on should be under continuous improvement. And this really digs into the heart of what was wrong with IT at one point: we tended to ship systems as if they were going to be static and everlasting. But they’re not. Systems that fail to evolve are implicit failures. Continuous improvement and adaptation are the only way a business can thrive. Yet many companies still rely on carved-in-stone “gold master” images for servers and clients, and put off any kind of re-engineering or upgrading as long as possible – often way past the point of extreme pain.
Just-in-time is another favorite Japanese manufacturing philosophy, meaning you get only what you need, when you need it, and only in the amount you need. This really applies to the software development industry. Rather than trying to design, build, test, and ship the Super App, you build just enough. At Pluralsight, our Chief Product Officer calls it “shipping a skateboard… after which we’ll ship a bike, and then a car, and then a plane.” You build just enough to do something useful, and then kaizen, or continue with the next iteration, right away.
This should start to sound like more-agile software development, which was a response to the popular “Waterfall” development model of the 1980s and 1990s. Waterfall tried to build the Super App, and most projects of any size failed in some way. Scrum was the reaction to that, introducing a more agile system that simply tried to ship something usable in a very short “sprint” of time. Agile itself, as a named philosophy and approach, came afterwards, with a goal of continually delivering useful software.
And then Ops got in the way. Developers might have gotten all agile and stuff, but Ops was still stuck in a very Waterfall-style world. Releases had to be meticulously planned, typically after some over-the-wall handoff from Dev. Integration would occur as close as possible to the point where the CEO was about ready to fire someone, leaving no time to fix problems if they cropped up. Ops would blame problems on the code, Dev would blame problems on the infrastructure. Both were wrong, and they were led astray by a culture of blame that’s still all too common.
You cannot have DevOps with a culture of blame. They are not compatible. DevOps requires a culture that acknowledges the inevitability of eventual failure, and a commitment to fixing things.
And so now you have a little bit of the history behind DevOps, and some manufacturing-world analogies to help us understand how we got here, and where we’re going. And if you’re thinking to yourself, “well, my company’s culture could never survive this,” you may be right. You may be at the IT equivalent of GM, Ford, or Chrysler in the 1970s, when Toyota and other Japanese manufacturers killed them. And if that’s the case, you may want to wonder why you’re working for a potential loser, rather than finding a winner.