DevOps: Fundamental Answers

What is DevOps?

DevOps was an unclear term at the beginning of the week.

Depending on who you ask, DevOps is:

  • the agile manifesto applied to system administration
  • the applied version of The Lean Enterprise
  • the people who manage tools like Jenkins and containers

Does the DevOps role actually exist?

DevOps is a loosely defined cultural movement, like agile. “DevOps” engineers are often software engineers or operations people who have been rebranded with only minor changes in responsibilities.

Bryan Cantrill said that a DevOps person is either an engineer with extra empathy for the ops team, or an ops person with extra empathy for the engineers.

This is a semantic discussion; what matters more is what happens in practice.

What does it mean for a company to “implement DevOps” where it previously did not have DevOps in its vocabulary?

In these conversations, I tried to focus on the tools used by these “DevOps teams” or “DevOps organizations”. Continuous delivery software and containers kept coming up.

These tools are the software manifestations of DevOps. They allow for fast, comfortable deployment workflows.

In some cases, DevOps is a front-line reliability engineering role.

Financial trading companies with a DevOps team often seat that team between the traders and the engineers. There, the DevOps team handles ad hoc software problems faced by the traders, so that the developers can continue working on new features.

If an organization adopts DevOps, what does that mean for software engineers?

James Turnbull added that DevOps is about “getting to the pub on time” and having a better place to work.

I had a conversation with a DevOps engineer before this week began, and asked him what topics I should cover. He suggested availability and reliability in distributed systems. Distributed systems can fail in complex ways that keep engineers from getting to the pub on time.

Another interpretation is that DevOps is about breaking down silos or “walls of confusion” between development and ops.

What are “immutable infrastructure” and “infrastructure as code”?

From a HighOps Q&A:

1) What’s an Immutable Infrastructure?

A pattern or strategy for managing services in which infrastructure is divided into “data” and “everything else”. “Everything else” components are replaced at every deployment, with changes made only by modifying a versioned definition, rather than being updated in-place.

2) What’s your position on Immutable Infrastructure and why?

Definitely a good idea that all components of a running system should be in a known state. Worth exploring as long as it doesn’t become yet another silver bullet. We still have a lot to learn about how to make it work well and really need to analyse the problem we are trying to solve, which is minimising errors in deployment, configuration, or runtime.

On one hand it’s a myth, because it ignores the actual operating conditions of systems, because servers and network devices are changing all the time (e.g. RAM) but seen as one point on a continuum or plane of configuration options, it can be useful because incremental configuration management tools can never control everything; in particular it’s impossible to be certain that things are absent.
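The “replace rather than update in place” idea from the first answer can be sketched in a few lines of Python. This is a toy model, not any real provisioning tool: the names ServerDefinition, Server, deploy and redeploy are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from itertools import count
from typing import Tuple


@dataclass(frozen=True)
class ServerDefinition:
    """Versioned description of "everything else" (hypothetical model)."""
    version: int
    image: str
    packages: Tuple[str, ...]


@dataclass(frozen=True)
class Server:
    """A running server; frozen, so it cannot be modified after creation."""
    server_id: int
    definition: ServerDefinition


_ids = count(1)


def deploy(definition: ServerDefinition) -> Server:
    """Build a brand-new server from a definition."""
    return Server(server_id=next(_ids), definition=definition)


def redeploy(old: Server, new_definition: ServerDefinition) -> Server:
    """Replace the old server with a new one instead of patching it."""
    if new_definition.version <= old.definition.version:
        raise ValueError("new definition must have a higher version")
    return deploy(new_definition)


v1 = ServerDefinition(version=1, image="base-os", packages=("nginx",))
v2 = ServerDefinition(version=2, image="base-os", packages=("nginx", "openssl"))

web = deploy(v1)
web_next = redeploy(web, v2)  # a different server, built from v2
```

Because both classes are frozen, the only way to change a server in this sketch is to write a new definition and deploy again, which is the property the pattern is after.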

From ThoughtWorks (emphasis mine):

Infrastructure as code, or programmable infrastructure, means writing code (which can be done using a high level language or any descriptive language) to manage configurations and automate provisioning of infrastructure in addition to deployments. This is not simply writing scripts, but involves using tested and proven software development practices that are already being used in application development. For example: version control, testing, small deployments, use of design patterns etc. In short, this means you write code to provision and manage your server, in addition to automating processes.

It differs from infrastructure automation, which just involves replicating steps multiple times and reproducing them on several servers.

With this, the knowledge of server provisioning, configuration management and deployment is no longer only with the systems admins. Even developers can easily engage in the activities, because they can easily write infrastructure code in the languages that they are familiar with. In addition to this, the learning curve for most descriptive languages used by tools like Ansible is not very steep. This makes DevOps even simpler for a developer.
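As a rough illustration of the difference between infrastructure code and a one-off script, here is a minimal Python sketch. The desired state is plain data that can live in version control, and the plan function that compares it to the actual state is an ordinary function that can be unit tested. The function name, the state layout and the service names are all assumptions made for this example; they do not come from Ansible or any other tool.

```python
def plan(desired: dict, actual: dict) -> list:
    """Return the actions needed to move `actual` to `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Desired state: what the versioned definition says should exist.
desired_state = {
    "web": {"package": "nginx", "port": 80},
    "db": {"package": "postgresql", "port": 5432},
}

# Actual state: what is currently running (made up for the example).
actual_state = {
    "web": {"package": "nginx", "port": 8080},
    "cache": {"package": "redis", "port": 6379},
}

actions = plan(desired_state, actual_state)
```

Here actions comes out as create db, update web, delete cache, and planning against a state that already matches the definition returns an empty list. That second property (running it twice changes nothing) is what separates this style from replaying a list of shell commands.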

What precipitated the movement from service-oriented architecture to microservices?

From a SED post about Microservices vs. SOA:

  • Service-oriented architecture (SOA): an architectural pattern in computer software design in which application components provide services to other components via a communications protocol, typically over a network.
  • Microservices: a software architecture style in which complex applications are composed of small, independent processes communicating with each other using language-agnostic APIs.

Martin Fowler has much more to say.

Microservice architectures are more appealing with containerization software because a container is a small imitation of a server. With thicker servers, we tend towards thicker services. With containers, we tend towards thinner services.

Why are containers so important to DevOps?

Dr. Dobbs writes (emphasis mine):

Containers provide a lightweight alternative to virtual machines and they enable developers to work with identical dev environments and stacks. They also facilitate DevOps by encouraging the use of stateless designs.

Once we have stateless applications, we have more flexibility. In particular, we start to think about how we can get “just what’s needed” to underpin our application. In doing so, we can create very small dependency chains, which has several advantages:

  • Less effort to maintain: With the smallest number of dependency updates, there is less to patch and we are less likely to introduce new bugs/patch breaks
  • Lower infrastructure requirements
  • More deployment options, as in virtual machine (VM), physical, or cloud
  • The smallest attack surface possible for hackers to use

We now have stateless development, containerization, and the complementary use of VMs, but what does any of this have to do with DevOps? Note that DevOps is a lot of things to a lot of people. For our purposes, the most important aspect is boosting collaboration between developers and operations for application deployment.

While not normally a combative relationship, friction between the parties can occur when operations staff tries to check the “crazy things” that developers want to deploy. As an old, probably bad, truism of mine goes, “Ops would be happy to never change anything, QA likes a few new features to test, developers want bleeding edge toys every week, and the business wants new stuff now.”

DevOps eases this tension by providing another abstraction layer. If developers are responsible for their applications in production (where “application” means the source code and its dependencies above the OS), then operations people are free to focus on making sure that there is appropriate computing power, fail-over, geographic presence, and everything else that symbolizes a robust, modern, operational environment. This partition of responsibilities helps. When the people who create an application are the ones supporting it, things tend to run more smoothly: When a developer gets a bug call at two in the morning for a couple days running, that bug gets fixed pretty quickly. Operations people are happier because they can now focus on automating their environment, not responding to application outages that they often don’t have the knowledge or skills to fix.

Why is Docker so important to containers?

Docker dramatically improved the usability of containers through its friendly API.

This (and other things) led to network effects and virality:

  1. A developer can figure out what Docker does, install it and do something useful with it in 15 minutes. I first heard this “rule” from Marten Mickos when talking about why MySQL was so successful: low friction to try it out, a simple concept and useful functionality.
  2. Docker is a great name and it has a cute logo. It resonates with what the product does and is easy to remember. Engineering-oriented founders sometimes seem to think that names and logos don’t matter if the product is good enough, but a great name can turbocharge adoption and build a valuable brand.
  3. The Docker product came from a non-threatening source, a small startup (DotCloud) that was able to broadly partner across the whole industry. If the same product had come from an established enterprise technology player, there would have been much more push-back from that player’s competitors, and the market would probably have split into several competing technologies.

Does DevOps apply to using Docker on a Spark/MapReduce analytics platform*?

Leaving aside the word DevOps, Pachyderm is a company built around this idea:

Pachyderm File System (pfs)

Pfs is a copy-on-write distributed file system built to deploy on containerized infrastructure.

Version controlled data

Pfs is a commit-based distributed file system that offers complete version control for your data. Pfs lets you take space-efficient snapshots of your entire cluster so you can track how your data has changed over time and instantly revert back to any previous state. This makes pfs ideally suited for large data sets that change over time — such as production database dumps or log files.

Isolated development environments

Pfs supports branching and merging, just like your VCS tools for code, but on your entire data set! Just give each developer or data scientist a personal branch and it’ll feel like they’ve got the cluster all to themselves. They can manipulate data and develop analysis in complete isolation and then merge it into the main line when completed.

Easy dashboarding

In Pachyderm, all data is accessible via HTTP so you can serve dashboards directly out of the file system. You can even have the results of an analytics pipeline be a dashboard that automatically updates every time new data is committed.
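The commit, branch and revert behaviour described above can be illustrated with a small in-memory model in Python. This is only a sketch of the general idea of commit-based data versioning; the Repo class and its methods are hypothetical and are not Pachyderm's API. Merging is left out to keep the example short.

```python
class Repo:
    """Toy commit-based store: every commit is a snapshot of path -> content."""

    def __init__(self):
        self.commits = {0: {}}          # commit id -> snapshot
        self.parents = {0: None}        # commit id -> parent commit id
        self.branches = {"master": 0}   # branch name -> head commit id
        self._next_id = 1

    def commit(self, branch, changes):
        """Record `changes` on top of the branch head and return the new id."""
        parent = self.branches[branch]
        # Shallow copy: unchanged entries still point at the same content
        # objects, so only the changed paths take new space.
        snapshot = dict(self.commits[parent])
        snapshot.update(changes)
        commit_id = self._next_id
        self._next_id += 1
        self.commits[commit_id] = snapshot
        self.parents[commit_id] = parent
        self.branches[branch] = commit_id
        return commit_id

    def branch(self, name, source):
        """Create a branch that starts at the head of `source`."""
        self.branches[name] = self.branches[source]

    def read(self, branch, path):
        """Return the content at `path` on a branch, or None if absent."""
        return self.commits[self.branches[branch]].get(path)

    def revert(self, branch, commit_id):
        """Point the branch back at an earlier commit."""
        self.branches[branch] = commit_id


repo = Repo()
first = repo.commit("master", {"logs/day1": "raw"})

# A personal branch: changes here do not affect master.
repo.branch("alice", "master")
repo.commit("alice", {"logs/day1": "cleaned"})

second = repo.commit("master", {"logs/day2": "raw"})
repo.revert("master", first)  # master no longer sees logs/day2
```

After this runs, master still reads "raw" for logs/day1 while the alice branch reads "cleaned", and reverting master to the first commit makes logs/day2 disappear from that branch without deleting the commit itself.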

James Turnbull discussed this topic briefly in our interview.

*This question is a buzzword whirlwind.