Site Reliability Management with Mike Hiraga

Software engineers have worked with operations teams for as long as software has been written. In the 1990s, most operations teams worked with physical infrastructure. They made sure that servers were provisioned correctly and had the proper software installed. When software engineers shipped bad code that took down a company's systems, the operations teams had to help recover them, which often meant dealing with the physical servers.

During the 90s and early 2000s, these operations engineers were often called “sysadmins,” “database admins” (if they worked on databases), or “infrastructure engineers.” Over the last decade, virtualization has led to many more logical servers across a company. Cloud computing has made infrastructure remote and programmable.

The progression of infrastructure changed how operations engineers work. Because infrastructure can now be managed through code, operations engineers write a lot more code.

The “DevOps” movement can be seen through this lens. Operations teams were now writing software, and this meant that software engineers could now work on operations. Both software engineers and operators could create deployment pipelines, monitor application health, and improve system scalability, all through code.

Site reliability engineering (or SRE) is a newer point along the evolutionary timeline of operations. Web applications can be unstable sometimes, and SRE is focused on making a site work more reliably. This is especially important for a company that makes business applications that other companies rely on.

Mike Hiraga is the head of site reliability engineering at Atlassian. Atlassian makes several products that many businesses rely on, such as JIRA, Confluence, HipChat, and Bitbucket. Because that infrastructure runs at a massive scale, Mike has a broad range of experience from managing SRE at Atlassian.

One particularly interesting topic is Atlassian's migration to the cloud. Atlassian was founded in 2002, before the cloud was widely used, and it has more recently made a push to move its applications into the cloud.



