Observability Engineering with James Burns


Podsheets is our open source set of tools for managing podcasts and podcast businesses

New version of Software Daily, our app and ad-free subscription service

FindCollabs is hiring a React developer

FindCollabs Hackathon #1 has ended! Congrats to ARhythm, Kitspace, and Rivaly for winning 1st, 2nd, and 3rd place ($4,000, $1,000, and a set of SE Daily hoodies, respectively). The most valuable feedback award and the most helpful community member award both go to Vynce Montgomery, who will receive both the SE Daily Towel and the SE Daily Old School Bucket Hat.

Twilio is a communications infrastructure company with thousands of internal services and thousands of requests per second. Each request generates logs, metrics, and distributed traces, which can be used to troubleshoot failures and improve latency.

Because Twilio is used for two-factor authentication and text message relaying, it is critical infrastructure for most applications that integrate it. The service must remain highly available even during peak application traffic or outages at a particular cloud provider.

At Twilio, James Burns worked on platform infrastructure and observability. James was at Twilio from 2014 to 2017, a period in which the company scaled rapidly. His work encompassed site reliability, monitoring, cost management, and incident response. He also led chaos engineering exercises called “game days”, in which the company deliberately caused infrastructure to fail in order to verify the reliability of failover systems and to discover problematic dependencies.

James joins the show to talk about his time at Twilio and his perspectives on how to instrument and observe complex applications. Full disclosure: James now works at LightStep, which is a sponsor of Software Engineering Daily.
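
As a rough illustration of the three signals mentioned above (logs, metrics, and distributed traces), here is a minimal Python sketch of a single request handler that emits all three. It is not Twilio's or LightStep's implementation: the names `span`, `handle_request`, `METRICS`, and `SPANS` are hypothetical, and the in-memory stores stand in for a real metrics backend and trace collector.

```python
import logging
import time
import uuid
from collections import Counter
from contextlib import contextmanager

logger = logging.getLogger("request")

# Hypothetical in-memory stand-ins for a metrics backend and a trace collector.
METRICS = Counter()
SPANS = []


@contextmanager
def span(name, trace_id, parent_id=None):
    """Record one timed unit of work that belongs to a trace."""
    record = {
        "name": name,
        "trace_id": trace_id,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent_id,
        "error": False,
    }
    start = time.perf_counter()
    try:
        yield record
    except Exception:
        record["error"] = True  # mark the span as failed, then propagate
        raise
    finally:
        record["duration_s"] = time.perf_counter() - start
        SPANS.append(record)


def handle_request(payload):
    """Handle one hypothetical request, emitting logs, metrics, and spans."""
    trace_id = uuid.uuid4().hex
    METRICS["requests_total"] += 1
    try:
        with span("handle_request", trace_id) as root:
            logger.info("request received trace_id=%s", trace_id)
            # Child span: shares the trace ID and points at its parent span.
            with span("validate", trace_id, parent_id=root["span_id"]):
                if "to" not in payload:
                    raise ValueError("missing recipient")
            return {"status": "queued", "trace_id": trace_id}
    except ValueError:
        METRICS["requests_failed"] += 1
        logger.warning("request rejected trace_id=%s", trace_id)
        return {"status": "rejected", "trace_id": trace_id}
```

Because every span and log line carries the same trace ID, a failed request can be followed across the steps it touched, while the counters give an aggregate view of failure rates.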


Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.


Netlify is a modern way to build and manage fast websites that run without the need for addressable web servers. Netlify is “serverless.” It provides automatic forms, identity management, and tools to manage and transform large images and media. Learn more about Netlify’s powerful platform at netlify.com/sedaily.

Flatiron School can teach you the skills you need to build a career that you will love. Flatiron School has immersive programming courses on JavaScript and Ruby–everything you need to become a full stack developer. Flatiron School also has courses on Python, SQL, and machine learning, for data scientists in training. You can learn in-person or online–and you can find everything you need to get started by going to flatironschool.com/sedaily.

Linode enables developers to deploy scalable compute, storage, and networking solutions. Experience our new open source Cloud Manager, explore our robust API and CLI, and run power hungry applications on our Dedicated CPU plans. Visit linode.com/sedaily and sign up today with code ‘sedaily2019’ and get $20 credit towards your new account.

GoCD is a continuous delivery tool created by ThoughtWorks. It’s great to see the continued progress on GoCD with the new Kubernetes integrations–and you can check it out for yourself at gocd.org/sedaily.

Software Weekly

Subscribe to Software Weekly, a curated weekly newsletter featuring the best and newest from the software engineering community.