Uber’s Monitoring Platform with Rob Skillington

Uber manages the car rides for millions of people. The Uber system must remain operational 24/7, and the app involves financial transactions and the safety of passengers.

Uber's infrastructure runs across thousands of server instances and produces terabytes of monitoring data. The monitoring data is used to understand the health of the software systems as well as relevant business metrics, such as driver efficiency, daily revenues, and user satisfaction.

Uber adopted the Prometheus monitoring system to manage their monitoring data. Prometheus regularly scrapes metrics across infrastructure to gather time series data about the state of everything across Uber. As the usage of Prometheus has grown within the company, Uber has had to figure out how to scale their monitoring platform.
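
To make the scraping model concrete, here is a minimal sketch of the kind of plain-text payload a service exposes for Prometheus to scrape. It renders one metric family in the Prometheus text exposition format. The function name, the metric name `rides_completed_total`, and the `city` label are hypothetical and chosen only for illustration; they are not taken from Uber's systems, and label-value escaping is omitted for brevity.

```python
def render_exposition(name, help_text, metric_type, samples):
    """Render one metric family in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs. Label values are
    assumed to contain no quotes, backslashes, or newlines (no escaping).
    """
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} {metric_type}",
    ]
    for labels, value in samples:
        # Sort labels so the output is deterministic.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        if label_str:
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical counter with one time series per city.
text = render_exposition(
    "rides_completed_total",
    "Completed rides (hypothetical metric).",
    "counter",
    [({"city": "sf"}, 1027), ({"city": "nyc"}, 2210)],
)
print(text)
```

Each distinct combination of metric name and labels becomes its own time series once scraped, which is why the volume of monitoring data grows quickly as more services and labels are added.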

M3 is a monitoring system built at Uber to scale Prometheus, providing a platform that scales both data storage and query serving. Rob Skillington is a staff software engineer at Uber, and he joins the show to talk about monitoring at Uber, from the requirements of the system to the implementation of M3.

At Uber, M3 powers dashboards, ad-hoc queries, and alerting. M3 was open sourced to give other users access to a scalable Prometheus solution. In a previous episode with Bryan Boreham, we discussed one strategy for scaling Prometheus. Today’s episode covers another scalability solution: M3.

 

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.


Sponsors

Find a job you love on Hired. Stop job searching and get matched with 10,000+ companies looking for you. Hired combines intelligent matching technology with personalized career coaching to match you with opportunities based on your skills, industry, interests and desired salary. So stop searching and create a profile at Hired.com/sedaily and find a job you truly love.

Digital Ocean is the easiest cloud platform to run and scale your application. Try it out today and get a free $100 credit at do.co/sedaily. Digital Ocean is a complete cloud platform to help developers and teams save time when running and scaling their applications.

Get ready to build content-rich websites and professional web applications with Wix Code. Store and manage unlimited data with built-in databases, create dynamic pages, make custom forms and take full control of your site’s functionality with Wix Code APIs and JavaScript. Plus, now you can get 10 percent off your Premium plan. Go to Wix.com/SED.

Deploy infrastructure faster; simplify life cycle maintenance for your servers; give IT the ability to deliver infrastructure to developers as a service like the public cloud. Go to softwareengineeringdaily.com/hpe and learn about how HPE OneView can improve your infrastructure operations.

Software Daily

Subscribe to Software Daily, a curated newsletter featuring the best and newest from the software engineering community.