Peloton: Uber’s Cluster Scheduler with Min Cai and Mayank Bansal
Google’s Borg system is a cluster manager that powers the applications running across Google’s massive infrastructure. Borg provided inspiration for open source tools like Apache Mesos and Kubernetes.
Over the last decade, some of the largest new technology companies have built their own systems that fulfill the roles of cluster management and resource scheduling. Netflix, Twitter, and Facebook have all spoken about their internal projects to make distributed systems resource allocation more economical. These companies find themselves continually reinventing scheduling and orchestration, with inspiration from Google Borg and their own internal experiences running large numbers of containers and virtual machines.
Uber’s engineering team has built a cluster scheduler called Peloton. Peloton is based on Apache Mesos and is architected to handle a wide range of workloads: data science jobs like Hadoop MapReduce; long-running services such as a ridesharing marketplace service; monitoring daemons such as Uber’s M3 collector; and database services such as MySQL.
Min Cai and Mayank Bansal are engineers at Uber who work on Peloton. When they set out to create Peloton, they looked at the existing schedulers in the ecosystem, including Kubernetes, Mesos, Hadoop’s YARN system, and Borg itself.
Both Min and Mayank join the show today to give a brief history of distributed systems schedulers and discuss their work on Peloton. They have been working in the world of distributed systems schedulers for many years, including experience building core Hadoop infrastructure and virtual machine schedulers at VMware.
Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.