Workload Scheduling with Brian Grant

By SE Daily

Podcast Friday, March 29, 2019

Google has been building large-scale scheduling systems for more than fifteen years.

Google Borg was started around 2003, giving engineers at Google a unified platform for running long-lived service workloads as well as short-lived batch workloads on a shared pool of servers. Since the early days of Borg, the scheduling systems built by Google have matured through several iterations. Omega was an effort to improve the internal Borg system, and Kubernetes is an open source container orchestrator built on the lessons learned from Borg and Omega.

A scheduling system needs to accept a wide variety of workload types and find compute resources within a cluster on which to run them.

The workloads that might be scheduled include batch jobs, stateful services, stateless services, and daemon services. Different workloads can have different priority levels. A high priority workload should be able to find compute resources quickly, while a low priority workload can wait longer for resources.
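To make the priority idea concrete, here is a minimal sketch of a priority-ordered scheduler. It is a toy illustration, not how Borg, Omega, or Kubernetes work internally. The names `Node`, `Workload`, and `schedule`, the CPU-only resource model, the first-fit placement rule, and the convention that a larger number means higher priority are all assumptions made for this example.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class Node:
    name: str
    free_cpus: int


@dataclass
class Workload:
    name: str
    kind: str       # e.g. "batch", "stateless", "stateful", "daemon"
    priority: int   # larger number = more urgent (an assumption of this sketch)
    cpus: int


def schedule(workloads, nodes):
    """Place workloads on nodes, highest priority first.

    Returns (placements, pending): placements maps workload name -> node name,
    pending lists the names of workloads that found no room and must wait.
    """
    order = count()  # tie-breaker so equal priorities keep submission order
    queue = [(-w.priority, next(order), w) for w in workloads]
    heapq.heapify(queue)

    placements, pending = {}, []
    while queue:
        _, _, w = heapq.heappop(queue)
        # First fit: take the first node with enough free CPU.
        node = next((n for n in nodes if n.free_cpus >= w.cpus), None)
        if node is None:
            pending.append(w.name)
            continue
        node.free_cpus -= w.cpus
        placements[w.name] = node.name
    return placements, pending


nodes = [Node("node-a", 4), Node("node-b", 2)]
workloads = [
    Workload("nightly-report", "batch", priority=1, cpus=4),
    Workload("web-frontend", "stateless", priority=10, cpus=4),
    Workload("log-agent", "daemon", priority=5, cpus=2),
]
placements, pending = schedule(workloads, nodes)
print(placements)
print(pending)
```

In this run the high priority service and the daemon are placed first, and the low priority batch job is left pending until capacity frees up. A production scheduler would also weigh memory, constraints, and preemption, none of which this sketch attempts.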

Brian Grant is a principal engineer at Google. He joins the show to talk about his experience building workload schedulers and designing APIs for engineers to interface with those schedulers.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.