Netflix Genie with Tom Gianos
“Sometimes there’s a misconception that Genie is a job scheduling platform… Genie really represents our abstraction layer, from what our computational resources are, to our end user jobs.”
Genie is an open-source tool that provides job and resource management for the Hadoop ecosystem in the cloud.
Tom Gianos is a Senior Software Engineer at Netflix focusing on its big data platform. He is one of the core contributors in charge of maintaining and improving Genie.
Questions
- How fast is the pace of Netflix data collection increasing?
- What are the differences between running Hadoop in the cloud and running it in a data center?
- Why is job and resource management so important for a Hadoop ecosystem?
- What does Netflix do with spare computational resources?
- What is a bonus cluster?
- Does Netflix collect more data than it can practically make use of?
Links
- Genie
- YARN – Resource Scheduling Layer
- Hadoop Platform as a Service in the Cloud
- SE Daily Episode on Presto
- Spring Boot