Notebooks at Netflix with Matthew Seal
Podcast: Play in new window | Download
Subscribe: RSS
Netflix has petabytes of data and thousands of workloads running across that data every day. These workloads generate movie recommendations for users, create dashboards for data analysts to study, and reshape data in ETL jobs, to make it more accessible across the organization.
Over the last ten years, data engineering has become a key component of what makes Netflix successful. There are many different engineering roles who interact with the data infrastructure–including data analyst, machine learning scientist, analytics engineer, and software engineer.
Data engineering at Netflix has come a long way from the days of Hadoop MapReduce jobs running nightly, and generating reports of the most popular movies.
As data engineering and data science has grown, the tooling has expanded. The people in different data roles at Netflix might use Apache Spark, Presto, Python, Scala, SQL, and many other applications to study data–but in recent years, there is one tool that has stood out for its ability to be distinctly useful: Jupyter Notebooks.
A Jupyter Notebook lets users create and share documents that contain live code, visualizations, documentation, and many other types of components. In some ways, it is like a shareable IDE, that allows other people to see how you are working with your code and why you are making certain decisions. It is also a tool for building interactive, user-friendly applications–you can embed videos and images in a Jupyter notebook.
A Jupyter Notebook stores both the code and the results together in one place. By combining code with results in one document, you can have context around why a certain result came out the way it did.
Matthew Seal is a senior software engineer at Netflix, where he builds infrastructure and internal tools around Jupyter Notebooks. He joins the show to explain what problems Jupyter Notebooks solve for Netflix, and why they have quickly grown in popularity within the company.
Transcript
Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
Sponsors