Flyte: Lyft Data Processing Platform with Allyson Gale and Ketan Umare
Lyft is a ridesharing company that generates a high volume of data every day.
This data includes ride history, pricing information, mapping, routing, and financial transactions. The data is stored across a variety of databases, data lakes, and queueing systems, and is processed at scale to produce machine learning models, reports, and data applications.
Data workflows involve a set of interconnected systems such as Kubernetes, Spark, TensorFlow, and Flink. For these systems to work together harmoniously, a workflow manager is often used to orchestrate them. A workflow platform gives a data engineer a high-level view of how data moves through the system, and can be used to reason about retries, resource utilization, and scalability.
Flyte is a data processing system built and open-sourced at Lyft. Allyson Gale and Ketan Umare work at Lyft, and they join the show to talk about how Flyte works and why they needed to build a new workflow processing system when tools such as Airflow were already available.
Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.