Alluxio: Data Orchestration with Haoyuan Li

In 2013, the Berkeley AMPLab was a center of innovation. 

Three projects from AMPLab have turned into successful open source projects and companies: Spark, Mesos, and Alluxio. Haoyuan Li was the creator of Alluxio, and he returns to the show to discuss his journey taking Alluxio from a research project to a company that has customers including Alibaba, Baidu, Wells Fargo, and Samsung.

Alluxio is an open source data orchestration system. Alluxio allows application developers to think in terms of the latency that they require from their infrastructure rather than the details of different storage systems. Haoyuan discusses the process of integrating with gigantic companies like cloud providers, telecoms, and huge ecommerce companies.

Alluxio is also hosting an upcoming conference, the Data Orchestration Summit, on November 7th at the Computer History Museum in Mountain View, California.

Sponsorship inquiries:

Show Notes

The first open source Data Orchestration Summit takes place on November 7, 2019 at the Computer History Museum in Mountain View, CA. The summit brings together data engineers, data platform engineers, and data scientists to share their challenges and learnings from building and using modern analytics, AI, and cloud technologies. It features tech talks covering use cases, demos, best practices, and tutorials by industry experts from EA, Walmart, DBS Bank, Netflix, AWS, Rakuten, Tencent, Google, Baidu, Alibaba and more, with a focus on how to build cloud native analytics & AI platforms. Use code “SDE” to get a 50% discount.

Check out our active projects:

  • We are hiring a head of growth. If you like Software Engineering Daily and consider yourself competent in sales, marketing, and strategy, send me an email:
  • FindCollabs is a place to build open source software.
  • The SEDaily app for iOS and Android includes all 1000 of our old episodes, as well as related links, greatest hits, and topics. Subscribe for ad-free episodes.


Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.


MongoDB is the most popular document-based database built for modern application developers and the cloud era. Try MongoDB today with Atlas, the global cloud database service that runs on AWS, Azure, and Google Cloud. Configure, deploy, and connect to your database in just a few minutes. Check it out at

Triplebyte just launched their brand-new Machine Learning track! They’ll now be helping machine learning engineers find jobs in the same way that they’ve already helped generalist, front-end, and mobile engineers. See how you stack up against the industry. Go to

Meet Fastly, the edge cloud platform that powers the brands you love, like Spotify, The New York Times, and GrubHub. With Fastly, websites and apps are faster, safer, and way more scalable. Try it free at

Jaspersoft offers embeddable reports, dashboards, and data visualizations that developers love. Give users intuitive access to data in the ideal place for them to take action—within your application. To check out a sample application with embedded analytics, go to

Software Weekly

Subscribe to Software Weekly, a curated weekly newsletter featuring the best and newest from the software engineering community.