Apache Hudi with Vinoth Chandar

The data lake architecture has become broadly adopted in a relatively short period of time. In a nutshell, that means data in its raw format stored in cloud object storage. Modern software and data engineers have no shortage of options for accessing their data lake, but that list shrinks quickly if you care about features like transactions. Apache Hudi is a platform for building streaming data lakes that is optimized for lake engines and batch processing. In this episode, I interview Vinoth Chandar, creator of the Hudi Project and Founder and CEO at Onehouse.

Sponsorship inquiries: sponsor@softwareengineeringdaily.com

 

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com to get 15% off the first three months of audio editing and transcription services with code: SED. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript. 


Sponsors

Stack Overflow for Teams brings the power of Stack Overflow to your company. It’s an easy-to-use, flexible platform that helps thousands of developers answer questions and make progress in their work. Teams features robust search functionality, so you can easily benefit from the questions and answers documented on your team. Surface the most important information about onboarding, the development lifecycle, feature releases, and more. Stack Overflow for Teams saves users time and powers up the workday by clearing the obstacles caused by unanswered questions. Try it now and create a free team: https://stackoverflow.com/teams/sedaily

 

Stream provides an easy-to-integrate chat solution for any application. With robust SDKs and an API built for ease of use, scalability, reliability, and security, product teams can focus on what makes their app unique rather than spending months on building a chat infrastructure. Stream’s feature-rich products include robust client-side SDKs for Angular, iOS, iOS Swift/UI, Android, Compose, React, React Native, Flutter, and Unreal; support for the most commonly used server-side languages; scalable and secure APIs; and a beautiful UI kit. Check it out at https://getstream.io/

In a world full of applications, why do documents and spreadsheets still run the world? And why haven’t they been updated in over 50 years? Coda is a new kind of doc that brings words, data, and teams together. It comes with a set of building blocks that anyone can combine to create a doc as powerful as an app.

PIA encrypts and reroutes your internet traffic through one of its own servers, hiding your data from your internet service provider or network admin. And with servers in over 75 countries, you can get unrestricted access to geo-blocked content around the world. PIA comes with easy-to-use apps and browser extensions for all devices, a rock-solid privacy policy, open-source security, advanced customization settings, and it was just ranked the fastest VPN in the world by PCMag.

Go to https://privateinternetaccess.com/SEDaily and get 83% off your subscription and 4 extra months completely free, that’s $2 a month!

Hey Software Engineering Daily listeners, interested in learning more about Reverse ETL? Join a live recording of The Data Stack Show on March 9th to learn all about the tooling from the folks who are creating it. Leaders from Census, Hightouch, and Workato will join RudderStack’s Eric Dodds and Starburst Data’s Kostas Pardalis to discuss topics like “Why reverse ETL, Reverse ETL use cases, and the future of reverse ETL”. Visit datastackshow.com/live to register today.

Software Daily

Subscribe to Software Daily, a curated newsletter featuring the best and newest from the software engineering community.