Kudu with Todd Lipcon
“If you have an architecture where you’re periodically trying to dump from one system to the other and synchronize, you can simplify your life quite a bit by just putting your data in this storage system called Kudu.”
Kudu is an open-source storage engine for the Hadoop ecosystem. It combines advantages of both HDFS and HBase, allowing performant random-access queries while also providing fast writes and fast scans for analytics.
Todd Lipcon is a software engineer at Cloudera who leads the development of Kudu.
Questions
- What are some typical use cases for HDFS and HBase?
- What are the downsides to using binary formats like Apache Parquet?
- What is the difference between a database and a storage system?
- Why are replication and consensus so important to a Hadoop cluster?
- How does Kudu encourage better CPU efficiency?
- What do you see as the future of the Hadoop ecosystem?
Sponsors
Hired.com is the job marketplace for software engineers. Go to hired.com/softwareengineeringdaily to get a $600 bonus upon landing a job through Hired.
DigitalOcean is the simplest cloud hosting provider. Use promo code SEDAILY for $10 in free credit.