Kudu with Todd Lipcon

“If you have an architecture where you’re periodically trying to dump from one system to the other and synchronize, you can simplify your life quite a bit by just putting your data in this storage system called Kudu.”

Kudu is an open-source storage engine for the Hadoop ecosystem. It combines the advantages of HDFS and HBase, allowing for performant random-access queries while also providing fast writes and scans for analytics.

Todd Lipcon is a software engineer at Cloudera who leads the development of Kudu.

Questions

  • What are some typical use cases for HDFS and HBase?
  • What are the downsides to using binary formats like Apache Parquet?
  • What is the difference between a database and a storage system?
  • Why are replication and consensus so important to a Hadoop cluster?
  • How does Kudu encourage better CPU efficiency?
  • What do you see as the future of the Hadoop ecosystem?

Sponsors

Hired.com is the job marketplace for software engineers. Go to hired.com/softwareengineeringdaily to get a $2000 bonus upon landing a job through Hired.

DigitalOcean is the simplest cloud hosting provider. Use promo code SEDAILY for $10 in free credit.
