Data Warehouse with Christian Kleinerman

Podcast Monday, October 15, 2018


A data warehouse provides fast access to large data sets for analytics, data science, and dashboards. It differs from a transactional database in that you often do not need to update specific records. Because the access patterns are largely read-only and the volumes of data being queried are high, the design of a data warehouse is very different from that of a transactional database.

With a transactional database (such as MySQL or MongoDB), it is important to have consistency guarantees. For example, consider a transactional database that serves as the backend for banking applications. If multiple frontend servers are hitting that database to withdraw money, each withdrawal must be reflected in the records quickly. You need to avoid race conditions, so that two servers in different locations cannot both withdraw the entire account balance at the same time.
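The withdrawal race can be sketched in a few lines. This is only an illustration, assuming Python's built-in SQLite in place of a production database; the `accounts` table and the `withdraw` helper are hypothetical names, not anything discussed in the episode. The balance check and the debit happen in one conditional `UPDATE`, so a second withdrawal of the full balance finds nothing left to take.

```python
import sqlite3


def withdraw(conn, account_id, amount):
    """Try to withdraw `amount`; return True only if the balance covered it."""
    # `with conn` commits on success and rolls back if an exception is raised.
    with conn:
        cur = conn.execute(
            # Check and debit in a single statement, so there is no gap
            # between reading the balance and updating it.
            "UPDATE accounts SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount, account_id, amount),
        )
        return cur.rowcount == 1


# Hypothetical schema: one account holding 100 units.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
conn.commit()

# Two requests each try to take the entire balance.
first = withdraw(conn, 1, 100)
second = withdraw(conn, 1, 100)
balance = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]
print(first, second, balance)
```

Only one of the two requests succeeds and the balance never goes negative. A real banking backend would layer much more on top of this, but the single-record update is the access pattern a transactional database is built around.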

In contrast to transactional databases, a data warehouse is often used to process a query that encompasses a big data set. For example, Netflix might want to answer the question: “how many users who watched House of Cards also watched Black Mirror?” Netflix has a lot of users, so it will want to access those user records in a way that lets it scan through them quickly.
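To make the shape of such a question concrete, here is a minimal sketch, again assuming SQLite as a stand-in; the `views` table, its columns, and the sample rows are invented for illustration and say nothing about how Netflix or Snowflake actually store data. The query reads every matching row and aggregates, rather than looking up or updating a single record.

```python
import sqlite3

# Hypothetical viewing log: one row per (user, title) viewing event.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (user_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO views VALUES (?, ?)",
    [
        (1, "House of Cards"), (1, "Black Mirror"),
        (2, "House of Cards"),
        (3, "Black Mirror"),
        (4, "House of Cards"), (4, "Black Mirror"), (4, "Black Mirror"),
    ],
)

# Count the users who have at least one row for each of the two titles.
query = """
SELECT COUNT(*) FROM (
    SELECT user_id
    FROM views
    WHERE title IN ('House of Cards', 'Black Mirror')
    GROUP BY user_id
    HAVING COUNT(DISTINCT title) = 2
)
"""
both = conn.execute(query).fetchone()[0]
print(both)
```

On seven rows this is instant. The point is that the query touches the whole table and writes nothing, which is the workload a data warehouse is designed to serve at much larger scale.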

Christian Kleinerman is the VP of product at Snowflake Computing. Snowflake’s main product is a cloud data warehouse. In today’s show, we talk about the difference between a data warehouse, a data lake, and a transactional database, and the process of moving data sets between them, often known as ETL.

This show continues our series on data engineering and data platforms. As companies accumulate more and more data, the complexity of managing that data and taking full advantage of it is escalating. Christian gives his perspective on these changing trends, and describes how Snowflake plans to evolve as a business.

Show Notes

SE Daily

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.