DBT: Data Build Tool with Tristan Handy

Podcast Monday, March 9, 2020

A data warehouse provides low-latency queries over high volumes of data. It is often one stage of a data pipeline, which moves data through different areas of infrastructure in order to build applications such as machine learning models, dashboards, and reports.

Modern data pipelines are often associated with the term “ELT,” or Extract, Load, Transform. In the ELT workflow, data is taken out of a source such as a data lake, loaded into a data warehouse, and then transformed within the data warehouse to create materialized views on the data. Data warehouse queries are usually written in SQL, which has been the primary language for these kinds of queries for nearly 50 years.
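As a rough illustration of the ELT ordering described above, here is a minimal Python sketch. It assumes an in-memory SQLite database as a stand-in for a real data warehouse, and the table names, column names, and sample rows are all hypothetical. The point is only that the raw rows are loaded unchanged and the transformation happens afterward, in SQL, inside the database.

```python
import sqlite3

# Hypothetical records standing in for data extracted from a source system.
raw_orders = [
    (1, "alice", 1200),
    (2, "bob", 550),
    (3, "alice", 300),
]


def run_elt(rows):
    """Load raw rows as-is, then transform them with SQL inside the database."""
    conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse

    # Load: copy the extracted rows into a raw table without reshaping them.
    conn.execute(
        "CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)

    # Transform: build a derived table from the raw one, entirely in SQL.
    conn.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT customer, SUM(amount_cents) / 100.0 AS total_dollars
        FROM raw_orders
        GROUP BY customer
        """
    )
    return conn.execute(
        "SELECT customer, total_dollars FROM customer_totals ORDER BY customer"
    ).fetchall()


totals = run_elt(raw_orders)
print(totals)
```

In an ETL pipeline the aggregation would instead run before the load step, outside the warehouse.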

DBT is a system for data modeling that lets the user write queries in a mix of SQL and a templating language called Jinja. Jinja allows the analyst to blend imperative code, such as loops and conditionals, with declarative SQL. Tristan Handy is the CEO of Fishtown Analytics, the company that created DBT, and he joins the show to discuss how DBT works and the role it plays in modern data infrastructure.
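To make the idea of mixing imperative code with declarative SQL concrete, here is a small Python sketch. It is not DBT or Jinja; it uses an ordinary Python loop to generate repetitive SQL, which is roughly the job a templating loop does inside a templated query. The `payments` table, its columns, and the list of payment methods are hypothetical.

```python
# Hypothetical list of values to pivot on; in a templated query this list
# would typically be declared once and looped over inside the SQL file.
PAYMENT_METHODS = ["card", "bank_transfer", "gift_card"]


def build_pivot_sql(methods):
    """Generate one SUM(CASE ...) column per payment method."""
    columns = [
        f"SUM(CASE WHEN payment_method = '{m}' THEN amount ELSE 0 END) AS {m}_amount"
        for m in methods
    ]
    return (
        "SELECT\n    order_id,\n    "
        + ",\n    ".join(columns)
        + "\nFROM payments\nGROUP BY order_id"
    )


sql = build_pivot_sql(PAYMENT_METHODS)
print(sql)
```

Adding a payment method means changing one list entry rather than hand-writing another column expression, which is the kind of repetition a templating layer over SQL is meant to remove.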

Sponsorship inquiries: sponsor@softwareengineeringdaily.com

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.