Data Version Control with Dmitry Petrov

Code is version controlled through git, the version control system originally built to manage the Linux codebase. For well over a decade, software has been developed using git for version control. More recently, data engineering has become an unavoidable facet of software development. It is reasonable to ask: why are we not version controlling our data?

Dmitry Petrov is the founder of a company for collaborating on and version controlling data sets. Dmitry joins the show to talk about how data version control works and about the company he is building around dataset management and collaboration.

Transcript provided by We Edit Podcasts. Thanks to We Edit Podcasts for partnering with SE Daily.
