PlanetScale: Sharded Database Management with Jiten Vaidya and Dan Kozlowski

In the early days of YouTube, there were scalability problems with the MySQL database that hosted the data model for all of YouTube’s videos. The state-of-the-art solution to scaling MySQL at the time was known as “application-level sharding.”

To scale a database using application-level sharding, you break up the database into shards, which are disjoint regions of data. When you want to query the database, you need to know which shard to query. In your application code, you have to issue the query to a specific shard.
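
As a rough illustration of the routing described above, here is a minimal Python sketch. It is not YouTube's actual scheme: the shard count, the hash-based routing, and the names `shard_for`, `save_video`, and `load_video` are all assumptions made for the example, and each shard is modeled as an in-memory dict rather than a real MySQL instance.

```python
import hashlib

# Hypothetical setup: four shards, each modeled as an in-memory dict
# standing in for a separate MySQL instance.
NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(video_id: str) -> int:
    """Map a key to a shard index using a stable hash (an assumed scheme)."""
    digest = hashlib.md5(video_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def save_video(video_id: str, title: str) -> None:
    # The application, not the database, decides which shard gets the write.
    shards[shard_for(video_id)][video_id] = title


def load_video(video_id: str):
    # Reads must repeat the same routing logic to find the right shard.
    return shards[shard_for(video_id)].get(video_id)


save_video("abc123", "First upload")
print(load_video("abc123"))
```

Every read and write path has to call something like `shard_for` before it touches the database, so the routing logic ends up in each application that uses the data.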

The solution of application-level sharding does scale your database. But the downside is that every application that interfaces with the database now has to include code that is aware of the sharding schema.

If you are an application engineer, you don’t want to have to worry about the way that the database is sharded, because it adds significant complexity to your code. The engineers at YouTube decided to fix this problem with a project called Vitess. Vitess abstracts away the details of sharding by orchestrating reads and writes across the distributed database.

In a previous episode, we covered the architecture, read and write path, and the story of Vitess in detail. In today’s episode, Jiten Vaidya and Dan Kozlowski of PlanetScale Data join the show to give their perspective on MySQL scalability, and their work taking Vitess to market as a solution to scaling relational databases.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.


Sponsors

STELLARES is a job recommendation engine for software engineers. STELLARES uses its machine learning algorithms to factor in the subtle aspects of the job search, so that you find your perfect job — from salary to work-life balance to team fit and personal learning goals. To find out more about STELLARES, go to Stellares.ai/sedaily.

Datadog provides seamless integrations with more than 200 technologies, including AWS, Postgres, MySQL, and Docker, so you can start collecting and visualizing performance metrics quickly. See for yourself – start a 14-day free trial today and Datadog will send you a free T-shirt! softwareengineeringdaily.com/datadog

Deploy infrastructure faster; simplify life cycle maintenance for your servers; give IT the ability to deliver infrastructure to developers as a service like the public cloud. Go to softwareengineeringdaily.com/hpe and learn about how HPE OneView can improve your infrastructure operations.

GoCD is a continuous delivery tool created by ThoughtWorks. It’s great to see the continued progress on GoCD with the new Kubernetes integrations–and you can check it out for yourself at gocd.org/sedaily.
