Architecting Distributed Databases with Fangjin Yang


“The more you’re comfortable with this idea that everything is going to fail, the more you realize that it’s a natural process of distributed systems, and it helps you write and architect better code.”

Druid is a column-oriented distributed database that excels as a data warehousing solution for fast queries on large data sets.

Fangjin Yang is a core contributor to Druid and is currently co-founder and CEO of Imply, which helps build interactive analytics powered by Druid.


  • Why is it helpful for developers to acknowledge up front that “everything is going to fail”?
  • How often does a single server fail in cloud and distributed environments?
  • Can you describe a simple failure scenario where replication allows for recovery?
  • How do you implement fast recovery?
  • What is a hedged request?
  • What are some other strategies for minimizing variability among nodes?
  • What are some macroscopic lessons you learned about distributed systems while you were building Druid?
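
For listeners who want a concrete picture of the hedged-request question before the episode gets to it, here is a minimal sketch of the general technique, not Druid's implementation. The client sends a request to one replica, and if no reply arrives within a short delay it sends the same request to another replica and takes whichever answer comes back first. The names `hedged_request`, `fake_replica`, and `hedge_delay` are hypothetical, and the replicas are simulated with sleeps rather than real network calls.

```python
import asyncio


async def hedged_request(replicas, hedge_delay):
    """Return the first reply from a list of replica calls.

    `replicas` is a list of zero-argument coroutine functions (hypothetical
    stand-ins for real replica clients). Replicas are tried in order; a new
    one is started only if nothing has answered within `hedge_delay` seconds.
    """
    tasks = []
    try:
        for call in replicas:
            tasks.append(asyncio.create_task(call()))
            # Wait briefly for any in-flight request before hedging again.
            done, _ = await asyncio.wait(
                tasks, timeout=hedge_delay, return_when=asyncio.FIRST_COMPLETED
            )
            if done:
                return done.pop().result()
        # Every replica has been asked; wait for the first one to finish.
        done, _ = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        return done.pop().result()
    finally:
        # Cancel requests that are still running so they stop using resources.
        for task in tasks:
            task.cancel()


def fake_replica(name, latency):
    """Build a simulated replica that replies with its name after `latency` seconds."""
    async def call():
        await asyncio.sleep(latency)
        return name
    return call


def demo():
    # The first replica is slow, so the hedge to the second one wins.
    replicas = [fake_replica("slow", 0.5), fake_replica("fast", 0.01)]
    return asyncio.run(hedged_request(replicas, hedge_delay=0.05))
```

In this sketch the hedge only fires when the first replica is slower than `hedge_delay`, so the extra load stays small when replicas are healthy. A failed replica simply raises its error here; a real client would likely fall through to the next replica instead.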


Software Daily
