FiloDB with Evan Chan

“The world is becoming more and more interactive, and people want answers right away, so you’re seeing the rise of stream processing and real-time.”

Big data is yesterday, fast data is now. FiloDB is a reactive, columnar OLAP database built on Cassandra and Spark. Today’s guest is Evan Chan, creator of FiloDB.

In our discussion today, we talk about the use cases of an OLAP data store. Evan explains how to tackle the problem of video analytics: if you have ever wondered how a company like YouTube, Netflix, or Ooyala performs analytics on millions of users watching millions of videos, this episode is for you. By combining the database features of Cassandra with the data processing power of Spark, Evan created FiloDB to help solve this type of analytics problem. Evan will also be presenting at Strata + Hadoop World in San Jose. We’re partnering with O’Reilly to support this conference; if you want to go to Strata, you can save 20% on a ticket with our code PCSED.

Questions

  • What does your quote “big data is yesterday, fast data is now” mean?
  • Why is it hard to solve for big data and fast data at the same time?
  • Prior to your work on FiloDB, what were the options for building the type of OLAP system you were looking for?
  • Why is Spark a good companion for Cassandra?
  • How does FiloDB work?
  • What is a parquet-style layout, and why does this benefit FiloDB?
  • What technologies are used in the “no lambda” stack?
  • What is at the frontier of this problem of big data meets fast data?

Links

Software Daily

Subscribe to Software Daily, a curated newsletter featuring the best and newest from the software engineering community.