Apache Spark Creator Matei Zaharia Interview

 

Apache Spark is a fast and general engine for big data processing.

Matei Zaharia created Spark and is a co-founder of Databricks, a company that uses Spark to power data science.

Questions:

  • What was the motivation behind creating Spark?
  • How much faster is a Spark job than a Hadoop job?
  • What is the relationship between streaming and batch processing?
  • Is Spark’s core advantage over Storm and Samza its usability?
  • How useful is containerization to Spark?
  • What standard library features are being worked on?
  • What is the future of Spark?
