Databricks

Spark and Streaming with Matei Zaharia

Podcast Monday, February 26, 2018

Apache Spark is a system for processing large data sets in parallel. The core abstraction of Spark is the resilient distributed dataset (RDD), a working set of data that sits in memory.

Apache Spark Creator Matei Zaharia Interview

Podcast Monday, August 3, 2015

Apache Spark is a fast, general-purpose engine for big data processing. Matei Zaharia created Spark and is a co-founder of Databricks, a company that uses Spark to power data science.