Matei Zaharia

Sort by:

Spark and Streaming with Matei Zaharia

Apache Spark is a system for processing large data sets in parallel. The core abstraction of Spark is the resilient distributed dataset (RDD), a working set of data that sits in memory