Apache Spark is a framework for fast, distributed, in-memory analysis. Apache Cassandra is a distributed database management system that provides high availability and fast throughput. Today, we are collecting fast, big data streams from user behavior, smartphones, and sensors, and the disk checkpointing and query language of Hadoop MapReduce are no longer adequate.
Tim Berglund from DataStax came on Software Engineering Daily to explain Apache Cassandra in a popular episode a few weeks ago. In this episode, Tim returns to discuss how Spark and Cassandra can be used together to provide a stack with the analytics and storage we need for today’s distributed computing environment.