Materialize: Streaming SQL on Timely Data with Arjun Narayan and Frank McSherry

Distributed stream processing frameworks are used to rapidly ingest and aggregate large volumes of incoming data. These frameworks often require the application developer to write imperative logic describing how that data should be processed. 

For example, a high volume of clickstream data buffered in Kafka must be evaluated by a stream processing system before it is ready for a data warehouse, Spark, or some other queryable environment. In practice, many developers simply want that data to become queryable in as few steps as possible.

Materialize is a streaming SQL engine that provides materialized views over streaming data. The views are incrementally updated over time and reconciled with new data, including data that arrives out of order.
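To make the idea of an incrementally updated view concrete, here is a minimal Python sketch. It is not Materialize's implementation or API. It assumes a hypothetical view equivalent to a per-page `COUNT(*)` over clickstream events, where each update is a `(page, diff)` pair: `+1` for a new event and `-1` for a retraction. The class name `IncrementalCountView` and the helper `build_view` are invented for illustration.

```python
from collections import defaultdict


class IncrementalCountView:
    """Hypothetical stand-in for a view like:
    SELECT page, COUNT(*) FROM clicks GROUP BY page
    """

    def __init__(self):
        self.counts = defaultdict(int)

    def apply(self, page, diff):
        # Adjust only the affected key instead of rescanning all events.
        self.counts[page] += diff
        if self.counts[page] == 0:
            # Drop keys whose count has returned to zero.
            del self.counts[page]

    def snapshot(self):
        return dict(self.counts)


def build_view(updates):
    """Apply a sequence of (page, diff) updates and return the view contents."""
    view = IncrementalCountView()
    for page, diff in updates:
        view.apply(page, diff)
    return view.snapshot()


# Assumed sample data: three clicks and one retraction.
updates = [("/home", +1), ("/pricing", +1), ("/home", +1), ("/pricing", -1)]

in_order = build_view(updates)
out_of_order = build_view(list(reversed(updates)))

print(in_order)
print(out_of_order == in_order)
```

Because each update is a signed adjustment to a single key, the final contents of this toy view do not depend on the order in which updates are applied. Real systems face harder problems, such as deciding when a result for a given time is complete, which this sketch does not attempt to model.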

Arjun Narayan and Frank McSherry are the co-founders of Materialize, a company whose technology is based on the Naiad paper, which was written at Microsoft Research. Arjun and Frank join the show to talk about modern streaming systems and their strategy for taking an academic paper and productizing it.

Sponsorship inquiries: sponsor@softwareengineeringdaily.com

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.

