Materialize: Streaming SQL on Timely Data with Arjun Narayan and Frank McSherry
Distributed stream processing frameworks are used to rapidly ingest and aggregate large volumes of incoming data. These frameworks often require the application developer to write imperative logic describing how that data should be processed.
For example, a high volume of clickstream data that is getting buffered to Kafka needs to have a stream processing system evaluate that data to prepare it for a data warehouse, Spark, or some other queryable environment. In practice, many developers simply want to have the high volume of data become queryable in the fewest number of steps possible.
Materialize is a streaming SQL materialized view engine that provides materialized views over streaming data. The materialized views are incrementally updated over time and reconciled with new data that may have come in out of order.
Arjun Narayan and Frank McSherry are the co-founders of Materialize, a company whose technology is based on the Naiad paper, which was written at Microsoft Research. Arjun and Frank join the show to talk about modern streaming systems and their strategy for taking an academic paper and productizing it.
Sponsorship inquiries: firstname.lastname@example.org
Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.