Streaming with Holden Karau

By SE Daily

Podcast Tuesday, April 9 2019

Podcast: Play in new window | Download

Subscribe: RSS

RECENT UPDATES:

FindCollabs $5000 Hackathon Ends Saturday April 15th, 2019

New version of Software Daily, our app and ad-free subscription service

Software Daily is looking for help with Android engineering, QA, machine learning, and more

Distributed stream processing allows developers to build applications on top of large sets of data that are being rapidly created. Stream processing is often described as an alternative to batch processing. In batch processing, a single large computation is performed over a large, static data set. In stream processing, a computation is performed repeatedly and continuously over a data set that is being appended to.

A stream is often stored in a distributed queue such as Kafka, Kinesis, Pulsar, or Google PubSub. A stream is often processed with a stream processing tool such as Spark, Flink, Storm, or Google Cloud Dataflow.

Holden Karau is an engineer who works on open source projects at Google. She returns to the show to describe the state of stream processing and discuss modern best practices.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.