Apache Beam with Frances Perry

Unbounded data streams create difficult challenges for our application architectures. The data never stops coming, and we are forced to assume that we will never know if or when we have seen all of our data. Some streaming systems give us the tools to deal partially with unbounded data streams, but we have to complement those streaming systems with batch processing, in a technique known as the Lambda Architecture.
Apache Beam is a unified model for defining and executing data processing workflows, and Frances Perry joins the show to explain how Beam provides a way for us to model our data processing, agnostic of whether we choose to run those workflows on Spark, Flink, or Google’s Dataflow.

Links

Sponsors

alooma Alooma is your data pipeline as a service. Alooma is a fully managed tool for pulling from different data sources–MySQL, Postgres, elasticsearch, Salesforce, and many others. Go to alooma.com/sedaily for more information.
hired-logo Hired.com is the job marketplace for software engineers. Go to hired.com/softwareengineeringdaily to get a $2000 bonus upon landing a job through Hired.

Comments