Apache Beam with Frances Perry
Unbounded data streams pose hard problems for our application architectures. The data never stops arriving, so we have to assume that we may never know if or when we have seen all of it. Some streaming systems give us tools that handle unbounded data only partially, so we have to complement them with batch processing, a technique known as the Lambda Architecture.
Apache Beam is a unified model for defining and executing data processing workflows. Frances Perry joins the show to explain how Beam lets us model our data processing independently of whether we run those workflows on Spark, Flink, or Google’s Dataflow.
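To make the idea of a runner-agnostic model concrete, here is a minimal sketch in plain Python. It does not use the Apache Beam API; every name in it (`PIPELINE`, `run_batch`, `run_streaming`, and the two transforms) is hypothetical and chosen only for illustration. The point is that the processing logic is defined once and then handed to two different execution strategies, one that sees a bounded dataset up front and one that sees elements as they arrive.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator, List

# Hypothetical names throughout; this is NOT the Apache Beam API.
Transform = Callable[[Iterable[str]], Iterator[str]]


def split_words(lines: Iterable[str]) -> Iterator[str]:
    """Split each line into whitespace-separated words."""
    for line in lines:
        yield from line.split()


def lowercase(words: Iterable[str]) -> Iterator[str]:
    """Normalize each word to lowercase."""
    for word in words:
        yield word.lower()


# The pipeline is defined once, with no reference to how it will be executed.
PIPELINE: List[Transform] = [split_words, lowercase]


def apply_pipeline(elements: Iterable[str]) -> Iterable[str]:
    """Chain every transform in PIPELINE over the given elements."""
    for transform in PIPELINE:
        elements = transform(elements)
    return elements


def run_batch(dataset: List[str]) -> Counter:
    """Batch-style runner: the whole bounded dataset is available up front."""
    return Counter(apply_pipeline(dataset))


def run_streaming(source: Iterator[str]) -> Iterator[Counter]:
    """Streaming-style runner: emit updated counts after each arriving element."""
    counts: Counter = Counter()
    for line in source:
        counts.update(apply_pipeline([line]))
        yield Counter(counts)  # snapshot of the running result so far


lines = ["Beam runs on Spark", "beam runs on Flink"]
batch_result = run_batch(lines)
*_, streaming_result = run_streaming(iter(lines))  # keep only the last snapshot
```

Both runners execute the same `PIPELINE` and converge on the same counts for this input. The streaming runner produces intermediate results along the way, which is the kind of behavior a real engine has to reason about when the input never ends.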
- Apache Beam
- Streaming 101
- Streaming 102
- The Dataflow Model
- Google Cloud Dataflow
- Fundamentals of Stream Processing with Beam
- Mobile Gaming Example
- Dataflow: Beam and Spark Comparison