Data Engineering

Cloud Dataflow with Eric Anderson

Batch and stream processing systems have been evolving for the past decade. From MapReduce to Apache Storm to Dataflow, the best practices for large volume data processing have become

Apache Beam with Frances Perry

Unbounded data streams create difficult challenges for our application architectures. The data never stops coming, and we are forced to assume that we will never know if or when we have

Stream Processing at Uber with Danny Yuan

“Be aggressive in vision, but conservative in operation.” Uber is a transportation company with a high volume of temporal spatial data, constantly being collected from the devices of

Alluxio and Memory-centric Distributed Storage with Haoyuan Li

“It’s not really about removing disk from the picture per se – it’s more like saying, ‘how do we leverage more and more resources from DRAM?’” Memory is king. The cost of

Building Software for Millennials with Anthony Sessa

“Millennials deeply care about software, in the sense where if something doesn’t work as it should, it’s forgotten immediately – if you build an app and there are bugs, you’re