Uber’s Data Platform with Zhenxiao Luo

When a user takes a ride on Uber, the app on the user’s phone is communicating with Uber’s backend infrastructure, which is writing to a database that maintains the state of that

Dremio with Tomer Shiran

The MapReduce paper was published by Google in 2004. MapReduce is a programming model that describes how to do large-scale data processing on large clusters of commodity hardware. The MapReduce

Alluxio and Memory-centric Distributed Storage with Haoyuan Li

“It’s not really about removing disk from the picture per se – it’s more like saying, ‘how do we leverage more and more resources from DRAM?’ ” Memory is king. The cost of

Competition in the Open Source Ecosystem

From Eric Sammer’s answer via Quora: At Cloudera (company) we regularly work on open source code right alongside our competitors. I tend to joke that the engineers at our competitors

Hadoop: Past, Present and Future with Mike Cafarella

“HDFS is going to be a cockroach – I don’t think it’s ever going away.” Hadoop was created in 2003. In the early years, Hadoop provided large-scale data processing with MapReduce,