Hadoop

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

This article was originally written by Reza Shiftehfar on Uber’s Engineering Blog. Reposted with permission from Uber Engineering. Uber is committed to delivering safer and more

Uber’s Data Platform with Zhenxiao Luo

When a user takes a ride on Uber, the app on the user’s phone is communicating with Uber’s backend infrastructure, which is writing to a database that maintains the state of that

Dremio with Tomer Shiran

The MapReduce paper was published by Google in 2004. MapReduce is a programming model for large-scale data processing on large clusters of commodity hardware. The MapReduce

Alluxio and Memory-centric Distributed Storage with Haoyuan Li

“It’s not really about removing disk from the picture per se – it’s more like saying, ‘how do we leverage more and more resources from DRAM?’” Memory is king. The cost of

Competition in the Open Source Ecosystem

From Eric Sammer’s answer via Quora: At Cloudera we regularly work on open source code right alongside our competitors. I tend to joke that the engineers at our competitors