Cloudera
StreamSets: DataOps and Smart Pipelines with Arvind Prabhakar
The company StreamSets is enabling DataOps practices in today’s enterprises. StreamSets is a data engineering platform designed to help engineers design, deploy, and operate smart data
Federated Learning with Mike Lee Williams
Federated learning is machine learning without a centralized data source. Federated learning enables mobile phones or edge servers to collaboratively learn a shared prediction model
Competition in the Open Source Ecosystem
From Eric Sammer’s answer via Quora: At Cloudera (company) we regularly work on open source code right alongside our competitors. I tend to joke that the engineers at our competitors
Kudu with Todd Lipcon
“If you have an architecture where you’re periodically trying to dump from one system to the other and synchronize, you can simplify your life quite a bit by just putting
Replacing Hadoop with Joe Doliner
“There are a lot more people who have the problem that Hadoop solves than there are people using Hadoop.”
Pachyderm is a containerized data analytics platform that seeks to