Data

Lakehouse Data Stack with Raj Bains

As companies move to Spark and a Lakehouse architecture, they are realizing that the data tools are lagging way behind.  You need to be a programmer to effectively use Spark and

RudderStack Engineering with Soumyadeb Mitra

Customer data pipelines power the backend of many successful web platforms. In a customer data pipeline, data is collected from sources such as mobile apps and cloud SaaS tools,

Apache Hudi with Vinoth Chandar

The data lake architecture has become broadly adopted in a relatively short period of time. In a nutshell, that means data in its raw format stored in cloud object storage. Modern

Couchbase Architecture with Ravi Mayuram

Couchbase is a distributed NoSQL cloud database. Since its creation, Couchbase has expanded into edge computing, application services, and most recently a database-as-a-service called

Trifacta with Joe Hellerstein

If you haven’t encountered a data quality problem, then you haven’t yet worked on a large enough project.  Invariably, a gap exists between the state of raw data and what an analyst