Databricks
Databricks Unity Catalog with Zeashan Pappa
Data catalogs are one way to address the tension between wanting to use all the data for business advantage and needing to govern all the data for compliance. Today, Zeashan Pappa, a
Data Mechanics: Data Engineering with Jean-Yves Stephan
Apache Spark is a popular open-source analytics engine for large-scale data processing. Applications can be written in Java, Scala, Python, R, and SQL. These applications have flexible
Data Lakehouse with Michael Armbrust
A data warehouse is a system for performing fast queries on large amounts of data. A data lake is a system for storing high volumes of data in a format that is slow to access. A typical
Spark Geospatial Analytics with Ram Sriharsha
Phones are constantly tracking the location of a user in space. Devices like cars, smart watches, and drones are also picking up high volumes of location data. This location data is also
Spark and Streaming with Matei Zaharia
Apache Spark is a system for processing large data sets in parallel. The core abstraction of Spark is the resilient distributed dataset (RDD), a working set of data that sits in memory