Apache Arrow

Columnar Data: Apache Arrow and Parquet with Julien Le Dem and Jacques Nadeau

Podcast Friday, January 13, 2017

Column-oriented data storage allows us to access all of the entries in a database column quickly and efficiently. Columnar storage formats are mostly relevant today for performing large

Apache Arrow with Uwe Korn

Podcast Sunday, July 17, 2016

In a typical data analytics system, a variety of technologies interact: HDFS for storing files, Spark for distributed machine learning, pandas for data analysis in