large data sets

Modin: Pandas Scalability with Devin Petersohn

Pandas is a Python data analysis library, and an essential tool in data science. Pandas allows users to load large quantities of data into a data structure called a dataframe, over which

Uber’s Data Visualization Tools with Ib Green

Uber needs to visualize data on a range of different surfaces. A smartphone user sees cars moving around on a map as they wait for their ride to arrive. Data scientists and operations

LinkedIn Data Engineering with Kapil Surlaker

A large social network needs to develop systems for ingesting, storing, and processing large volumes of data. Data engineering at scale requires multiple engineering teams that are