Dask: Scalable Python with Matthew Rocklin

Python is the most widely used language for data science, and Python data scientists commonly rely on several libraries, including NumPy, pandas, and scikit-learn. These libraries improve the experience of a Python data scientist by providing high-level APIs.

Data science is often performed over huge datasets, and the data structures instantiated from those datasets may need to be spread across multiple machines. To manage large distributed datasets, a library such as scikit-learn can use a system called Dask. Dask provides data structures such as the Dask dataframe and the Dask array, which split the data into partitions (or chunks) that can be processed in parallel.
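The partitioning idea can be sketched in plain Python. This is an illustration of the concept only, not Dask's implementation: the helper names (`split_into_chunks`, `partial_sum_and_count`, `chunked_mean`) are hypothetical, and a local thread pool stands in for a cluster of machines. In Dask itself, the rough equivalent would be to wrap the data with `dask.array.from_array(x, chunks=...)` and call `.mean().compute()`.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_sum_and_count(chunk):
    """Per-chunk work: return the pieces needed to combine into a global mean."""
    return sum(chunk), len(chunk)


def chunked_mean(values, chunk_size):
    """Compute a mean by reducing each chunk independently, then combining."""
    chunks = split_into_chunks(values, chunk_size)
    # Each chunk is reduced on its own worker; the thread pool is only a
    # stand-in for the separate machines a distributed system would use.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_sum_and_count, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


data = list(range(1, 101))
result = chunked_mean(data, chunk_size=25)
print(result)
```

The point of the sketch is that no single worker needs the whole dataset: each one sees a chunk, and only the small partial results are combined at the end.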

Matthew Rocklin is the creator of Dask. He joins the show to talk about distributed computing with Dask, its use cases, and the Python ecosystem. He also provides a detailed comparison between Dask and Spark, another system used for distributed data science.

Sponsorship inquiries: sponsor@softwareengineeringdaily.com

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.


Sponsors

G2i is a hiring platform run by engineers that matches you with React, React Native, GraphQL, and mobile engineers who you can trust. Whether you are a new company building your first product or an established company that wants additional engineering help, G2i has the talent you need to accomplish your goals. Go to softwareengineeringdaily.com/g2i

F5 Cloud Services builds fast, reliable load balancing and DNS services. F5 Cloud Services provides global DNS infrastructure for lightning-fast access around the world. If you are looking for a scalable, high-quality DNS provider, visit f5.com/sedaily, and get a free trial of F5 Cloud Services.

X-Team is a company that can help you scale your team with new engineers. X-Team has thousands of proven developers in over 50 countries ready to join your team. X-Team is able and ready to support a full range of team/project needs. If your development team could use some firepower via some of the top engineering talent in the world, visit x-team.com/sedaily.

Vettery is an online hiring marketplace that connects highly qualified workers with top companies. Vettery keeps the quality of workers and companies on the platform high, because they vet both workers and companies. Check out vettery.com/sedaily, and get a $300 sign-up bonus if you accept a job through Vettery.

Software Weekly

Subscribe to Software Weekly, a curated weekly newsletter featuring the best and newest from the software engineering community.