Dask

OpenTeams Partner Spotlight E01: Coiled

In this webinar, we’ll dive into the challenges of distributed computation for organizations: parallel libraries, such as Dask, are only useful if you have both access to parallel hardware and the DevOps expertise to use it. This excludes many important communities.


How to learn Dask in 2020

In keeping with the guiding philosophy of OSS, Dask is a community-driven project, and the content in this post follows suit. The open-source curriculum below pulls from diverse resources, experts, and platforms to guide you in learning Dask in 2020 via the most straightforward path possible. Enjoy!

Creating a custom software environment with Coiled

Scalable Python Deployments as a Service

James Bourbeau, Dask maintainer and software engineer at Coiled, recently joined us for a Science Thursday session on “Scalable Python Deployments as a Service”. In this post, we summarize the key takeaways from the stream. We’ll cover: a brief overview of Dask, an introduction to Coiled and its offerings, spinning up a cluster on AWS …


The Future of Distributed Machine Learning

We recently chatted with Andy Müller, core developer of scikit-learn and Principal Research Software Development Engineer at Microsoft. Andy is one of the most influential minds in data science with a CV to match. He shares his thoughts on distributed machine learning with open-source tools like Dask-ML as well as proprietary tools from the big cloud providers. …


Florian Jetter, Sr Data Scientist at Blue Yonder, joins Matt Rocklin and Hugo Bowne-Anderson to discuss supply chain analytics at scale.

Data Processing at Blue Yonder

Florian Jetter, Sr Data Scientist at Blue Yonder, joins Hugo Bowne-Anderson and James Bourbeau to discuss supply chain analytics at scale. Blue Yonder provides software-as-a-service products for supply chain management. Along such a supply chain there are billions upon billions of decisions to be made: how much to order, when to ship products, how much …


A look at commuter data in Kansas City

Large Scale Machine Learning for Urban Planning

The Coiled team was recently joined by Brett Naul, founding engineer at Replica, to discuss large-scale machine learning and travel simulations for urban planning. During this session, we learned more about: interactive products for urban planning, building synthetic populations from large data sets like the US census, data engineering workflows with Dask and Prefect, …


This image promotes Science Thursday "Design Principles of Distributed Systems with Dask and PySpark" with Holden Karau, the Former (long story) Princess of the Covariance Matrix.

Design Principles of Distributed Systems

Holden Karau joins Matt Rocklin & Hugo Bowne-Anderson to discuss the design of Dask, how it compares to PySpark, and why these tradeoffs were chosen. There are many different distributed systems solving what at first glance might seem like “the same problem.” Here we’ll talk about the skeletons in the respective closets of our different …



Do we really need distributed machine learning?

We recently chatted with Andy Müller, core developer of scikit-learn and Principal Research Software Development Engineer at Microsoft. Andy is one of the most influential minds in data science with a CV to match. He shares his thoughts on distributed machine learning with open-source tools like Dask-ML as well as with proprietary tools from the …

