Blog

How to learn Dask in 2020

In keeping with the guiding philosophy of open-source software, Dask is a community-driven project, and the content in this post follows suit. The open-source curriculum below draws on a range of resources, experts, and platforms to guide you through learning Dask in 2020 by the most straightforward path possible. Enjoy!

Scalable Python Deployments as a Service

James Bourbeau, Dask maintainer and software engineer at Coiled, recently joined us for a Science Thursday session on “Scalable Python Deployments as a Service”. In this post, we summarize the key takeaways from the stream. We’ll cover: a brief overview of Dask, an introduction to Coiled and its offerings, spinning up a cluster on AWS, …

The Future of Distributed Machine Learning

We recently chatted with Andy Müller, core developer of scikit-learn and Principal Research Software Development Engineer at Microsoft. Andy is one of the most influential minds in data science with a CV to match. He shares his thoughts on distributed machine learning with open-source tools like Dask-ML as well as proprietary tools from the big cloud providers. …

Large Scale Machine Learning for Urban Planning

Brett Naul, founding engineer at Replica, recently joined the Coiled team to discuss large-scale machine learning and travel simulations for urban planning. During this session, we learned more about: interactive products for urban planning, building synthetic populations from large data sets like the US census, data engineering workflows with Dask and Prefect, …

Do we really need distributed machine learning?

We recently chatted with Andy Müller, core developer of scikit-learn and Principal Research Software Development Engineer at Microsoft. Andy is one of the most influential minds in data science with a CV to match. He shares his thoughts on distributed machine learning with open-source tools like Dask-ML as well as with proprietary tools from the …

Scalable Computing in Oceanography with Dask and xarray

Deepak Cherian is a physical oceanographer and project scientist at the National Center for Atmospheric Research. He recently joined us to discuss scalable computing in oceanography and how he leverages Dask, xarray (he’s a lead maintainer!), and terabyte-scale datasets to study the physics of oceans. In this post, we’ll summarize the key takeaways from the …

Interactive Image Processing at Scale with Napari

Nicholas Sofroniew, Imaging Tech Lead at Chan Zuckerberg Initiative, and Talley Lambert, Microscopist and Lecturer at Harvard Medical, recently joined us to chat about viewing and processing large datasets, with examples from the bio-imaging world. As developers of the napari package, they are experts in this area, and they showed us how to get the most out of it.

Large-Scale Machine Learning for Urban Planning

Brett Naul, founding engineer at Replica, joins Matt Rocklin and Hugo Bowne-Anderson to discuss large-scale machine learning and travel simulations for urban planning. Replica uses Dask to easily scale travel simulations to hundreds of millions of agents on Google Container Engine. The rich Python data science and statistical ecosystems make it easy to build new …
