Dask Heartbeat

James Bourbeau April 1, 2021

Dask Heartbeat by Coiled: 2021-04-01



Introduction

The Dask community is highly distributed, with different teams working independently. This is powerful, but it sometimes makes it hard for people within the community to see everything that is going on. The Dask Heartbeat by Coiled is a monthly publication intended to centralize and broadcast Dask news from the previous four weeks.

If you want something added to this list, either send an email to info@coiled.io or tweet and tag @dask_dev, and we’ll try to include it. Keep reading for the latest updates.

Dask Heartbeat

Dask Distributed Summit 

Thanks to everyone who submitted talk, tutorial, and workshop proposals for the Dask Distributed Summit. Stay tuned for an announcement about the conference schedule, and don’t forget to register for the summit.

Releases

Over the last month there have been several releases in the Dask ecosystem. In particular:

Dask monthly community meeting 

Some highlights from the April Dask community meeting:

Full meeting notes are available here.

Dask with PyTorch for large scale image analysis

Nicholas Sofroniew (CZI) and Genevieve Buckley recently published a blog post that explores applying a pre-trained PyTorch model in parallel with Dask Array.
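To give a rough idea of the pattern, here is a minimal sketch of applying a function to every chunk of a Dask Array with `map_blocks`. This is not the code from the blog post: `predict_block` is a hypothetical stand-in that thresholds each chunk where a real pipeline would call a pre-trained PyTorch model, and the array shape and chunk sizes are arbitrary.

```python
import numpy as np
import dask.array as da


def predict_block(block):
    """Hypothetical stand-in for model inference on one chunk.

    A real pipeline would run a pre-trained PyTorch model here; this
    placeholder just marks pixels brighter than the chunk's mean.
    """
    return (block > block.mean()).astype(np.uint8)


# Four random 64x64 "images", one image per chunk (sizes are arbitrary).
images = da.random.random((4, 64, 64), chunks=(1, 64, 64))

# map_blocks builds a lazy task graph that applies predict_block to each chunk.
labels = images.map_blocks(predict_block, dtype=np.uint8)

# compute() runs the chunks in parallel and returns a NumPy array.
result = labels.compute()
print(result.shape, result.dtype)
```

Because each chunk is processed independently, the same graph can run on a laptop or on a distributed cluster without changing the function.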

Dropped Python 3.6 support

Both Dask and Distributed dropped support for Python 3.6 in version 2021.03.1. This means that Dask and Distributed now support Python 3.7, 3.8, and 3.9.
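If your own project depends on a recent Dask release, one way to surface this early is a small interpreter check. The sketch below is only an illustration: `MIN_SUPPORTED` and `is_supported` are hypothetical names, and the (3, 7) floor is taken from the announcement above.

```python
import sys

# Assumption: Python 3.7 is the oldest supported release, per the
# announcement above (3.6 was dropped in version 2021.03.1).
MIN_SUPPORTED = (3, 7)


def is_supported(version_info=None):
    """Return True if the given (major, minor, ...) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_SUPPORTED


if __name__ == "__main__":
    if not is_supported():
        print("Python older than 3.7 is not supported by Dask 2021.03.1 or later.")
```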

Dask adds more dependencies

Dask recently added cloudpickle, partd, fsspec, and toolz as required dependencies. Previously these packages were optional and only needed for certain functionality in Dask (e.g. Dask delayed), but they were promoted to required dependencies to reduce project maintenance.

The Dask development team maintains commit rights on all of these projects. 
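To see which of these packages are already present in an environment, something like the following sketch may help. It uses only the standard library's `importlib.metadata` (available in Python 3.8 and later); the `REQUIRED` list and the `installed_versions` helper are hypothetical names introduced for this example.

```python
from importlib import metadata

# The four packages named in the announcement above.
REQUIRED = ["cloudpickle", "partd", "fsspec", "toolz"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'not installed'}")
```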

Main branch migration

The Dask community continues migrating the default branch name for its repositories to “main”. Recently the Dask and Distributed repositories were migrated. You can track progress here.

Measuring Dask memory usage with dask-memusage

Itamar Turner-Trauring wrote a blog post on measuring memory usage with Dask.

Google Summer of Code

Dask is joining this year’s Google Summer of Code under the NumFOCUS umbrella. If you’re interested in project ideas, see this Dask wiki page.

Image segmentation with Dask

In a recent blog post, Genevieve Buckley walks through how to create a basic image segmentation pipeline, using the Dask-Image library.

Distributed computing on GPUs with Dask

Jacob Tomlinson (NVIDIA) presented a talk on using Dask with RAPIDS as part of a BlazingSQL webinar series. You can see a recording of the webinar, as well as the notebook Jacob presented, here.

You’re All Caught Up On Dask

That’s all. Thanks for reading.

If you’re interested in taking Coiled Cloud for a spin, which provides hosted Dask clusters, docker-less managed software, and zero-click deployments, you can do so for free today when you click below.