Dask Heartbeat

James Bourbeau March 4, 2021

Dask Heartbeat by Coiled: 2021-03-04



Introduction

The Dask community is highly distributed, with different teams working independently. This is powerful, but it sometimes makes it hard for people within the community to see everything that is going on. The Dask Heartbeat by Coiled is a monthly publication intended to centralize and broadcast Dask news from the previous four weeks.

If you want something added to this list, either send an e-mail to info@coiled.io or tweet and tag @dask_dev, and we’ll try to include it. Keep reading for the latest updates.

Dask Heartbeat

Dask Distributed Summit 

The Dask community is hosting a remote conference from May 19-21! Details about how to attend and/or present are available at https://summit.dask.org. Matt Rocklin (Coiled) also wrote up some thoughts on Dask conferences in general.

Release 

Both Dask and Distributed version 2021.02.0 were released on February 5, 2021.

Dask monthly community meeting 

The March Dask community meeting was quite active. Some highlights from the meeting are:

  • Ben Zaitlen (NVIDIA) and Julia Signell (Saturn Cloud) were added as new Dask GitHub organization owners (thanks to both of them for all their hard work).
  • Tom Augspurger (Microsoft) demoed a new library, stac-vrt, for efficiently building a GDAL VRT from a STAC collection (a mosaic of a bunch of images).
  • Simon Perkins (South African Radio Astronomy Observatory) demonstrated how he and his team have been using Dask’s new annotation machinery to increase the performance of large array workloads by reducing dependency transfers between workers. We look forward to Simon’s blog post on this topic.
  • The Dask GitHub organization will start experimenting with Orbit to track community engagement.
  • The Dask community is thinking about hiring a community manager. This would be a semi-technical role to monitor the issue tracker, curate other systems like Slack/Discourse, engage with developers, organize blog posts, and more.

Full meeting notes are available here.

Dask Life Science Fellow

Genevieve Buckley recently started as Dask’s Life Science Fellow, a role in which she will work on improving Dask for the life sciences. She also published a blog post about getting to know the life science community.

Python 3.9 support

Guido Imperiale (Coiled) added Python 3.9 support to both Dask and Distributed.

Accelerating the Dask scheduler livestream

Matt Rocklin (Coiled) gave a talk about ongoing work to accelerate the Dask scheduler. You can see a recording of his talk by clicking below.

NEP-35 support

Peter Entschev (NVIDIA) recently added NEP-35 support to Dask.

Dask and yt

Check out this blog post by Chris Havlin, which provides an update and proposal on Dask and yt development.

Graph manipulation methods

Guido Imperiale (Coiled) recently added a collection of advanced graph manipulation methods to Dask. See the docs to learn more.
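To give a feel for what these methods are for, here is a minimal sketch using `bind` from `dask.graph_manipulation`, which makes one collection wait for another even though no data flows between them. The `write_file` and `read_file` functions and the `log` list are hypothetical stand-ins invented for illustration, and the synchronous scheduler is chosen only to keep the example simple. The docs remain the authoritative reference for these methods.

```python
import dask
from dask.graph_manipulation import bind

# Records the order in which the two tasks actually run.
log = []


@dask.delayed
def write_file():
    # Hypothetical task with a side effect and no useful return value.
    log.append("write")


@dask.delayed
def read_file():
    # Hypothetical task that should only run once write_file has finished.
    log.append("read")
    return "contents"


writer = write_file()
reader = read_file()

# bind returns a new collection equivalent to `reader` that additionally
# depends on `writer`, so the writer is computed first.
ordered_reader = bind(reader, writer)

result = ordered_reader.compute(scheduler="synchronous")
```

Computing `reader` on its own would never run `write_file`, since the two tasks share no data dependency. The bound version runs both, with the write first.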

High throughput computing (HTC) with Dask workshop

CECAM organized an HTC with Dask workshop this February, which included several seminars and tutorials covering topics in the Dask HPC ecosystem. You can learn more about the workshop program (including video recordings) here.

Micro-optimizing and refactoring the Distributed scheduler

John Kirkham (NVIDIA) has continued making micro-optimizations to the scheduler as part of a larger effort to boost performance.

You’re All Caught Up On Dask

That’s it. Thanks for reading!

If you’re interested in taking Coiled Cloud for a spin, which provides hosted Dask clusters, docker-less managed software, and zero-click deployments, you can do so for free today when you click below.