Dask Heartbeat by Coiled: November 2021

The Coiled Team November 15, 2021



Introduction

The Dask community is highly distributed, with different teams working independently. This is powerful, but it sometimes makes it hard for people within the community to see everything that is going on. The Dask Heartbeat by Coiled is a monthly publication intended to centralize and broadcast Dask news from the previous month.

If you want something added to this list, either send an email to info@coiled.io, or tweet and tag @dask_dev, and we’ll try to include it. Keep reading for the latest updates.


Dask Discourse Community Forum

Dask has a new community forum at dask.discourse.group!

Ian Rose set this up with help and input from many Dask contributors. It is a space for the entire Dask community of users, contributors, and enthusiasts to participate in discussions, ask and answer questions, share interesting resources, and showcase their work. Be sure to check it out and introduce yourself!

dask.discourse.group

Worker State Machine Refactor

Florian Jetter worked on refactoring the Worker State Machine, the pipeline that dictates how a task (and its states, such as waiting, ready, and executing) is handled by the Dask workers. This refactor has been crucial in helping to solve stability problems around deadlocked or stuck clusters.
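To make the idea of a task state machine concrete, here is a deliberately simplified sketch. It is not Dask's implementation: the transition table, the `Task` class, and the `done` state are hypothetical, and only the `waiting`, `ready`, and `executing` state names come from the description above.

```python
from enum import Enum


class State(Enum):
    WAITING = "waiting"
    READY = "ready"
    EXECUTING = "executing"
    DONE = "done"  # hypothetical terminal state, added for illustration


# Hypothetical transition table: each state maps to the states it may move to.
ALLOWED = {
    State.WAITING: {State.READY},
    State.READY: {State.EXECUTING},
    State.EXECUTING: {State.DONE},
    State.DONE: set(),
}


class Task:
    """A toy task that only permits transitions listed in ALLOWED."""

    def __init__(self, key):
        self.key = key
        self.state = State.WAITING

    def transition(self, new_state):
        # Rejecting unknown transitions is what keeps a state machine
        # from silently ending up in an inconsistent state.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.key}: illegal transition "
                f"{self.state.value} -> {new_state.value}"
            )
        self.state = new_state


task = Task("x")
task.transition(State.READY)
task.transition(State.EXECUTING)
print(task.key, task.state.value)
```

Making every allowed transition explicit, as in this toy table, is one general way to surface illegal moves as errors instead of letting a task sit in an unexpected state.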

Such problems can be difficult to debug and hard to reproduce, so the team also worked on a way for users to create a snapshot of the cluster state when a cluster freezes. Instructions on how to use this will soon be provided in an updated issue template.

This is a part of a broader effort to investigate and improve the stability of the Dask Distributed scheduler.

Dask for Life Sciences

Genevieve Buckley has been working as the Dask Life Science Fellow since early 2021. As a part of this, she has helped improve and maintain Dask, with a special focus on life-science applications. Genevieve has also led various outreach activities, including organizing Dask workshops, mentoring a GSoC student, and writing community blog posts. You can learn more in the CZI OSS Update.

Genevieve’s work draws to a close in December, and we’d like to thank her for all her contributions to the Dask community. 🙂

Image: Genevieve presenting ‘Scaling Science leveraging Dask for life sciences’ at SciPy 2021. The slide shows the contents of the presentation: a tour of scientific computing libraries using Dask, and tips to integrate Dask into existing codebases.

Update on AMM

Guido Imperiale has been working on an Active Memory Manager for the past few months:

“The Active Memory Manager, or AMM, is an experimental daemon that optimizes memory usage of workers across the Dask cluster.” ~ Dask Distributed documentation

With dask/distributed version 2021.10.0 and above, you can enable the active memory manager in your Dask configuration file. Learn more about AMM, its policies, and how to enable it in the high-level documentation. Guido will also be publishing a blog post about AMM soon!
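As a rough sketch of what enabling AMM through configuration might look like, the snippet below sets the relevant options from Python with `dask.config.set` instead of editing the YAML file. The key names (`distributed.scheduler.active-memory-manager.start` and `.interval`) and the `2s` value are assumptions based on the Distributed AMM documentation. AMM is experimental, so check the documentation for the version you have installed.

```python
import dask

# Assumed configuration keys, based on the Distributed AMM documentation;
# verify them against your installed version before relying on them.
amm_settings = {
    "distributed.scheduler.active-memory-manager.start": True,
    "distributed.scheduler.active-memory-manager.interval": "2s",
}

# Apply the settings to the in-process Dask configuration. A scheduler
# created afterwards in this process would read these values at startup.
dask.config.set(amm_settings)

print(dask.config.get("distributed.scheduler.active-memory-manager.start"))
```

The same nested keys can be written in a Dask YAML configuration file, which is the route described above. Setting them in code is mainly convenient for quick experiments.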

Documentation Updates

The Dask documentation is continuously updated. Here are some highlights from October:

Stale Issues and PRs Sprint

Dask contributors held a sprint in early October to devote attention to some long-standing issues and pull requests across multiple Dask repositories on GitHub. They worked to triage, manage, and close over a hundred issues and PRs where discussions seemed to have stalled.

Releases

Over the month of October, both Dask and Distributed versions 2021.10.0 were released.

Dask typically strives for a two-week release window, but a few releases needed to be postponed in September and October. This was due to some reported stability regressions in earlier versions, and the team wanted to ensure the stability of new releases. For more information, see 2021.9.1 and 2021.10.0.

The team is still observing issues, but they hope these will affect only a small number of Dask users. If you are facing any problems, please reach out on the Dask issue tracker.

Dask Monthly Community Meeting 

Some highlights from the November Dask community meeting:

Full meeting notes are available here.

You’re All Caught Up On Dask

That’s it. Thanks for reading.

If you’re interested in taking Coiled Cloud for a spin, which provides hosted Dask clusters, docker-less managed software, and one-click deployments, you can do so for free today when you click below.

Try Coiled Cloud
