Dask Heartbeat by Coiled: August 2021
August 26, 2021
The Dask community is highly distributed, with different teams working independently. This is powerful, but it can make it hard for people within the community to see everything that is going on. The Dask Heartbeat by Coiled is a monthly publication that gathers and broadcasts Dask news from the previous month.
- In the Dask JupyterLab Extension, all plots are now ordered alphabetically.
- Individual plots can now be accessed using a dropdown menu in the Diagnostic Dashboard.
- The issue with jitter/flickers in the ‘Worker Memory’ plot has been fixed.
- The ‘Status’ page in the Diagnostic Dashboard uses tabs to show ‘CPU Utilization’ and ‘Occupancy’, alongside ‘Task Processing’.
Freyam Mehta, Genevieve Buckley, Jacob Tomlinson, and others are doing exciting work around making task scheduling faster using high-level graphs. You can read more about the overall objectives in Faster Scheduling. As Genevieve writes in High Level Graphs update, there is ongoing work to use a Blockwise high-level graph layer wherever possible, investigate a high-level graph for Dask’s `map_overlap`, and visualize high-level graphs in Jupyter Notebooks.
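For readers who have not looked at a high-level graph before, here is a minimal sketch of how one can be inspected. It assumes a Dask version in which array collections are backed by `HighLevelGraph`; the array shape, chunk sizes, and variable names are illustrative choices, and the layer names and types printed will vary between versions.

```python
import dask.array as da
from dask.highlevelgraph import HighLevelGraph

# A small chunked array and a short chain of operations (illustrative sizes).
x = da.ones((8, 8), chunks=(4, 4))
y = (x + 1).sum()

# The task graph behind the collection. In the Dask versions assumed here
# it is a HighLevelGraph, a mapping of named layers rather than a flat
# dictionary of tasks.
graph = y.__dask_graph__()
is_hlg = isinstance(graph, HighLevelGraph)

# Each layer corresponds to one logical operation (array creation, the
# elementwise add, the stages of the reduction). The concrete layer classes
# depend on the installed Dask version.
layer_types = {name: type(layer).__name__ for name, layer in graph.layers.items()}
for name, kind in layer_types.items():
    print(f"{kind:>24}  {name}")

# Computing the result materializes the low-level tasks as usual.
total = y.compute()
```

Keeping operations as layers like this is what lets the scheduler-side work described above reason about whole operations instead of individual tasks.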
NumPy histogramming API in dask.array
Doug Davis helped add support for Dask Array equivalents of NumPy’s `histogram2d` and `histogramdd` functions. This feature is available in Dask version 2021.07.1 and above.
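As a rough illustration of the new function, the sketch below bins two chunked arrays with `da.histogram2d` and compares the counts against NumPy. It assumes Dask 2021.07.1 or later; the sample data, chunk sizes, bin counts, and ranges are made up for the example. An explicit `range` is passed because the bins are given as integers and the inputs are lazy.

```python
import numpy as np
import dask.array as da

# Illustrative sample data: 1000 points drawn uniformly from the unit square.
rng = np.random.default_rng(0)
x_np = rng.uniform(0, 1, size=1000)
y_np = rng.uniform(0, 1, size=1000)

# Wrap the data as Dask arrays with matching chunks.
x = da.from_array(x_np, chunks=250)
y = da.from_array(y_np, chunks=250)

bins = (5, 4)
ranges = ((0, 1), (0, 1))

# Lazy 2D histogram: returns the counts and the bin edges for each axis.
h, xedges, yedges = da.histogram2d(x, y, bins=bins, range=ranges)
counts = h.compute()

# Reference result from NumPy on the same data.
expected, _, _ = np.histogram2d(x_np, y_np, bins=bins, range=ranges)
matches_numpy = np.array_equal(counts, expected)
```

`da.histogramdd` follows the same pattern for more than two dimensions, returning the counts together with a list of edges.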
Ongoing Improvements to Memory Management and Scheduling
Guido Imperiale has continued working on active memory management. As of version 2021.07.2, the `MALLOC_TRIM_THRESHOLD_` environment variable is set automatically on workers. Gabe Joseph from Coiled has also continued improving Dask’s memory scheduling by short-circuiting root-ish checks for some group dependencies.
During July, versions 2021.07.0, 2021.07.1, and 2021.07.2 of both Dask and Distributed were released.
Dask Monthly Community Meeting
Some highlights from the July Dask community meeting:
- RAPIDS cuDF and cuML version 21.08.00 was released in early August.
- Julia Signell and Jacob Tomlinson are working to improve and reorganize the Dask documentation. You can follow the discussion in this community issue.
Full meeting notes are available here.
You’re All Caught Up On Dask
That’s it. Thanks for reading.
If you’re interested in taking Coiled Cloud for a spin, it provides hosted Dask clusters, docker-less managed software, and one-click deployments, and you can try it for free today.