Dask Heartbeat by Coiled: 2020-11-16

Introduction

The Dask community is highly distributed, with different teams working independently. This is powerful, but it sometimes makes it hard for people within the community to see everything that is going on. The Dask Heartbeat by Coiled is intended to centralize and broadcast Dask news from the previous week.

If you want something added to this list, either send an e-mail to info@coiled.io, or tweet and tag @dask_dev, and we’ll try to include it.

CI Overhaul

With Travis CI ending unlimited free accounts for OSS projects, many of the Dask projects are switching to GitHub Actions. Thank you to Jacob Tomlinson (NVIDIA) for leading this effort, and to others like Thomas Fan and James Bourbeau for helping out. See this issue for progress on migrating Dask projects to GitHub Actions.

Dask-Cloudprovider gets GCP, Digital Ocean support

This work was recently done by Jacob Tomlinson (NVIDIA) and Ben Zaitlen (NVIDIA) and provides an easy way to deploy Dask on those platforms without setting up Kubernetes (or anything really).

See https://cloudprovider.dask.org/en/latest/gcp.html

PyData Global 2020 Talks

GPU Accelerated Deconvolution Blogpost

This blogpost by John Kirkham (NVIDIA) and Ben Zaitlen (NVIDIA) shows how to use GPU-powered deconvolution on image data.

Dask-SQL Release

Version 0.2.0 was released by Nils Braun (Bosch). It includes an improved SQL server and the first steps towards unifying the API with BlazingSQL (along with the usual slew of expanded support).

Additionally, there is a new binder for Dask-SQL made by Ray Bell (Royal Caribbean).

New dask.annotate Function

Thanks to Simon Perkins (South African Radio Astronomy), there is now a new dask.annotate context manager:

import dask
import dask.dataframe as dd

with dask.annotate(priority=1):
    df = dd.read_parquet(...)

This isn’t yet plugged into the distributed scheduler, but it is a great first step towards making annotations like priorities, worker restrictions, resource restrictions, retries, and other attributes much easier to specify on Dask collections.
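As a rough illustration of where these annotations end up, the sketch below builds a small Dask array inside the context manager and prints the annotations recorded on each layer of its high-level graph. It assumes a Dask version that includes dask.annotate. The array shapes, chunk sizes, and the hypothetical layer_annotations helper are choices made for this example, not something taken from the pull request.

```python
import dask
import dask.array as da

# Collections created inside the block pick up the annotation.
with dask.annotate(priority=1):
    x = da.ones((10, 10), chunks=(5, 5))

# Collections created outside the block are not annotated.
y = da.zeros((10, 10), chunks=(5, 5))


def layer_annotations(collection):
    """Hypothetical helper: map each graph layer name to its annotations."""
    return {
        name: layer.annotations
        for name, layer in collection.dask.layers.items()
    }


annotated = layer_annotations(x)
plain = layer_annotations(y)
print(annotated)
print(plain)
```

The annotation lives on the graph layers rather than on the array object itself, which is what should let a scheduler act on it later.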

See https://github.com/dask/dask/pull/6806 

High Level Graph Rewrite

There is a long-running effort from engineers at NVIDIA, Coiled, and Capital One to move High Level Graphs directly to the scheduler. This partially works today with the development version of Dask for DataFrames, resulting in relatively fast graph submission and reduced graph communication time. We’ve moved on to benchmarking and profiling.
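To give a feel for why sending high-level graphs could be cheaper than sending fully expanded task graphs, the following sketch compares the number of layers in a small array computation with the number of individual tasks those layers expand to. The array sizes are arbitrary and chosen only for illustration. The snippet does not exercise the scheduler-side work described above, only the existing HighLevelGraph structure on the client.

```python
import dask.array as da

# 100 chunks of 100x100 elements each.
x = da.ones((1000, 1000), chunks=(100, 100))
y = (x + 1).sum()

hlg = y.dask                # HighLevelGraph: a handful of named layers
n_layers = len(hlg.layers)
n_tasks = len(dict(hlg))    # materialize every low-level task

print(f"{n_layers} layers expand to {n_tasks} tasks")
```

The gap between the two counts grows with the number of chunks, while the number of layers depends only on the operations applied.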

SVD Performance and Precision Improvements

Roger Moens at Delft University, who has been going over the SVD and approximate SVD algorithms, has noted several performance and correctness improvements, and has started work here: 

Thanks also to Eric Czech (Related) for providing careful review.
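For readers who have not used these routines, here is a small sketch of the two entry points in dask.array.linalg: svd for an exact decomposition of a tall-and-skinny array, and svd_compressed for the randomized approximate version. The matrix size, chunking, rank k, and seed are arbitrary illustration values, and the snippet says nothing about the specific improvements under discussion.

```python
import dask
import dask.array as da
import numpy as np

# Tall-and-skinny input: several row chunks, a single column chunk.
x = da.random.random((1000, 20), chunks=(250, 20))

# Exact SVD.
u, s, v = da.linalg.svd(x)

# Approximate (randomized) SVD keeping the top k singular values;
# extra power iterations trade compute time for accuracy.
k = 5
u_k, s_k, v_k = da.linalg.svd_compressed(x, k=k, n_power_iter=2, seed=0)

# Check that the exact factors reconstruct the input.
x_np, u_np, s_np, v_np = dask.compute(x, u, s, v)
reconstruction_error = np.abs((u_np * s_np) @ v_np - x_np).max()
s_k_np = s_k.compute()

print("max reconstruction error:", reconstruction_error)
print("approximate top singular values:", s_k_np)
```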

Behind the Code of Dask and pandas: Q&A with Tom Augspurger

Anaconda ran an interview with Dask core maintainer Tom Augspurger. You can read more here.

clEsperanto Adopts Dask

The image processing library clEsperanto has added introductory support for Dask arrays:

Dask on ARM on K8s Blogpost

Holden Karau recently published a blog post on building a Dask cluster on a cluster of Raspberry Pis running Kubernetes: https://scalingpythonml.com/2020/11/03/a-first-look-at-dask-on-arm-on-k8s.html 

Dask-Gateway 0.9.0 Release

Dask Gateway version 0.9.0 was released by Jim Crist-Harif (Prefect).  This release unifies the use of normal dask-worker/dask-scheduler executables, allowing for greater composability, especially with projects like Dask-CUDA.  It also increases the set of Helm configurations, along with the standard set of bugfixes.

https://gateway.dask.org/

Dask in HPC Workshop Announcement (EU timezones)

There is a proposed “Dask in HPC” workshop announcement here:  https://github.com/dask/community/issues/110

This follows a similar event held last year in Turin. The workshop is being organized by David Swenson (ENS Lyon).

RAPIDS + Prefect + Dask Blogpost

Ayush Dattagupta from RAPIDS published a blog post showing how to use these three tools together: https://medium.com/rapids-ai/scheduling-optimizing-rapids-workflows-with-dask-and-prefect-6fc26d011bf

Finally, a Homage to the US election

Finally, on a light-hearted note, Oriana Chegwidden (CarbonPlan) made this lovely image for those who were anxiously watching the election results:

Wrapping Up

That’s it. Thanks for reading!

If you’re interested in taking Coiled Cloud for a spin, you can do so for free today when you click below.
