Scaling Out: Effective Cluster Computing with Distributed Dask

This class addresses the transition from working on a single server, or experimenting with a minimal cluster, to reliable, repeatable use of larger Dask compute clusters. We take a deep dive into the critical components of a distributed Dask cluster, how they work together, and how you can configure them to maximize throughput and minimize costs.

Learn Ideas and Gain Skills

Duration: one day

Prerequisites

Topics

Introduction

Distributed Dask: Cast of Characters

Basic Operation of Dask Clusters

Tasks

Distributed Data

Resource Usage and Resilience

Best Practices and Debugging

Use Case Example: Orchestrating Batch ML Scoring

Q&A, Discussion