Dask Dashboard and Diagnostics

This module covers how to read and use the graphical reports provided by the Dask dashboard, from real-time data transfer information to a full statistical profiler. For each dashboard panel, we discuss what information it presents and how to use that information to improve performance or fix problems in your code.

Learn Ideas and Gain Skills

  • What information is presented in Dask’s graphical dashboards
  • How to convert statistical information from dashboards into actionable understanding of your workload

Prerequisites

  • Python, basic level
  • Dask programming, basic level

Topics

Introduction

  • General challenges of distributed computing
  • Practical obstacles in common applied scenarios
  • Questions to ask when your code runs slowly or throws errors
  • Locating/activating the dashboard web pages or JupyterLab widgets
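
As a preview of the last point above, here is a minimal sketch of locating the dashboard from Python. It assumes the `distributed` package is installed (plus `bokeh` for the pages to render fully). The small in-process cluster and the variable name `dashboard_url` are illustrative choices, not part of the course material.

```python
from dask.distributed import Client, LocalCluster

# Start a small in-process cluster for demonstration purposes.
# dashboard_address=":0" asks for any free port instead of the default 8787.
cluster = LocalCluster(
    n_workers=2,
    threads_per_worker=1,
    processes=False,
    dashboard_address=":0",
)
client = Client(cluster)

# The client reports the URL of the dashboard served by the scheduler.
dashboard_url = client.dashboard_link
print(dashboard_url)

# Shut everything down once finished.
client.close()
cluster.close()
```

The printed URL can be opened in a browser. If the Dask JupyterLab extension is installed, the same address can be entered there to open individual panels inside JupyterLab.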

Dask Dashboard Reports

  • Resource Utilization
    • Workers
    • CPU
    • Memory (total and by key)
    • Bandwidth (by worker and type)
  • Cluster Map
  • Tasks Processing
  • Progress Bar
  • Task Stream
  • Task Graph
  • Profiler

Applying Dashboards to Investigate a Scenario

  • Workflow overview and goal
  • Sample code
  • Investigating failures
  • Investigating speed/slowdown

Review and Q & A

  • Common patterns
  • Best practices