Accelerated Data Processing with GPU Computation and Dask

This class introduces tools for GPU-accelerated computation, including CuPy, RAPIDS, and Numba, and shows how to integrate them with Dask to work with large datasets and multiple GPUs.

Learn Ideas and Gain Skills

  • What Dask offers, and does not offer, for machine learning workflows
  • Leveraging Dask for proper out-of-core and/or parallel training
  • Implementing an end-to-end workflow with Dask and other tools


Prerequisites

  • Python, basic level
  • Understanding of ML concepts and workflow, basic level
  • Dask programming, basic level



  • Lessons from big data tools
  • NumPy and vectorized compute
  • Native code generation
  • Adding hardware acceleration

Patterns for High-Performance Python

  • Understanding Python limitations and escape hatches
  • Concurrency issues and patterns: multithreading, GIL, multiprocess, multinode
  • JIT compilation with Numba
  • PyTorch and CuPy: not just for deep learning
  • Leveraging DL libraries for “regular” ML


  • Data catalogs and using SQL on the GPU
  • Working with cuDF: CUDA DataFrame
  • How GPU programming works under the hood
  • Numba and Python-to-CUDA compilation
  • cuDF + Numba: custom computation from Python
  • CUDA machine learning with cuML
  • Interop with PyTorch for deep learning and general optimization
  • cuGraph: high-level CUDA-enabled graph operations

RAPIDS and Dask

  • Multi-GPU and multi-node scaling
  • Combining Dask + GPU
  • ML, dimensionality reduction, and graph algorithms on GPU

Architecture and Problem Solving

  • Extending the end-to-end workflow
  • Accelerated networking
  • Visualizations
  • Architecture, patterns, integration
  • Q & A