Processing Unstructured Data with Dask Bag

This class module focuses on Dask Bag, a collection with a functional-programming interface for parallel and distributed computation over unstructured or heterogeneous data.

Dask Bag is well suited to the initial processing of unstructured text, large collections of heterogeneous business records that require special handling, and images or diagrams. The class covers the functional style, the Bag API, and best practices.
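As a rough illustration of the functional style the class covers, the sketch below parses a handful of heterogeneous log lines with a Bag and counts actions. The sample records, the field names (user, action, ms), and the parse helper are invented for illustration and are not taken from the course materials. The synchronous scheduler is chosen here only to keep the example small and deterministic.

```python
import json

import dask.bag as db

# Hypothetical raw input: JSON lines with inconsistent fields and one bad line.
raw_lines = [
    '{"user": "alice", "action": "login", "ms": 120}',
    "not valid json",
    '{"user": "bob", "action": "login"}',
    '{"user": "alice", "action": "purchase", "ms": 340}',
]


def parse(line):
    """Return the decoded record, or None if the line is not valid JSON."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return None


# Build a Bag, then chain lazy map/filter steps in a functional style.
bag = db.from_sequence(raw_lines, npartitions=2)
records = bag.map(parse).filter(lambda r: r is not None)

# frequencies() yields (value, count) pairs; nothing runs until compute().
action_counts = dict(
    records.pluck("action").frequencies().compute(scheduler="synchronous")
)

# Records may lack the "ms" field, so supply a default before summing.
total_ms = records.map(lambda r: r.get("ms", 0)).sum().compute(
    scheduler="synchronous"
)

print(action_counts)
print(total_ms)
```

In a real workload the Bag would more likely be created from files (for example with db.read_text) and handed off to a more structured collection once the records have been cleaned.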

Learn Ideas and Gain Skills

Duration: half day or full day

Prerequisites

Topics

Introduction

Core Bag APIs and Operations

Best Practices