Intro to Dask
This lab is a beginner-intermediate lab – all levels welcome! 🙂
The purpose of this tutorial is to introduce folks to Dask and show them how to scale their Python data science and machine learning workflows.
The materials covered are:
1. Overview of Dask: How it works and when to use it.
2. Dask Delayed: How to parallelize existing Python code and your custom algorithms.
3. Schedulers: Single Machine vs Distributed, and the Dashboard.
4. From pandas to Dask: How to manipulate bigger-than-memory DataFrames using Dask.
5. Dask-ML: Scalable machine learning using Dask.
To follow along and get the most out of this tutorial, it helps if you know:
– Programming fundamentals in Python (e.g., variables, data structures, for loops).
– The basics of NumPy, pandas, and scikit-learn.
– JupyterLab / Jupyter Notebooks.
– Your way around the shell/terminal.
Please find the setup instructions here: https://github.com/coiled/dask-mini-tutorial#get-set-up
However, the most important prerequisite is a willingness to learn, and everyone is welcome to tag along and enjoy the ride. If you would rather watch than code along, that's not a problem.
Join us on Wednesday, November 3rd at 5:30 pm CDT to kickstart your journey with Dask!