Posts by Matthew Rocklin

Coiled, one year in

Coiled, a Dask company, is about one year old. We’ll have a more official celebration in mid-February (official date of incorporation), but I wanted to take this opportunity to talk a little bit about the journey over the last year, where that has placed us today, and what I think comes next.

Dask Heartbeat

Dask Heartbeat by Coiled: 2020-12-17

The Dask community is highly distributed with different teams working independently. This is powerful but sometimes makes it hard for people within the community to see everything that is going on. The Dask Heartbeat by Coiled is a bi-weekly publication intended to centralize and broadcast Dask news over the previous two weeks.

Dask logo with matrix background

How to learn Dask in 2021

In keeping with the guiding philosophy of open source, Dask is a community-driven project, and the content in this post follows suit. The open-source curriculum below pulls from diverse resources, experts, and platforms to guide you in learning Dask in 2021 via the most straightforward path possible. Enjoy!

Coiled: Dask for Everyone, Everywhere

Data scientists increasingly solve large machine learning and data problems with Python. But Python has historically struggled with parallel computing. This led many of us in the community to create Dask, a library for parallel computing and data science in Python. Dask has been a go-to solution for scalability in the Python data science stack for …


A diagram of a multi-scheduler architecture in Kubernetes.

Dask in production: Multi-Scheduler architectures

I ran across an interesting problem yesterday: A company wanted to serve many Dask computations behind a web API endpoint. This is pretty common whenever people offer computation as a service or data as a service. Today the company uses the single-machine Dask scheduler inside of a web request, but they were curious about moving …



Cloud Pricing

AWS computation costs roughly the following today:

            On Demand   Spot
  CPU hour  $0.04       $0.0125
  GiB hour  $0.0045     $0.0015

On top of that different services charge a premium:

                 Premium
  AWS EMR        40%
  AWS SageMaker  40%
  DataBricks     100%

However, when you pre-commit to a large allocation then you can usually negotiate this down, and get …
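
As a rough illustration of how the rates quoted above combine, here is a minimal Python sketch. The cluster size (100 CPUs, 400 GiB) and the `hourly_cost` helper are hypothetical, and treating each service premium as a flat markup on the base compute cost is an assumption, not something the excerpt spells out.

```python
# Rates (USD) are the approximate figures quoted in the excerpt above.
ON_DEMAND = {"cpu_hour": 0.04, "gib_hour": 0.0045}
SPOT = {"cpu_hour": 0.0125, "gib_hour": 0.0015}

# Service premiums, expressed as fractions of the base compute cost.
# ASSUMPTION: the premium is a flat markup applied to CPU + memory cost.
PREMIUM = {"AWS EMR": 0.40, "AWS SageMaker": 0.40, "DataBricks": 1.00}


def hourly_cost(cpus, gib, rates, premium=0.0):
    """Estimate the cost of one hour for `cpus` cores and `gib` GiB of memory."""
    base = cpus * rates["cpu_hour"] + gib * rates["gib_hour"]
    return base * (1 + premium)


# Hypothetical cluster used only for illustration: 100 CPUs, 400 GiB of memory.
CPUS, GIB = 100, 400

for label, rates in [("on demand", ON_DEMAND), ("spot", SPOT)]:
    print(f"raw {label}: ${hourly_cost(CPUS, GIB, rates):.2f}/hour")
    for service, markup in PREMIUM.items():
        cost = hourly_cost(CPUS, GIB, rates, markup)
        print(f"  with {service} premium: ${cost:.2f}/hour")
```

Any negotiated discount from pre-committing would lower these figures; the sketch does not model that.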



The Unbearable Challenges of Data Science At Scale

Scaling Data Science is a Team Sport. An increasing number of organizations need to scale data science to larger datasets and larger models. However, deploying distributed data science frameworks in secure enterprise environments can be surprisingly challenging because we need to simultaneously satisfy multiple sets of stakeholders within the organization: data scientists, IT, and management. …


A diagram of the promise: big data plus data science team equals profit.

Distributed Data Science for Management

Summary: An increasing number of organizations need to scale data science to larger datasets and larger models. However, deploying distributed data science frameworks in secure enterprise environments can be surprisingly challenging because we need to simultaneously satisfy multiple sets of stakeholders within the organization: data scientists, IT, and management. Solving simultaneously for all sides of …


An eye in focus with the rest of the image out of focus.

Distributed Data Science for IT Professionals

Scaling Data Science is a Team Sport. An increasing number of organizations need to scale data science to larger datasets and larger models. However, deploying distributed data science frameworks in secure enterprise environments can be surprisingly challenging because we need to simultaneously satisfy multiple sets of stakeholders within the organization: data scientists, IT, and management. …

