How to learn Dask in 2020

In keeping with the guiding philosophy of open-source software, Dask is a community-driven project, and the content in this post follows suit. The open-source curriculum below pulls from diverse resources, experts, and platforms to guide you in learning Dask in 2020 via the most straightforward path possible. Enjoy!

Creating a custom software environment with Coiled

Scalable Python Deployments as a Service

James Bourbeau, Dask maintainer and software engineer at Coiled, recently joined us for a Science Thursday session on “Scalable Python Deployments as a Service”. In this post, we summarize the key takeaways from the stream. We’ll cover: a brief overview of Dask; an introduction to Coiled and its offerings; spinning up a cluster on AWS …

Florian Jetter, Sr Data Scientist at Blue Yonder, joins Matt Rocklin and Hugo Bowne-Anderson to discuss supply chain analytics at scale.

Data Processing at Blue Yonder

Florian Jetter, Sr Data Scientist at Blue Yonder, joins Hugo Bowne-Anderson and James Bourbeau to discuss supply chain analytics at scale. Blue Yonder provides software-as-a-service products for supply chain management. Along such a supply chain there are billions and billions of decisions to be made: how much to order, when to ship products, how much …

This image promotes Science Thursday "Design Principles of Distributed Systems with Dask and PySpark" with Holden Karau, the Former (long story) Princess of the Covariance Matrix.

Design Principles of Distributed Systems

Holden Karau joins Matt Rocklin & Hugo Bowne-Anderson to discuss the design of Dask, how it compares to PySpark, and why each project made the tradeoffs it did. There are many different distributed systems solving what at first glance might seem like “the same problem.” Here we’ll talk about the skeletons in the respective closets of our different …

Do we really need distributed machine learning?

We recently chatted with Andy Müller, core developer of scikit-learn and Principal Research Software Development Engineer at Microsoft. Andy is one of the most influential minds in data science with a CV to match. He shares his thoughts on distributed machine learning with open-source tools like Dask-ML as well as with proprietary tools from the …

Scalable Python Deployments as a Service

James Bourbeau, Dask core contributor and maintainer who works at Coiled building tools for scalable computing, joins Hugo Bowne-Anderson to talk through and code scalable data science deployments as a service, and to share how he thinks about them at Coiled. Coiled Cloud is an opinionated deployment-as-a-service product/library for scaling Python data science and machine learning …

Interactive Image Processing at Scale with Napari

Nicholas Sofroniew, Imaging Tech Lead at Chan Zuckerberg Initiative, and Talley Lambert, Microscopist and Lecturer at Harvard Medical, recently joined us to chat about viewing and processing large datasets, with examples from the bio-imaging world. They’re experts in this area as developers of the napari package, and they showed us how to get the most out of it.

A four-quadrant graph with model size on the y-axis and data size on the x-axis.

Big Data vs. Big Model: Scaling Your ML Workflow

Tom Augspurger, Data Scientist at Anaconda and lead maintainer of Dask-ML, recently joined us to discuss how he likes to think about scalable machine learning in Python. As Tom shared with us on the live stream, “You have your machine learning workflow that works well for small problems. Then there are different types of scaling …

A graphic for Coiled's Science Thursday with Richard Evans ("Scaling Open Source Policy Models and Analyzing the Biden Plan").

Scaling Open Source Policy Models and the Biden Plan

Richard Evans, Advisory Board Visiting Fellow at the Baker Institute for Public Policy at Rice University, joins Matt Rocklin and Hugo Bowne-Anderson to discuss open source policy modeling in Python, the power of Dask, and the Biden Plan. Opening with a discussion of why models of public policy should be open, we’ll then jump into …

Prefect logo with slogan "The new standard in dataflow automation."

Dataflow Automation with Prefect and Dask

Our first #ScienceThursday was so much fun we can’t wait to do it again this week! And we’re excited to announce that our good friends at Prefect will be joining to show us how they leverage Dask for their modern workflow orchestration system: Prefect was built to help you schedule, orchestrate and monitor your data …

