Posts by Hugo Bowne-Anderson

Hugo Bowne-Anderson is a data scientist and Head of Evangelism and Marketing at Coiled. He builds and critiques products for data scientists and spends most of his time speaking with them about their practice and the evolving needs of the field as a whole.

Do we really need distributed machine learning?

We recently chatted with Andy Müller, core developer of scikit-learn and Principal Research Software Development Engineer at Microsoft. Andy is one of the most influential minds in data science with a CV to match. He shares his thoughts on distributed machine learning with open-source tools like Dask-ML as well as with proprietary tools from the …

Large-Scale Machine Learning for Urban Planning

Brett Naul, founding engineer at Replica, joins Matt Rocklin and Hugo Bowne-Anderson to discuss large-scale machine learning and travel simulations for urban planning. Replica uses Dask to easily scale travel simulations to hundreds of millions of agents on Google Container Engine. The rich Python data science and statistical ecosystems make it easy to build new …

Zero Click Cloud Deployments

On this week’s Science Thursday, regulars Matt Rocklin and Hugo Bowne-Anderson are joined by guests Hamel Husain (GitHub), Chelle Gentemann (Farallon Institute), and Jeremiah Lowin (Prefect). Usually our guests show us their distributed data science work, but this time we’re turning the tables: Matt and Hugo are going to show Dask and Coiled in action …

A JupyterLab setup with a Jupyter Notebook, Dask task stream, Dask Progress, and Dask Cluster Map.

Dask in the Cloud

When doing data science and/or machine learning, it is becoming increasingly common to need to scale up your analyses to larger datasets. When working in Python and the PyData ecosystem, Dask is a popular tool for doing so. There are many reasons for this, one being that Dask composes well with all of the PyData …

Imaging Earth’s subsurface with Python and Jupyter

Lindsey Heagy, a postdoctoral researcher in the Department of Statistics at the University of California, Berkeley, joins Matt Rocklin and Hugo Bowne-Anderson to discuss scientific computing in the geosciences with Python and Jupyter. Her research uses geophysical data to develop models of the subsurface for locating groundwater, characterizing mineral deposits, and environmental applications. Research often …

Scalable Computing in Oceanography

Deepak Cherian, a physical oceanographer and project scientist at the National Center for Atmospheric Research, joins Matt Rocklin and Hugo Bowne-Anderson to discuss scalable computing in oceanography and how he leverages Dask, Xarray, and terabyte-scale datasets to study the physics of oceans. At the National Center for Atmospheric Research, Deepak Cherian studies the physics of …

A graphic for Coiled's Science Thursday with Richard Evans ("Scaling Open Source Policy Models and Analyzing the Biden Plan").

Scaling Open Source Policy Models and the Biden Plan

Richard Evans, Advisory Board Visiting Fellow at the Baker Institute for Public Policy at Rice University, joins Matt Rocklin and Hugo Bowne-Anderson to discuss open source policy modeling in Python, the power of Dask, and the Biden Plan. Opening with a discussion of why models of public policy should be open, we’ll then jump into …

A graphic for Coiled's Science Thursday with Nicholas Sofroniew and Talley Lambert ("Interactive Image Processing at Scale").

Interactive Image Processing at Scale

Nicholas Sofroniew, Imaging Tech Lead at Chan Zuckerberg Initiative, and Talley Lambert, Microscopist and Lecturer at Harvard Medical School, join Science Thursday regulars Matthew Rocklin and Hugo Bowne-Anderson to chat and code about viewing and processing large datasets, with examples from the bioimaging world. We’ll use Dask and Napari, a fast, interactive, multi-dimensional image viewer for …

Military personnel handling unexploded ordnance.

Bomb Detection with Dask and Machine Learning

“I’m trying to identify unexploded ordnance from electromagnetic data. These are basically bombs or munitions that didn’t go off and are buried in the ground somewhere.” We recently spoke with Lindsey Heagy, Postdoctoral Researcher in the Department of Statistics at UC Berkeley, about her experiences with Dask. Lindsey shared how Dask significantly decreased the time …

A graphic for Coiled's Science Thursday with Jacob Tomlinson ("Deploying and Scaling Data Science Tools on Distributed Systems").

Deploying and Scaling Data Science Tools on Distributed Systems

Jacob Tomlinson, who works at NVIDIA maintaining libraries like RAPIDS, Dask, Dask-Kubernetes and Dask-Cloudprovider, joins Matt Rocklin and Hugo Bowne-Anderson to discuss deployment and scaling of data science tools on distributed systems. Dask has many cluster manager utilities which help users set up distributed Dask clusters on a variety of different infrastructures. Dask’s distributed tooling …

Tom Augspurger presenting at PyData NYC 2019.

Scalable Machine Learning in Python

Tom Augspurger, who works at Anaconda maintaining libraries like pandas, Dask, and Dask-ML, joins Matt Rocklin and Hugo Bowne-Anderson to discuss scalable machine learning in Python. Dask-ML provides tools for scalable machine learning. It works with libraries like scikit-learn and XGBoost to scale out to larger datasets or larger problems. We’re fortunate to have great, …

A boat travels through the Gulf Islands near Salt Spring island, Canada. (Photographer: James MacDonald/Bloomberg)

Dask in Action with Massive Satellite Datasets

TL;DR action (noun): the most vigorous, productive, or exciting activity in a particular field, area, or group (example: “wants to be where the action is”). We recently spoke with oceanographer, remote sensing expert, and open science advocate Chelle Gentemann about her experiences working with massive satellite datasets and how Python and Dask make the scientific …
