Posts by Hugo Bowne-Anderson

Hugo is a data scientist and Head of Evangelism and Marketing at Coiled. He builds and critiques products for data scientists and spends most of his time speaking with them about their practice and the evolving needs of the field as a whole.

The Future of Distributed Machine Learning

We recently chatted with Andy Müller, core developer of scikit-learn and Principal Research Software Development Engineer at Microsoft. Andy is one of the most influential minds in data science with a CV to match. He shares his thoughts on distributed machine learning with open-source tools like Dask-ML as well as proprietary tools from the big cloud providers. …

Data Processing at Blue Yonder

Florian Jetter, Sr Data Scientist at Blue Yonder, joins Hugo Bowne-Anderson and James Bourbeau to discuss supply chain analytics at scale. Blue Yonder provides software-as-a-service products for supply chain management. Along such a supply chain there are billions of billions of decisions to be made: how much to order, when to ship products, how much …

Design Principles of Distributed Systems

Holden Karau joins Matt Rocklin & Hugo Bowne-Anderson to discuss the design of Dask, how it compares to PySpark, and why these tradeoffs were chosen. There are many different distributed systems solving what at first glance might seem like “the same problem.” Here we’ll talk about the skeletons in the respective closets of our different …

Do we really need distributed machine learning?

We recently chatted with Andy Müller, core developer of scikit-learn and Principal Research Software Development Engineer at Microsoft. Andy is one of the most influential minds in data science with a CV to match. He shares his thoughts on distributed machine learning with open-source tools like Dask-ML as well as with proprietary tools from the …

Large-Scale Machine Learning for Urban Planning

Brett Naul, founding engineer at Replica, joins Matt Rocklin and Hugo Bowne-Anderson to discuss large-scale machine learning and travel simulations for urban planning. Replica uses Dask to easily scale travel simulations to hundreds of millions of agents on Google Container Engine. The rich Python data science and statistical ecosystems make it easy to build new …

Zero Click Cloud Deployments

On this week’s Science Thursday, regulars Matt Rocklin and Hugo Bowne-Anderson are joined by guests Hamel Husain (GitHub), Chelle Gentemann (Farallon Institute), and Jeremiah Lowin (Prefect). Usually our guests show us their distributed data science work, but this time we’re turning the tables: Matt and Hugo are going to show Dask and Coiled in action …

Dask in the Cloud

When doing data science or machine learning, you increasingly need to scale your analyses up to larger datasets. In Python and the PyData ecosystem, Dask is a popular tool for doing so. There are many reasons for this, one being that Dask composes well with all of the PyData …

Imaging Earth’s subsurface with Python and Jupyter

Lindsey Heagy, a postdoctoral researcher in the Department of Statistics at the University of California, Berkeley, joins Matt Rocklin and Hugo Bowne-Anderson to discuss scientific computing in the geosciences with Python and Jupyter. Her research uses geophysical data to develop models of the subsurface for locating groundwater, characterizing mineral deposits, and environmental applications. Research often …

Scalable Computing in Oceanography

Deepak Cherian, a physical oceanographer and project scientist at the National Center for Atmospheric Research, joins Matt Rocklin and Hugo Bowne-Anderson to discuss scalable computing in oceanography and how he leverages Dask, Xarray, and terabyte-scale datasets to study the physics of oceans. At the National Center for Atmospheric Research, Deepak Cherian studies the physics of …

Scaling Open Source Policy Models and the Biden Plan

Richard Evans, Advisory Board Visiting Fellow at the Baker Institute for Public Policy at Rice University, joins Matt Rocklin and Hugo Bowne-Anderson to discuss open source policy modeling in Python, the power of Dask, and the Biden Plan. Opening with a discussion of why models of public policy should be open, we’ll then jump into …

Interactive Image Processing at Scale

Nicholas Sofroniew, Imaging Tech Lead at Chan Zuckerberg Initiative, and Talley Lambert, Microscopist and Lecturer at Harvard Medical, join Science Thursday regulars Matthew Rocklin and Hugo Bowne-Anderson to chat and code about viewing and processing large datasets, with examples from the bioimaging world. We’ll use Dask and Napari, a fast, interactive, multi-dimensional image viewer for …

Bomb Detection with Dask and Machine Learning

I’m trying to identify unexploded ordnance from electromagnetic data. These are basically bombs or munitions that didn’t go off and are buried in the ground somewhere. We recently spoke with Lindsey Heagy, Postdoctoral Researcher in the Department of Statistics at UC Berkeley, about her experiences with Dask. Lindsey shared how Dask significantly decreased the time …
