How EOLAS Insight processes petabytes of Sentinel-2 imagery
Distributed earth observation data workflows with Xarray, Dask, and Coiled
Mapping Environmental Change at Scale
The team at EOLAS Insight is helping land managers make better decisions through customized mapping tools and environmental analytics. Working at the intersection of geospatial data and conservation, they transform satellite imagery into actionable insights.
We create mapping tools for people without geospatial skills who still need the rich insights this data can provide.
Paul Naidoo
Data Scientist, EOLAS Insight
EOLAS Insight delivers unique geospatial software solutions that simplify workflows, enhance geo-capabilities, and provide automated, measurable results for those working on the ground and making environmental decisions. A vital part of this is processing vast amounts of satellite data quickly enough to deliver valuable insights when they matter most.

The Data Challenge: Petabyte-Scale Satellite Imagery
EOLAS Insight works primarily with Earth observation data, including Sentinel-2 imagery with 10-meter resolution across multiple spectral bands. The size of these datasets can quickly become unwieldy.
Depending on the size of the area that we're looking at, data sets can very quickly balloon to hundreds of gigabytes, approaching the petabyte scale if the area is large enough or if we're looking at a long enough time span.
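To get a feel for how quickly these volumes grow, here is a rough back-of-the-envelope sketch. It is not EOLAS Insight's own calculation: the function name, the example areas, and the band and date counts are hypothetical, and it assumes every band is stored at 10-meter resolution as uncompressed 16-bit samples.

```python
def estimate_stack_bytes(area_km2, n_bands, n_dates,
                         resolution_m=10, bytes_per_sample=2):
    """Rough uncompressed size, in bytes, of an image stack over an area.

    Assumes every band shares one resolution and each sample takes
    bytes_per_sample bytes (2 bytes = 16-bit, an assumption here).
    """
    pixels_per_km2 = (1000 / resolution_m) ** 2  # 10 m pixels: 10,000 per km^2
    samples = area_km2 * pixels_per_km2 * n_bands * n_dates
    return int(samples * bytes_per_sample)


# Hypothetical cases: a small site with one scene, and a large region over many dates.
small = estimate_stack_bytes(area_km2=100, n_bands=4, n_dates=1)
large = estimate_stack_bytes(area_km2=80_000, n_bands=10, n_dates=500)

print(f"small: {small / 1e6:.0f} MB")   # 8 MB
print(f"large: {large / 1e12:.0f} TB")  # 8 TB
```

Because the estimate is linear in area, band count, and number of dates, extending either the region or the time span scales the total in direct proportion.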
Before Coiled, the EOLAS team faced significant processing bottlenecks. Even with powerful local machines, they were constrained by Python's single-threaded nature and limited memory when working with these massive geospatial datasets.
We were often fairly bound on our CPU intensive tasks. If you're pre-processing, engineering features, cleaning up data, calculating some index from Earth observational data—we were kind of stuck in a rut of just letting it run its course. You set it running early, way before you need it, and come back to hopefully find it done when you're ready to move on.
Starting Small: Building the Right Stack
As a small environmental intelligence company, EOLAS Insight initially built their analytics platform on a Python ecosystem specialized for geospatial work:
We use a Python stack. We benefit from this wonderful moment in history where you have in Python all of these phenomenal, actively developed open source libraries for anything you could imagine.
Their core stack includes familiar geospatial tools:
- Xarray for multi-dimensional data with spatial and temporal coordinates
- GeoPandas for vector data, extending pandas with geospatial operations
- Raster processing tools for satellite imagery analysis
- Dask for initial steps toward parallel processing
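As an illustration of how these pieces fit together, the sketch below computes a vegetation index over a chunked Xarray dataset so that Dask can process the tiles in parallel. This is not EOLAS Insight's code: NDVI is chosen here only as a familiar example of an index, and the data is synthetic random reflectance standing in for real Sentinel-2 bands.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for an image stack: (time, y, x) reflectance for two bands.
rng = np.random.default_rng(0)
shape = (4, 200, 200)
coords = {"time": np.arange(4), "y": np.arange(200), "x": np.arange(200)}
ds = xr.Dataset(
    {
        "red": (("time", "y", "x"), rng.uniform(0.01, 0.3, shape)),
        "nir": (("time", "y", "x"), rng.uniform(0.2, 0.6, shape)),
    },
    coords=coords,
)

# Chunking turns the arrays into lazy Dask arrays; each tile becomes its own task.
ds = ds.chunk({"time": 1, "y": 100, "x": 100})

# NDVI = (NIR - red) / (NIR + red). Nothing is computed at this point.
ndvi = (ds["nir"] - ds["red"]) / (ds["nir"] + ds["red"])

# Average over time, then trigger the actual computation.
mean_ndvi = ndvi.mean("time").compute()
print(mean_ndvi.shape)
```

The chunk sizes here are arbitrary; in practice they would be tuned to the storage layout of the imagery and the memory available per worker.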
Their early cloud experiments with AWS EC2 instances helped with computing power but introduced other challenges:
If we were pushing something up to the cloud because your local machine just wasn't enough, you're not getting 100% utilization out of that machine. There's an awful lot of wastage involved in running those things.
Scaling Up: First Steps with Coiled
When the EOLAS Insight team discovered Coiled, they found a much more streamlined approach to cloud computing:
Coiled honestly was a breath of fresh air. The seamlessness of prototyping something locally and then that environment just exporting itself up, not just to one virtual machine, but a cluster of machines simultaneously all just being provisioned on the fly was one of the biggest things that made me go, 'Yeah, okay, this is a really good product.'
What made Coiled particularly valuable was how it eliminated infrastructure overhead while enhancing their existing workflow. Instead of learning Docker or managing cloud resources, the team could focus on their core expertise: extracting insights from geospatial data.
The heavy lifting that that removed from me or the need for me to train up staff or even just the overhead and mental burden of maintaining those machines and trying to track what was and what wasn't being provisioned—being able to just on the fly provision a transient cluster of workers was a game changer.
Breaking Through: Processing Petabyte-Scale Datasets
With Coiled, EOLAS Insight was able to tackle datasets that previously seemed unmanageable. Their largest job processed nearly a petabyte of Sentinel-2 data, utilizing approximately 500 machines at its peak.
The technical transition was remarkably simple:
The fact that Coiled is Python-first is incredibly useful to me. I prototype something quickly on a very small, and process it locally. Then I use Dask locally. And then the fact that I only need to change two lines of code to change from processing with Dask locally to processing with Coiled on the cloud is brilliant.
This ability to seamlessly transition from local development to cloud processing transformed their workflow:
There aren't multiple versions of the code. I'm not having to duplicate stuff. I know if it worked here, it's gonna work there. The fact that it all just lives in Python and all I need to do is tweak one line of code to define my cluster is great.
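A minimal sketch of what that switch can look like is shown below. It is an assumption about the general pattern, not EOLAS Insight's actual code: `make_cluster` and `mean_of_squares` are hypothetical helpers, and the toy workload stands in for real imagery processing. The Coiled branch also assumes the `coiled` package is installed and an account is configured.

```python
from dask.distributed import Client, LocalCluster
import dask.array as da


def make_cluster(use_coiled=False, n_workers=2):
    """Return a Dask cluster: local by default, Coiled when requested."""
    if use_coiled:
        import coiled  # imported lazily so the local path needs no Coiled account
        return coiled.Cluster(n_workers=n_workers)
    return LocalCluster(
        n_workers=n_workers,
        threads_per_worker=1,
        processes=False,
        dashboard_address=None,  # skip the local dashboard for this sketch
    )


def mean_of_squares(client):
    """Toy workload; the analysis code is identical on either cluster."""
    x = da.random.random((2000, 2000), chunks=(500, 500))
    return float(client.compute((x ** 2).mean()).result())


if __name__ == "__main__":
    cluster = make_cluster(use_coiled=False)  # flip to True to run in the cloud
    client = Client(cluster)
    print(mean_of_squares(client))
    client.close()
    cluster.close()
```

Only the cluster definition changes between the two modes; everything downstream of `Client(cluster)` stays the same, which is the property the quotes above describe.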
Optimizing: Fine-Tuning for Cost and Performance
Once running at scale, EOLAS Insight was able to further optimize their operations. The Coiled dashboard became an essential tool for managing resources efficiently:
The dashboard is fantastic. It's so informational. You can see if you're not using your workers, and you can see if you're running out of RAM. It's really easy to run a couple of test jobs, see where you're aiming for, and then set the batch running and know that you're really minimizing the wastage on the cloud costs for that processing.
This ability to fine-tune resource allocation had both financial and environmental benefits:
Underutilizing compute power that we're renting and paying for is also just wasteful in that direction. There's lots of power being wasted when there needn't be. I think sensible stewardship of the resources that you're using when you're processing this data goes a long way.
Another key optimization came from Coiled's ability to run jobs in specific cloud regions:
The fact that Coiled uses raw VMs means that I can choose where on the planet to put my cluster. If I know that I am pulling down raw Sentinel-2 data that lives on an S3 bucket in the US West 2 region, I can put the cluster there. This minimizes costs, but also speeds up processing because you're not moving data around the world. You're sitting right there next to it.
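Placing the cluster next to the data might be expressed roughly as follows. This is a sketch under stated assumptions: `cluster_options` and `start_cluster` are hypothetical helpers, the worker count is arbitrary, and `us-west-2` is the AWS region code corresponding to the US West 2 region mentioned in the quote. Starting the cluster requires the `coiled` package and a configured account, so that step is kept in a separate function.

```python
def cluster_options(data_region, n_workers=10):
    """Build keyword arguments for a cluster in the same region as the data."""
    return {"n_workers": n_workers, "region": data_region}


def start_cluster(options):
    """Start a Coiled cluster; needs the coiled package and a configured account."""
    import coiled  # imported lazily so building the options has no dependencies
    return coiled.Cluster(**options)


# The imagery is described as living in US West 2, so the workers go there too.
opts = cluster_options("us-west-2")
print(opts)
```

Calling `start_cluster(opts)` would then provision workers in that region, so reads from the bucket stay within it rather than crossing regions.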
Business Impact: Delivering Timely Environmental Intelligence
The performance improvements transformed EOLAS Insight's capabilities in two critical areas:
- Faster area onboarding:
With Coiled, I can radically speed up the data products that we're commercializing. When we provision a new area, we can just absolutely kick through the front door with a Coiled cluster versus pushing it through that eye of the needle.
- Time-sensitive analysis:
The other is trying to quickly process new data—whether that's a forecast or something similar—where you want to be able to receive the data, process it into your derived product, and push it out of the door really fast. What that does for our business is empower us to serve time-sensitive insights to our customers. It gives us that edge to be able to really quickly get them the actionable information out of that data that they need.
For environmental applications, timing can be critical—whether monitoring seasonal changes, tracking specific events, or responding to policy opportunities.
Looking Forward: Expanding Capabilities
As EOLAS Insight continues to grow, they see expanding opportunities for their Coiled implementation:
We're now interested in integrating Coiled with our backend for on-the-fly processing. If you draw a polygon on a map, then web services spin up in the cloud and process all of that data. That can be slow, which isn't a great user experience. Coiled can be a drop-in replacement for some of Amazon's offerings for us.
This vision aligns perfectly with their goal of making environmental intelligence more accessible and actionable for diverse stakeholders.
Coiled enables me to process large volumes of data very, very quickly. It removes that overhead of me having to provision a machine, worrying if something is going to work. It's pulling it straight off your machine. So if it just ran on your machine, it's going to run on that cluster.
Paul Naidoo
Data Scientist, EOLAS Insight
By combining Python's geospatial ecosystem with Coiled's scalable infrastructure, this team delivers environmental insights that drive better conservation and land management decisions. Their success shows that with the right technology, even modest-sized teams can extract meaningful insights from petabyte-scale Earth observation data.