How Guac scales demand forecasting to reduce food waste
Orchestrating distributed ETL and machine learning with Coiled and Dagster
Introduction: Guac's Mission
Guac is an AI startup helping food retailers forecast demand for their products. This both keeps costs low and helps reduce food waste.
We solve food waste by predicting exactly how much each retailer will sell of every product at every store every day. Once they know that, they can order and produce the right quantities and avoid throwing out cases of fresh food.
Jack Solomon
CTO and Co-founder, Guac
Behind this simple premise lies a complex technical challenge. Guac's team has built sophisticated machine learning models that process vast retail datasets while incorporating external factors that drive consumer behavior.
For a small but rapidly growing team, the challenge wasn't just building great forecasting models. It was scaling those models to retailers with far more data, without rebuilding the entire infrastructure.

Modern ML for Better Forecasting
Retail demand forecasting isn't new, but Guac's approach represents a generational leap forward. Traditional forecasting relied on time series models like ARIMA, analyzing each product independently with limited variables. Guac has reimagined this approach from the ground up.
Ten years ago, forecasting meant an ARIMA model for each product series—that was the gold standard. Today, we take much bigger datasets and learn patterns across thousands of products and hundreds of stores simultaneously.
This modern approach leverages a sophisticated technical stack:
- PyTorch provides deep learning capabilities for complex pattern recognition
- XGBoost delivers robust gradient boosting models that capture non-linear relationships
- Optuna enables hyperparameter optimization across their model ensemble
- pandas handles data manipulation and processing
What truly distinguishes Guac's approach is their integration of external data sources that traditional methods couldn't incorporate. They've built libraries of weather patterns, public holidays, local school schedules, university calendars, and even sports data—anything that might influence consumer purchasing behavior.
One particularly innovative example is their use of sports betting odds to predict demand spikes near stadiums:
We incorporate sports betting odds to predict demand near stadiums. When a game is expected to be close—which betting odds indicate reliably—more people attend or host watch parties. That means more chips and guac flying off the shelves.
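As a rough illustration of how betting odds could become a model feature, the sketch below converts decimal odds into implied win probabilities and derives a "closeness" score. The function names, the two-outcome simplification (no draw), and the scoring formula are assumptions made for this example, not Guac's actual feature pipeline.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds for each outcome into probabilities that sum to 1."""
    raw = [1.0 / odds for odds in decimal_odds]
    total = sum(raw)  # exceeds 1.0 by the bookmaker's margin
    return [p / total for p in raw]


def closeness(home_odds, away_odds):
    """Return 1.0 for an evenly matched game, approaching 0.0 for a lopsided one."""
    p_home, p_away = implied_probabilities([home_odds, away_odds])
    return 1.0 - abs(p_home - p_away)


# Hypothetical odds: an even matchup and a heavily favored home team.
even_game = closeness(1.9, 1.9)
lopsided_game = closeness(1.1, 7.0)
print(even_game, round(lopsided_game, 3))
```

A score like this could be joined to store-level sales by date and stadium proximity, so the model can learn how much an expected close game lifts demand.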
This data-rich approach delivers better forecasts but requires significant computational power. It worked well for Guac's initial customers, but processing requirements grew sharply once the company started partnering with larger retailers.
The Scale Challenge
The team hit a critical inflection point when they began onboarding enterprise-scale retailers. The difference in data volume was staggering: they went from retailers with modest inventories to retailers with hundreds of thousands of products across thousands of locations.
There's a massive difference between a small retailer with a limited selection and an enterprise chain with products in the six figures at each store—plus transaction records for all of them.
Their initial pandas setup, perfectly adequate for smaller datasets, simply couldn't handle this volume. The team needed a solution that could process hundreds of millions of rows efficiently without requiring a complete rewrite of their codebase.
This challenge was particularly acute for what was then a small team. They couldn't afford to spend months building and maintaining complex distributed infrastructure when their core value was in forecasting algorithms and retail insights.
Finding Simplicity with Coiled
Guac's search for a scalable solution followed a deliberate path. They first adopted Dask for its pandas-like Dask DataFrame interface, which allowed them to continue using existing code with minimal changes.
We didn't want a complex migration to PySpark. We needed something quick that would drop in for pandas, and Dask DataFrames fit that requirement perfectly.
While Dask solved the API compatibility issue, a single machine still imposed memory constraints. As they searched for distributed computing options, they discovered Coiled, a solution that would run Dask across multiple machines without adding complexity:
Adding Coiled was just a few extra lines of code. We had already transitioned to Dask on one machine, and everything worked as expected right out of the box when we distributed it with Coiled.
The simplicity of implementation was remarkable—just a few lines of code to transform their entire infrastructure capability. For a small team focused on solving food waste rather than building infrastructure, this efficiency was invaluable.
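The "few extra lines" might look roughly like the sketch below, which uses `coiled.Cluster` and `cluster.get_client()` from the Coiled Python client. The `get_client` wrapper, its `use_coiled` flag, and the worker count are assumptions for this example rather than Guac's code, and starting a real cluster requires a Coiled account and cloud credentials.

```python
def get_client(use_coiled: bool, n_workers: int = 10):
    """Return a Dask client backed by a Coiled cluster, or None to stay local.

    Returning None leaves Dask on its default local scheduler, so the same
    DataFrame code runs unchanged on a laptop.
    """
    if not use_coiled:
        return None

    # Imported lazily so local runs need neither the package nor an account.
    import coiled

    cluster = coiled.Cluster(n_workers=n_workers)
    # Once this client exists, later .compute() calls run on the cluster.
    return cluster.get_client()


# Local run: no cluster is created.
client = get_client(use_coiled=False)
print(client)
```

With `use_coiled=True`, the rest of a Dask workload would not need to change, which is the property the quote above describes.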
A Complete ETL+ML Platform
With Coiled handling their distributed computing needs, Guac has built a comprehensive machine learning platform that scales effortlessly to retailers of any size. Beyond their core ML stack, they've integrated Dagster for ETL orchestration, creating a seamless workflow from data ingestion to prediction.
This combination gives them remarkable flexibility. Most of their ETL pre-processing runs on granular partitions that can be handled without distribution, but when they encounter larger workloads, Coiled steps in automatically:
We use Coiled for about 10% of our ETL jobs—the ones with memory issues—but our forecasting runs fully on Coiled. It's beautifully integrated with Dagster; the system will spin up a cluster when needed and otherwise won't bother.
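One way to picture the "spin up a cluster when needed" behavior is a simple size check made before each job runs. The rule below is a hypothetical sketch: the `needs_cluster` name, the headroom factor, and the byte estimates are assumptions, and the wiring into Dagster is left out.

```python
GIB = 1024**3


def needs_cluster(
    estimated_bytes: int, local_memory_bytes: int, headroom: float = 0.5
) -> bool:
    """Return True when a partition is unlikely to fit comfortably in local memory.

    The headroom factor reserves memory for intermediate copies made
    during joins and aggregations.
    """
    return estimated_bytes > local_memory_bytes * headroom


# A 2 GiB partition fits on a 16 GiB machine; a 40 GiB partition does not.
small_job = needs_cluster(2 * GIB, 16 * GIB)
large_job = needs_cluster(40 * GIB, 16 * GIB)
print(small_job, large_job)
```

Under a rule like this, most granular partitions would run in-process, and only the memory-heavy minority would pay the startup cost of a cluster.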
For computationally intensive machine learning tasks, Coiled has been transformative. Their hyperparameter tuning jobs can run for days across 30 machines, processing hundreds of millions of rows of data—something that would be impossible on a single workstation.
The most elegant aspect of their architecture is how the complexity remains hidden. The team writes their code once, and it runs appropriately whether on a single machine or distributed across dozens of cloud instances.
Business Impact: Scale Is No Longer an Issue
The implementation of Coiled has eliminated scale as a barrier to Guac's business growth. With their technical foundation solid, they can confidently onboard retailers of any size without worrying about infrastructure limitations.
We no longer worry about how much data our customers bring to us. Everything just works now—we wrote one implementation, and it handles everything from small shops to major chains.
This capability has directly accelerated their customer onboarding process. What might have required custom infrastructure development for each large customer now happens smoothly within their standard workflow. The typical timeline has been compressed significantly:
Within a month to two months, we can onboard any customer, get their data integrated, add our external data points, have forecasts running, and deliver the app they need—everything's good to go.
Despite processing massive datasets across distributed infrastructure, Guac's cloud costs remain surprisingly manageable. They run exactly the resources they need when they need them, avoiding the overprovisioning that plagues many cloud deployments.
Perhaps most importantly, the engineering team now spends minimal time on infrastructure maintenance. Rather than constantly fighting with scaling issues or optimizing distributed systems, they focus almost exclusively on improving their forecasting models and delivering value to customers.
When asked how Coiled makes him feel, Jack's response was immediate and telling: "Relaxed... it's a weight off our shoulders."
The Future: Growing Without Infrastructure Constraints
With their technical foundation firmly in place, Guac is accelerating their expansion plans. They're now working with much larger retailers across the US, developing comprehensive product suites tailored to each client's unique needs.
The flexibility of their architecture means they can focus entirely on product innovation rather than infrastructure challenges. Every retailer has different needs and peculiarities—some emphasize fresh produce, others specialty items, others staples. Guac can now address these differences through product features rather than technical limitations.
Coiled has allowed us to scale to much bigger retailers much faster without worrying about infrastructure problems or data size issues. It's all just plug and play.
Jack Solomon
CTO and Co-founder, Guac
For a small team tackling the critical challenge of food waste through better forecasting, this focus is essential. With Coiled handling the computational infrastructure, Guac can concentrate on their core mission: helping retailers order exactly what they need, reducing waste, and improving efficiency across the food supply chain.
Their success demonstrates how cloud computing, when implemented thoughtfully, doesn't just solve technical problems—it unlocks business potential that would otherwise remain trapped behind infrastructure barriers.