Build vs. Buy

Could we build this thing ourselves 🤔?

Coiled deploys Dask in the cloud. Coiled also charges money. This raises the following question and our (admittedly biased) answer:

> Q: Could I just build this thing myself?

> A: Yes, you could. But that’ll cost more than buying it.

Let’s explore why. We highlight dimensions of cost to consider, and then walk through two example situations.

Value of Coiled

There are three ways we can think about build vs buy in this space:

  1. Cloud Efficiency ⚙️: You pay money to your cloud provider (AWS, GCP, Azure). Coiled makes you more efficient. Those efficiency gains offset the Coiled charges. Buying Coiled is like buying a Prius or solar panels, the efficient choice.
  2. Human Time 🏃: Humans are expensive. Building a cloud data platform takes weeks or months (depending on your needs). Multiply that by personnel costs and you easily reach levels that are more expensive than what you would pay Coiled.
  3. Company Velocity 🚀: Building distracts you from pushing your company forward. Buying lets you focus on key results and accelerate your timeline by months. How valuable is your objective to the company? What is the financial opportunity to accelerate that objective?

Any one of these considerations can justify Coiled on its own. Different situations highlight different considerations. We’ll make this concrete in the two scenarios below, comparing the cost of building vs the cost of buying.

Individual Data Scientist

You want a solution quickly, you can’t pay for much, and you want to avoid fuss.

Situation

You use Dask on your laptop and want to scale to the cloud. You consider two options:

  1. Dask-Cloudprovider
  2. Coiled

Computationally, you use around 10,000 CPU hours a month:

  • Daily batch processing of 1TB: 7,000 CPU hours
  • Ad-hoc exploration: 3,000 CPU hours

You want a tool that “just works” quickly, letting you focus on your primary task without fussing with new tools. You don’t want to buy anything complicated.

Build with Dask-Cloudprovider
Costs

Activity (time, $)

Initial Setup

Setup Cloud Account (4 hrs, $400)

Dask-Cloudprovider runs in your account, so you'll need to set this up yourself. Fortunately, AWS and GCP make this easy.

Install Dask-Cloudprovider (2 hrs, $200)

Dask-Cloudprovider is easy to install. It leverages local credentials on your laptop, like a .aws/config file.
Monthly

Cloud Spend (10,000 hrs, $500)

One CPU-core-hour on AWS costs roughly $0.05. This varies if you use high-memory nodes, GPUs, or Spot, but those require enough engineering that most folks avoid them to start.

Wastage ($300)

You're going to waste cloud resources. Here are some common issues to be aware of:

  • Notebooks close unexpectedly, leaving the cluster running
  • You access data in the wrong region and pay for cross-region data access
  • You set up clusters across availability zones, resulting in transfer costs
  • You accidentally use instances that are too large or otherwise exotic
  • You leave lots of networking or storage resources active over months, slowly accruing charges

Fuss with Docker/Cloud (2 days, $1,600)

Random shit just comes up. Issues can be obvious, like constantly modifying your Docker images, or one-off, like needing to update some cloud API or spelunk through billing views. You'll need to directly manage your cloud.

Two days a month is probably conservative.

Total (annual): $30,000
Dask-Cloudprovider is great for getting started quickly. However, it defers the work of maintaining cloud resources, managing software environments, and tracking costs.

Additionally, you will accidentally create phantom resources and unknowingly accumulate wasted costs.
Buy with Coiled
Costs

Activity (time, $)

Initial Setup

Setup Cloud Account (4 hrs, $400)

Coiled runs in your account, so you'll need to set this up yourself. Fortunately, AWS and GCP make this easy.

Install Coiled (2 hrs, $200)

Coiled is easy to install. It leverages the local credentials on your laptop, or you can deploy it from a cloud console.
Monthly

Cloud Spend (5,000 hrs, $150)

Using Coiled, you're able to reduce your costs in two ways:

  1. You improve your Python/Dask code so that you do less work. This is facilitated by Coiled's observability features, idle shutdown, and more.
  2. You use the cloud more efficiently, leveraging features like Spot, ARM, efficient networking, and more.

Coiled Surcharge ($0.00)

At 10,000 CPU-hours per month or below, Coiled doesn't cost anything. If you start using more than this then you'll start getting charged, but most individual users don't use enough to interest us commercially.

We'll start asking for money when we're managing $1,000 or more of cloud spend for you.

Fuss with Cloud (2 hrs, $200)

You'll still want to track cloud infrastructure, but mostly just to inspect detailed costs and update information. Two hours a month is plenty.

Total (annual): $4,800
With Coiled you spend less time in configuration. Coiled works out of the box. You start doing your work the same day, and you don’t deal with administration.

Additionally, you work knowing that you’re secure from phantom costs. Coiled doesn’t charge anything extra at this level, so you benefit for free. Your cloud costs also decrease with features like Spot-by-default, ARM, and good network settings, and observability features help you cut your overall usage in half.
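The two annual totals for the individual data scientist can be reproduced from the line items above. This is a minimal sketch, assuming setup is paid once and the monthly items recur twelve times a year; the helper name annual_cost and the dictionary layout are illustrative only and are not part of Coiled or Dask.

```python
# Hypothetical helper: reproduce the annual totals from the listed line items.
# Assumes one-time setup costs plus twelve months of recurring costs.

def annual_cost(setup, monthly):
    """Return one-time setup costs plus twelve months of recurring costs."""
    return sum(setup.values()) + 12 * sum(monthly.values())

build = annual_cost(
    setup={"cloud account": 400, "install Dask-Cloudprovider": 200},
    monthly={"cloud spend": 500, "wastage": 300, "fuss with Docker/cloud": 1600},
)
buy = annual_cost(
    setup={"cloud account": 400, "install Coiled": 200},
    monthly={"cloud spend": 150, "Coiled surcharge": 0, "fuss with cloud": 200},
)

print(f"Build: ${build:,}")  # Build: $29,400 (listed above as roughly $30,000)
print(f"Buy:   ${buy:,}")    # Buy:   $4,800
```

Most of the gap comes from the monthly "fuss" line: two days versus two hours of engineering time at $100/hr.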

Data Engineering Team

You want to empower your team to operate at scale so that everyone can operate like your top performers. You’re mildly cost sensitive, but mostly care about team velocity and getting shit done.

Situation

Your top performers use Dask effectively today with good success and you’d like to roll this out to the group. You consider two options:

  1. Dask Gateway on Kubernetes on the cloud
  2. Coiled

You have a team of five data professionals. They’re solid performers and one lead engineer is capable of setting up cloud infrastructure. Computationally, you use around 100,000 CPU hours a month.

  • Five regular jobs processing 2TB daily: 50,000 CPU hours
  • Ad-hoc exploration by four regular data scientists: 25,000 CPU hours
  • Occasional large scale training job: 25,000 CPU hours

You want a tool to empower your team so they can migrate off of single large instances and look at the entire dataset, rather than subsets. You also want something you understand and can control.

Build with Dask-Gateway
Costs

We break down costs into setup and monthly costs, assuming $100/hr for the total cost of an engineer (including overhead).

Activity (time, $)

Every Six Months

Setup (4 weeks, $16,000)

You will want to set up Kubernetes (GKE, EKS) on your cloud, then helm install dask, and do basic configuration. This is easy.

Additionally, you'll set up something to manage software environments, like conda-store, and probably tweak Kubernetes settings a bit. In particular, we recommend having at least a few different node sizes (S/M/L) with node pools to match, and verifying that your autoscalers can rapidly scale up and down (most cloud defaults are bad for bursty computational workloads).

You'll run into other issues as well, but this is a good starting point. In our experience, getting something workable takes one individual at least a couple of sprints, often longer.
Monthly

Maintenance (1 week, $4,000)

Maintaining a Dask Gateway + Kubernetes deployment for a team takes at least a quarter FTE from one of your better engineers.

You'll be updating packages, cleaning up errant pods, updating Dask Gateway itself, adding users, and responding to mandatory cloud updates.

If you already have a Kubernetes team on staff, then some of this will be amortized across that group.

Cloud Spend (100,000 hrs, $5,000)

One CPU-core-hour on AWS costs roughly $0.05. This varies if you use high-memory nodes, GPUs, or Spot, but those require enough engineering that most folks avoid them to start.

You're probably not yet at the point where it makes sense to invest in Spot, ARM, and other cost-saving measures.

Wastage ($3,000)

You're going to waste cloud resources. Here are some common issues to be aware of:

  • Notebooks close unexpectedly, leaving the cluster running
  • You access data in the wrong region and pay for cross-region data access
  • You set up clusters across availability zones, resulting in transfer costs
  • You accidentally use instances that are too large or otherwise exotic
  • You leave lots of networking or storage resources active over months, slowly accruing charges

Team does dumb things (80 hrs, $8,000)

Your team of five spends at least half a day a week tracking down and debugging dumb things, probably much more. That's ok, it's part of their job, but it could be avoided with tools and experts to help.

Total (annual): $270,000
You control your entire infrastructure and don’t pay anyone except your cloud provider and your employees. However, you spend more money configuring and tending your setup than you spend on the cloud itself.  

Additionally, your team gets stuck more often and has difficulty debugging workflows (not depicted here), leading to an acceptable, but not exciting pace. 
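The claim that configuring and tending the setup costs more than the cloud itself can be checked against the line items above. This is a rough sketch, assuming the setup work recurs every six months, a working week is 40 hours, and the $100/hr engineering rate stated earlier; the variable names are illustrative.

```python
# Hypothetical breakdown of the Dask-Gateway option using the listed figures.
# Assumes setup recurs twice a year and 40-hour working weeks.

HOURLY_RATE = 100  # $/hr, total cost of an engineer (including overhead)

setup_per_year = 2 * (4 * 40) * HOURLY_RATE   # 4 weeks of setup, every six months
people_monthly = (40 + 80) * HOURLY_RATE      # maintenance hours + debugging hours
cloud_monthly = 5_000 + 3_000                 # cloud spend + wastage, in dollars

people_annual = setup_per_year + 12 * people_monthly
cloud_annual = 12 * cloud_monthly
total_annual = people_annual + cloud_annual

print(f"People: ${people_annual:,}")  # People: $176,000
print(f"Cloud:  ${cloud_annual:,}")   # Cloud:  $96,000
print(f"Total:  ${total_annual:,}")   # Total:  $272,000 (listed as roughly $270,000)
```

Under these assumptions, engineering time accounts for nearly two thirds of the annual total.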
Buy with Coiled
Costs

Setup and maintenance costs are lower. You find efficiencies in your workflows and in how you use the cloud. Coiled charges you a roughly 100% upcharge at this low usage.

Activity (time, $)

Every Six Months

Setup (2 days, $1,600)

In principle, setting up Coiled is trivial: just run coiled setup.

You may want to do some additional work though, like connecting Coiled to your container registry, setting up custom tagging for billing purposes, and clearing the security model with IT.
Monthly

Maintenance (4 hrs, $400)

Maintenance is cheap. Coiled doesn't require any cloud resources when idling. You will likely cross-reference our billing with your cloud's though, and probably assess your team's performance to identify future improvements.

Cloud Spend (50,000 hrs, $1,500)

You'll reduce your cloud spend in two ways:

  1. Your team wasn't maximally efficient before (surprise!), and by working with Coiled observability tools and Dask engineers you're able to find at least a 50% reduction.
  2. Coiled uses the cloud more efficiently, leveraging technologies like Spot, ARM, tuned instances for your workloads, and more. You're able to drop your effective CPU-hour cost to around $0.03.

Coiled Surcharge ($2,000)

Coiled charges you money. This is small, though, relative to the performance gains for your team, lack of distraction, and cloud efficiencies.

As you use more and can predict your usage, you'll negotiate for discounted bulk rates.

Team does dumb things (20 hrs, $2,000)

Your team still does dumb things, which take up time and burn money, but they do far less of this now that they have a tool that identifies errant behaviors and a team of Dask/Python experts to help guide them.

Roadblocks become speedbumps.

Total (annual): $57,500
You pay Coiled roughly $20,000 a year, but you save $200,000 in efficiency (mostly people, but also cloud costs).

Additionally, your team moves faster. It’s hard to quantify this value, so we’ve left it off, but arguably that’s the most important aspect at this stage.
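The cloud-spend line in this scenario combines two effects: fewer CPU-hours and a lower price per CPU-hour. This is a small sketch of how they compound, using the monthly figures from the two team tables; the variable names are illustrative, and prices are held in cents to keep the arithmetic exact.

```python
# Hypothetical sketch of how the two cloud savings compound.
# Prices are in cents per CPU-hour so the arithmetic stays in integers.

baseline_hours, baseline_cents = 100_000, 5   # build scenario: $0.05 per CPU-hour
coiled_hours, coiled_cents = 50_000, 3        # buy scenario: half the hours at $0.03

baseline_spend = baseline_hours * baseline_cents // 100   # dollars per month
coiled_spend = coiled_hours * coiled_cents // 100         # dollars per month
reduction = 1 - coiled_spend / baseline_spend

print(f"${baseline_spend:,} -> ${coiled_spend:,} per month ({reduction:.0%} lower)")
# $5,000 -> $1,500 per month (70% lower)
```

Halving usage and paying 60% of the original unit price leaves 30% of the original monthly cloud bill.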

Buy for Speed

The scenarios above focus on operational costs, but velocity may be more valuable. Often delivering a key result or accelerating a timeline can be worth more than just the costs of the hardware and the team. 

Our experience is that building takes longer than expected to establish a high quality foundation, delaying key results and timelines. Conversely, teams that use products like Coiled are able to focus on what they’re trained to do best, and spend their time thinking about what’s important, delivering impact to the rest of their company.

Sign up with GitHub, Google, or email.

Use your AWS, GCP, or Azure account.

Start scaling.

$ pip install coiled
$ coiled setup
$ ipython
>>> import coiled
>>> cluster = coiled.Cluster(n_workers=500)