Last month I announced that I was forming a Dask company.
This month I am pleased to announce Coiled Computing, a Dask company.
This post outlines what this company will do, and the various choices that I’ve made in its configuration.
You may also want to …
What will this company do?
Coiled Computing helps companies and other institutions scale Python with Dask. We will do this in a number of ways:
- Training and long-term support for scaling Python with Dask
- Managed deployments of Dask in an institutional setting
- Optimizing important workflows with increased visibility and monitoring
Today, we can offer training, long-term support, and help getting started with open source deployment solutions. Over time we will also build proprietary products, mostly aimed at solving enterprise IT needs. (I’ll share more of my thoughts on how to handle the OSS/proprietary split in an upcoming blog post.)
I’m going to go into more depth on each of these topics below, but first I want to talk about what I’m looking for short-term.
What we need
Currently we’re looking for three kinds of people:
- Companies and other institutions who want to be early customers
These tend to be larger institutions who have an appetite for scalable Python. They want to purchase significant amounts of training/support today to build out their capabilities, and are excited about participating in the design and testing process of software products that they will want to purchase in the future.
- Engineers who understand open source software, the PyData stack, and effective communication
These folks will work with early customers to identify bugs and features in the open source software, and then improve that open source software to meet their needs. Dask touches a lot of the PyData software ecosystem today, and a lot of those connections need to be improved as we scale out. Incentives between for-profit companies and the open source software community are well aligned here and I think that we can all do some great work together.
- Engineers who understand Kubernetes, cloud deployment, enterprise authentication, and security
We need to make it smoother for large teams of data scientists to use Dask effectively within an organization. For this we need engineers who deeply understand the pain of enterprise deployment. This work is unlikely to be entirely open source, and is something that we expect to charge for long-term.
We’re looking to build a diverse and remotely distributed team.
For more information, see coiled.io/jobs
What we plan to do
This company will offer support and products around Dask. Today, we can offer support. This will inform product development.
Support and Training for Dask
We want to make institutions successful at scaling Python. Usually the first step along this path isn’t buying a particular product; it’s learning more about where tools like Dask can change the way we think about accessing distributed compute resources.
To this end, we offer training and initial high-level consultations to show people what’s possible, and then we sell long-term support to make sure that Dask continues to work for their needs. This gives them the assurance they need to depend on the software.
Historically Dask has gotten funding from lots of small short-term consulting arrangements, typically charging by the hour. We’re now trying to move past this model, and towards longer-term partnerships. This seems to be possible today (many companies know that they want to invest in Dask long-term) and this gives us the stability we need to reliably hire out a larger team to support the use of the project.
There are many ways to deploy Dask today. If you know what you’re doing and have access to a Kubernetes/Cloud/HPC cluster, you can get to the point where you can click a button in a Jupyter Notebook and seamlessly connect to thousands of remote machines. It’s pretty magical… if you know what you’re doing.
However, there are many potential users of Dask today who are not deeply familiar with Kubernetes/Cloud/HPC. Often they depend on IT administrators to manage their compute resources, and to make it easy for their teams to access these resources in a friendly and secure way that is in line with their institution’s security and governance policies.
Today the best solution for this is likely Dask Gateway, which provides a centralized place for IT to control access between Dask users and institutional hardware.
There is still plenty to do here around authentication, software environments, usage shaping, and more. We are excited to work with early customers to understand this pain, and build products to remove it.
Dask users today seem to appreciate the dashboard, which gives live feedback that is critical to efficient computing.
When you support dozens or hundreds of users, this information has increased value, particularly if you have to pay a large and increasing commercial cloud bill.
We intend to build infrastructure to capture diagnostic information and roll it up for institutional users.
Where we are today
We’re still very early. We’re lining up an initial set of partner clients and our first team members.
Today we can help customers start using Dask, and ensure that they are successful in its use.
Over the next several months we hope to be able to roll out managed deployments and observability products with these early customers, and from there out to the general public.
I’ve chosen to take on some investment to help make this happen with low risk. Investment and OSS is an interesting topic these days. I’ll include more information on how and why I’m doing this in a follow-up post.
If you are interested in getting involved please …