Coiled Security

Architectures to mitigate risk in cloud computing
Coiled sets up Dask for you in your own cloud

Before users adopt Coiled, IT and security teams need to review it to ensure that it meets corporate standards and doesn't introduce undue operational risk.


This document supports that review by providing the following:

  1. A high-level overview of our approach to security and data privacy
  2. Low-level details about what permissions Coiled needs and what metadata it tracks

But first, three frequently asked questions:

FAQ

  • Q: Does Coiled run jobs in Coiled’s cloud account, or mine?
    A: Yours. Coiled jobs run in your cloud account on your behalf.

  • Q: Can Coiled see my data?
    A: No. Not unless you explicitly give it that access, which is rare.

  • Q: So Coiled is 100% safe?
    A: No. Nothing is completely safe. Please read on.

Security Overview

Coiled operates a centralized control plane in our cloud account. You give this control plane enough access to do things like create and destroy VM instances in your cloud account, but you don't give it permission to access data.

Coiled avoids access to your data and your systems. We serve as a broker between you and your cloud, setting up infrastructure so that you can have a productive and secure experience without relying on Coiled as an intermediary.

Deploy resources within your account

When a user asks for a cluster …

  1. The user's machine asks Coiled to deploy the correct resources
  2. Coiled asks your cloud provider to set up resources within your cloud account
  3. Coiled establishes a secure connection between the user and the cloud resources
  4. Coiled exits the conversation so it can't observe any data access
  5. The user communicates directly with cloud machines so all sensitive access happens entirely within your infrastructure
  6. Coiled tracks health of those resources and ensures that everything cleans up gracefully

Coiled is not present during the conversation when you and your cloud resources access your data. Your data never leaves your internal cloud network.

Data Access Credentials

Coiled’s client software, which runs on your machines, ships with functionality to forward your local credentials to your cloud resources after Coiled’s control plane, which runs on our machines, exits the conversation.

In this way users have full access to their data on cloud resources without sending credentials through Coiled’s network.

Metadata

Coiled does track substantial information about the health of your cluster and computation. By default, Coiled collects as much metadata as it can. This helps Coiled staff debug and optimize user workflows, which provides substantial cost savings. We understand that metadata collection is not always acceptable, so Coiled’s metadata collection is highly configurable. More details below.

Professionalism and Physical Security

Coiled is maintained by professional cloud infrastructure engineers and follows best practices. Network communications are secured end-to-end. Sensitive data is encrypted at rest. Access is limited to a few individuals using secure passwords under frequent rotation, multi-factor authentication, and so on.

Coiled Computing, Inc. is SOC 2 Type II and ISO-27001 compliant. To see evidence of our security certifications, reach out to hello@coiled.io.

Security Details

OK, but exactly what permissions do you need and exactly what metadata do you collect?

Let’s discuss precise IAM roles and metadata. We’ll use AWS terms for things in this document. Please see AWS Setup and GCP Setup for more details.

IAM Roles

Ongoing

For ongoing operation, Coiled needs the ability to manage EC2 instances, CloudWatch log groups, and Docker images. These permissions are spelled out in the role below:

> IAM Roles (ongoing)
{
    "Statement": [
        {
            "Sid": "Ongoing",
            "Effect": "Allow",
            "Resource": "*",
            "Action": [
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CreateFleet",
                "ec2:CreateLaunchTemplate",
                "ec2:CreateLaunchTemplateVersion",
                "ec2:CreateRoute",
                "ec2:CreateSecurityGroup",
                "ec2:CreateTags",
                "ec2:DeleteLaunchTemplate",
                "ec2:DeleteLaunchTemplateVersions",
                "ec2:DeleteSecurityGroup",
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeImages",
                "ec2:DescribeInstances",
                "ec2:DescribeInstanceTypeOfferings",
                "ec2:DescribeInstanceTypes",
                "ec2:DescribeInternetGateways",
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeRegions",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVpcs",
                "ec2:TerminateInstances",
                "ecr:BatchCheckLayerAvailability",
                "ecr:BatchGetImage",
                "ecr:GetAuthorizationToken",
                "ecr:GetDownloadUrlForLayer",
                "iam:GetInstanceProfile",
                "iam:GetRole",
                "iam:ListPolicies",
                "iam:PassRole",
                "iam:TagRole",
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:DescribeLogGroups",
                "logs:DescribeLogStreams",
                "logs:GetLogEvents",
                "logs:FilterLogEvents",
                "logs:PutLogEvents",
                "logs:PutRetentionPolicy",
                "logs:TagLogGroup",
                "logs:TagResource",
                "sts:GetCallerIdentity"
            ]
        }
    ],
    "Version": "2012-10-17"
}

Coiled does not need any data access credentials.
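
Reviewers who want to check that claim against the policy itself can script the check. The following is a minimal sketch in Python using only the standard library; it is not part of Coiled. The function names and the DATA_SERVICE_PREFIXES list are assumptions made for illustration, and the policy is an abbreviated excerpt of the role above, so substitute the full document for a real review.

```python
import json

# Assumption: service prefixes this reviewer treats as "data access".
# Adjust the list to match your organization's own definition.
DATA_SERVICE_PREFIXES = ("s3", "dynamodb", "rds", "redshift", "secretsmanager")

# Abbreviated excerpt of the ongoing policy shown above.
ONGOING_EXCERPT = json.loads("""
{
    "Statement": [
        {
            "Sid": "Ongoing",
            "Effect": "Allow",
            "Resource": "*",
            "Action": [
                "ec2:CreateFleet",
                "ec2:TerminateInstances",
                "ecr:BatchGetImage",
                "iam:PassRole",
                "logs:PutLogEvents",
                "sts:GetCallerIdentity"
            ]
        }
    ],
    "Version": "2012-10-17"
}
""")


def allowed_actions(policy):
    """Return every action granted by an Allow statement in the policy."""
    actions = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        granted = statement.get("Action", [])
        # IAM accepts either a single string or a list of strings here.
        actions.extend([granted] if isinstance(granted, str) else granted)
    return actions


def services_used(policy):
    """Return the sorted service prefixes (the part before the colon)."""
    return sorted({a.partition(":")[0].lower() for a in allowed_actions(policy)})


def data_access_actions(policy):
    """Return actions on a listed data service, plus any bare "*" wildcard."""
    flagged = []
    for action in allowed_actions(policy):
        service = action.partition(":")[0].lower()
        if service == "*" or service in DATA_SERVICE_PREFIXES:
            flagged.append(action)
    return sorted(flagged)


print(services_used(ONGOING_EXCERPT))
print(data_access_actions(ONGOING_EXCERPT))
```

For the excerpt, the first call lists only the ec2, ecr, iam, logs, and sts services, and the second returns an empty list. The check looks only at action names, so treat it as a quick screen rather than a full audit.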

Setup

Additionally, during setup, Coiled needs to define roles and create VPCs and other networking configuration. By default, we ask for these permissions and handle all of this work automatically. If these permissions are too sensitive to grant, you can do this work yourself and then hand Coiled the role above. The permissions below are needed only once, at setup time.

> IAM Roles (setup)
{
    "Statement": [
        {
            "Sid": "Setup",
            "Effect": "Allow",
            "Resource": "*",
            "Action": [
                "ec2:AssociateRouteTable",
                "ec2:AttachInternetGateway",
                "ec2:CreateInternetGateway",
                "ec2:CreateRoute",
                "ec2:CreateRouteTable",
                "ec2:CreateSubnet",
                "ec2:CreateVpc",
                "ec2:DeleteInternetGateway",
                "ec2:DeleteRoute",
                "ec2:DeleteRouteTable",
                "ec2:DeleteSubnet",
                "ec2:DeleteVpc",
                "ec2:DescribeInternetGateways",
                "ec2:DetachInternetGateway",
                "ec2:DisassociateRouteTable",
                "ec2:ModifySubnetAttribute",
                "ec2:ModifyVpcAttribute",
                "iam:AddRoleToInstanceProfile",
                "iam:AttachRolePolicy",
                "iam:CreateRole",
                "iam:CreatePolicy",
                "iam:CreateServiceLinkedRole",
                "iam:CreateInstanceProfile",
                "iam:DeleteRole",
                "iam:ListPolicies",
                "iam:ListInstanceProfiles",
                "iam:ListAttachedRolePolicies",
                "iam:TagInstanceProfile",
                "iam:TagPolicy",
                "iam:TagRole"
            ]
        }
    ],
    "Version": "2012-10-17"
}

See https://docs.coiled.io/user_guide/aws_configure.html for more information.
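
A few actions appear in both the setup and ongoing policies, so a reviewer planning to revoke the setup grant afterwards may want to see which actions are setup-only. The sketch below computes that with plain set operations in Python. It is an illustration rather than a Coiled tool: the helper names are hypothetical, and both action sets are abbreviated excerpts of the policies above.

```python
# Abbreviated excerpts of the two policies shown above.
ONGOING_ACTIONS = {
    "ec2:CreateRoute",
    "ec2:DescribeInternetGateways",
    "ec2:TerminateInstances",
    "iam:ListPolicies",
    "iam:TagRole",
}
SETUP_ACTIONS = {
    "ec2:CreateRoute",
    "ec2:CreateVpc",
    "ec2:DeleteVpc",
    "ec2:DescribeInternetGateways",
    "iam:CreateRole",
    "iam:ListPolicies",
    "iam:TagRole",
}


def setup_only(setup, ongoing):
    """Actions needed during setup that the ongoing role does not include."""
    return sorted(set(setup) - set(ongoing))


def shared(setup, ongoing):
    """Actions that appear in both policies and stay granted after setup."""
    return sorted(set(setup) & set(ongoing))


print(setup_only(SETUP_ACTIONS, ONGOING_ACTIONS))
print(shared(SETUP_ACTIONS, ONGOING_ACTIONS))
```

With these excerpts, the setup-only actions are ec2:CreateVpc, ec2:DeleteVpc, and iam:CreateRole, while ec2:CreateRoute, ec2:DescribeInternetGateways, iam:ListPolicies, and iam:TagRole remain in the ongoing role.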

Optionality

Some of these permissions are optional if you are comfortable turning off certain features. For example:

  • ECR permissions are only needed if you want Coiled to manage Docker images in ECR for you
  • CloudWatch permissions are only necessary if you want Coiled to track logs
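
As a rough illustration of trimming the policy, the sketch below removes every action belonging to a chosen set of services from a policy document. It is plain Python, not a Coiled feature; the function name without_services is hypothetical, and the policy is an abbreviated excerpt of the ongoing role above.

```python
import copy

# Abbreviated excerpt of the ongoing policy shown above.
ONGOING_EXCERPT = {
    "Statement": [
        {
            "Sid": "Ongoing",
            "Effect": "Allow",
            "Resource": "*",
            "Action": [
                "ec2:CreateFleet",
                "ec2:TerminateInstances",
                "ecr:BatchGetImage",
                "iam:PassRole",
                "logs:PutLogEvents",
                "sts:GetCallerIdentity",
            ],
        }
    ],
    "Version": "2012-10-17",
}


def without_services(policy, services):
    """Return a copy of the policy with actions from the given services removed."""
    trimmed = copy.deepcopy(policy)  # leave the caller's policy untouched
    prefixes = tuple(f"{service.lower()}:" for service in services)
    for statement in trimmed["Statement"]:
        statement["Action"] = [
            action
            for action in statement["Action"]
            if not action.lower().startswith(prefixes)
        ]
    return trimmed


# Drop the ECR and CloudWatch Logs permissions.
minimal = without_services(ONGOING_EXCERPT, ["ecr", "logs"])
print(minimal["Statement"][0]["Action"])
```

Removing these permissions assumes you have also turned off the corresponding features (Coiled-managed ECR images and log tracking), as described above.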

Metadata

Coiled collects metadata about your resources and computations. Some of it is critical; some is optional. We describe that metadata below:

Operational Metadata

We need this information to track and control distributed cloud resources. This operational metadata includes the following:

  • Instance health (provisioned, running, closed)

We get this information using the granted IAM roles. This collection cannot be turned off because it is required for operation.

Performance Tracking

Performance metadata helps us debug and optimize workflows with users. Some entries are optional (✅). Others can be moved inside your cloud boundary (➡️) with mild effort on your part.

Metadata                             Optional?
Hardware metrics (CPU, ...)          ✅ ➡️
Dask metrics (parquet, joins, ...)   ✅ ➡️
Logs (system, Dask, user code)       ✅ ➡️
Code Snippets
User Exceptions
Package Versions
Code Profiling

This data is collected both through Prometheus metrics and through periodic check-ins with the Coiled control plane over secured web traffic.

Configuration

By default, Coiled sets everything up for you and tracks metadata within its own database. Optionally, you can deploy Coiled in more customized cloud environments and attach metadata storage to your own databases (➡️ above). Common configuration choices include the following:

  • Bring your own network: Bring your VPC and subnets, supporting custom network routing and a security perimeter for ingress/egress. Limit access to your VPN or corporate network.
  • Prometheus: Store metrics in your database for debugging and performance tracking
  • Docker: Store user-created images in your private registry (Docker Hub, Artifactory, …)
  • Logs: These are stored in your account by default. You can safely remove our access to them through IAM permissions.

Final Thoughts

Like any technology, Coiled introduces operational risk. The approach above is designed to minimize risk while providing an easy, rich, and productive experience for users.  

Our experience is that users will take whatever path is easiest, even if that path is insecure and unsanctioned. Coiled crafts an easy and attractive path for all users that is also highly secure and configurable.

To get started:

  1. Sign up with GitHub, Google, or email.
  2. Use your AWS or GCP account.
  3. Start scaling.

$ pip install coiled
$ coiled setup
$ ipython
>>> import coiled
>>> cluster = coiled.Cluster(n_workers=500)