# Sandboxes

> How Lightdash runs LLM-generated code safely in isolated sandboxes, and how to configure the sandbox provider on a self-hosted instance.

<Note>
  🛠 This page is for engineering teams self-hosting their own Lightdash instance. On Lightdash Cloud, sandboxes are fully managed for you — there's nothing to configure.
</Note>

## What sandboxes are for

Some Lightdash features use an AI agent (Claude Code) that **writes and runs code on
your behalf**. Today that's:

* **AI writeback** — the agent edits your dbt project (e.g. adds a metric or dimension),
  runs `lightdash compile` to validate it, and opens a pull request.
* **Data app generation** — the agent generates and builds a small web app from a prompt.

Running model-generated code directly on the Lightdash server would be unsafe: the code
is untrusted, can run arbitrary commands, and needs its own toolchain (git, dbt, the
Lightdash CLI, Node). Lightdash instead runs each agent inside a **sandbox** — an isolated,
disposable environment with a constrained network. The agent does its work there, Lightdash
collects the result (a PR, a built app), and the sandbox is torn down.

Sandboxes are also what make these features **fast** and **multi-turn**: a sandbox can be
suspended between turns and resumed later, so a conversation with the agent keeps its state
without holding a container open the whole time.

## Sandbox providers

The sandbox backend is pluggable. Lightdash talks to a provider-neutral interface, so the
same feature code runs on whichever backend your deployment is configured for. You select
the provider with the `SANDBOX_PROVIDER` environment variable.

| Provider                                  | `SANDBOX_PROVIDER`         | Use for                           | Isolation                  |
| ----------------------------------------- | -------------------------- | --------------------------------- | -------------------------- |
| **E2B** (default)                         | `e2b`                      | Production / managed              | microVM                    |
| **AWS Lambda MicroVMs**                   | `lambda-microvm`           | Production on AWS (self-hosted)   | microVM                    |
| **Azure Container Apps dynamic sessions** | `azure-container-sessions` | Production on Azure (self-hosted) | Hyper-V-isolated container |
| **Local Docker**                          | `docker`                   | Local development only            | container                  |

**E2B**, **AWS Lambda MicroVMs**, and **Azure Container Apps dynamic sessions** are all
supported production backends. E2B is the managed default; the AWS and Azure providers are
for teams who want sandboxes to run inside their own cloud account. More providers
(Kubernetes, ECS) are planned.

<Warning>
  The **local Docker provider is for development only**. It launches plain Docker containers
  via the Docker socket, which is root-equivalent on the host and provides no real isolation
  between the sandbox and your machine. It **refuses to start when `NODE_ENV=production`**.
  Do not use it for a production deployment.
</Warning>
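
As a rough illustration of the selection rules above, a deployment script could validate `SANDBOX_PROVIDER` before the backend starts. This is a minimal sketch, not Lightdash's own startup code: the function name `check_sandbox_provider` is hypothetical, and it only mirrors the documented values, the `e2b` default, and the Docker provider's refusal to run under `NODE_ENV=production`.

```bash
# Hypothetical preflight check; Lightdash performs its own validation at startup.
check_sandbox_provider() {
  local provider="${SANDBOX_PROVIDER:-e2b}"   # unset or empty means the E2B default
  case "$provider" in
    e2b|lambda-microvm|azure-container-sessions)
      echo "ok: $provider"
      ;;
    docker)
      # Mirrors the documented rule: the Docker provider is development-only.
      if [ "${NODE_ENV:-}" = "production" ]; then
        echo "error: docker provider is development-only" >&2
        return 1
      fi
      echo "ok: docker (development only)"
      ;;
    *)
      echo "error: unknown SANDBOX_PROVIDER '$provider'" >&2
      return 1
      ;;
  esac
}
```

Calling `check_sandbox_provider` early in an entrypoint script surfaces a mistyped provider name before any agent run is attempted.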

## E2B (production default)

[E2B](https://e2b.dev) runs each sandbox as a Firecracker microVM in E2B's cloud. It's the
default — if you don't set `SANDBOX_PROVIDER`, Lightdash uses E2B.

To use it you need an E2B account and API key, and the agent needs an Anthropic API key:

```bash
SANDBOX_PROVIDER=e2b            # default, can be omitted
E2B_API_KEY=e2b_...            # from your E2B dashboard
ANTHROPIC_API_KEY=sk-ant-...   # the agent (Claude Code) runs inside the sandbox
```

The sandbox images are E2B *templates*. Lightdash uses separate templates for data apps and
for AI writeback so they can be pinned or rolled back independently. These default to the
published Lightdash templates and rarely need to be set:

```bash
E2B_TEMPLATE_NAME=lightdash/lightdash-data-app
E2B_TEMPLATE_TAG=<lightdash-version>             # defaults to your Lightdash version
E2B_AI_WRITEBACK_TEMPLATE_NAME=lightdash/lightdash-ai-writeback
E2B_AI_WRITEBACK_TEMPLATE_TAG=<lightdash-version>
```

## AWS Lambda MicroVMs (self-hosted production)

AWS Lambda MicroVMs run each sandbox as a Firecracker microVM **inside your own AWS
account**, so untrusted agent code and your repository contents never leave your
infrastructure. The microVMs have **no public IP** — your backend reaches each one
through an AWS-managed endpoint that requires a short-lived per-microVM token — and you
control their outbound network access (see [Networking and IAM](#networking-and-iam)).

This is the **recommended sandbox provider for customers deploying Lightdash on AWS** —
it keeps the sandbox boundary inside your existing AWS account and avoids sending agent
workloads or repository contents to a third-party service.

### Prerequisites

Provision these with your own IaC, in the **same AWS account and region your Lightdash
backend already runs in**:

* **Two MicroVM images** — one for data app generation and one for AI writeback (they
  bundle different toolchains). Build them from the Dockerfiles in the Lightdash repo
  (`sandboxes/data-apps/`, `sandboxes/ai-writeback/`, and the exec agent in
  `sandboxes/microvm-agent/`), push them to ECR, and register each as a Lambda MicroVM
  image on the AWS-managed `al2023` base — `ARM_64`, 4 GB memory, with the agent's
  `/ready` hook on port 8080. Each registration returns an **image ARN** for the config
  below.
* **Control-plane permissions on your backend's existing IAM role** — add `RunMicrovm`,
  `GetMicrovm`, `SuspendMicrovm`, `ResumeMicrovm`, `TerminateMicrovm`, and
  `CreateMicrovmAuthToken`.

<Note>
  Registering an image uses AWS's `create-microvm-image`, which needs a build role (trusting
  `lambda.amazonaws.com`) and an S3 location to stage the build context. These are
  **build-time only** — not the S3 bucket Lightdash already uses for results and snapshots,
  and the backend never touches them at runtime.
</Note>

### Configure the provider

```bash
SANDBOX_PROVIDER=lambda-microvm
ANTHROPIC_API_KEY=sk-ant-...   # the agent (Claude Code) runs inside the microVM

# Region the microVMs run in (defaults to eu-west-1, the EU launch region)
LAMBDA_MICROVM_REGION=eu-west-1

# The image ARNs from registering the two MicroVM images above. Required.
LAMBDA_MICROVM_DATA_APP_IMAGE_ARN=arn:aws:lambda:<region>:<account>:microvm-image/...
LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN=arn:aws:lambda:<region>:<account>:microvm-image/...
```

The backend uses its ambient AWS credentials (instance role / IRSA / standard SDK
credential chain) to call the Lambda MicroVMs control plane, so no access keys are
configured here.
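
Because both image ARNs are required, a quick shape check can catch a pasted value from the wrong region or resource type. The sketch below is an assumption-laden example: `check_microvm_image_arn` is a hypothetical helper, and it assumes the ARN follows the `arn:aws:lambda:<region>:<account>:microvm-image/...` shape shown above and that its region should match `LAMBDA_MICROVM_REGION`.

```bash
# Hypothetical helper: checks an image ARN against the documented shape
# arn:aws:lambda:<region>:<account>:microvm-image/... and the configured region.
check_microvm_image_arn() {
  local name="$1" arn="$2"
  local region="${LAMBDA_MICROVM_REGION:-eu-west-1}"   # documented default region
  if [ -z "$arn" ]; then
    echo "error: $name is required" >&2
    return 1
  fi
  case "$arn" in
    "arn:aws:lambda:$region:"*":microvm-image/"*)
      echo "ok: $name"
      ;;
    *)
      echo "error: $name is not a microvm-image ARN in $region" >&2
      return 1
      ;;
  esac
}
```

It would be called once per variable, for example `check_microvm_image_arn LAMBDA_MICROVM_DATA_APP_IMAGE_ARN "$LAMBDA_MICROVM_DATA_APP_IMAGE_ARN"`.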

### Networking and IAM

<Warning>
  **Configure the network connectors before going to production.** The AWS-managed
  defaults give the microVM open inbound and outbound access, which means untrusted
  agent code can reach the public internet from inside your AWS account. We recommend
  setting the egress connector to a VPC connector that has **no outbound access by
  default**, then allowing only the destinations the agent actually needs (your dbt
  repository host, the Anthropic / Bedrock API, your ECR registry).
</Warning>

Override these to tighten the network boundary or to give the microVM an IAM role:

```bash
# IAM role the microVM assumes (e.g. to pull from a private ECR or reach AWS APIs)
LAMBDA_MICROVM_EXECUTION_ROLE_ARN=arn:aws:iam::...:role/lightdash-sandbox

# Network connectors. Default to AWS-managed open ingress/egress; point these at
# your own VPC connectors to constrain traffic. We strongly recommend an egress
# connector that denies outbound traffic by default.
LAMBDA_MICROVM_INGRESS_CONNECTOR_ARN=arn:aws:lambda:<region>:aws:network-connector:aws-network-connector:ALL_INGRESS
LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN=arn:aws:lambda:<region>:aws:network-connector:aws-network-connector:INTERNET_EGRESS
```
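
One way to enforce the recommendation above is a deploy-time guard that fails when the egress connector is still the AWS-managed default. This is only a sketch: `check_egress_connector` is a hypothetical name, and it treats an unset variable the same as the `INTERNET_EGRESS` connector because that is the documented default.

```bash
# Hypothetical guard: fails when the egress connector is unset (so the AWS-managed
# default applies) or explicitly set to the AWS-managed INTERNET_EGRESS connector.
check_egress_connector() {
  local arn="${LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN:-}"
  case "$arn" in
    ""|*":aws-network-connector:INTERNET_EGRESS")
      echo "warning: microVMs have open internet egress" >&2
      return 1
      ;;
    *)
      echo "ok: custom egress connector"
      ;;
  esac
}
```

The guard only inspects the configured ARN. It cannot tell whether a custom connector actually denies outbound traffic, so the connector's own rules still need review.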

## Azure Container Apps dynamic sessions (self-hosted production)

Azure Container Apps [dynamic sessions](https://learn.microsoft.com/azure/container-apps/sessions)
run each sandbox as a Hyper-V-isolated custom container **inside your own Azure subscription**,
so untrusted agent code and your repository contents never leave your infrastructure. Sessions
are allocated on demand from a warm pool and reached through the pool's management endpoint —
your backend authenticates to it with a short-lived Microsoft Entra token, so no session ever
has a public, unauthenticated ingress.

This is the **recommended sandbox provider for customers deploying Lightdash on Azure** —
it keeps the sandbox boundary inside your existing Azure subscription and avoids sending agent
workloads or repository contents to a third-party service. It pairs naturally with AKS via
[workload identity](https://learn.microsoft.com/azure/aks/workload-identity-overview).

<Note>
  Dynamic sessions have no native memory snapshot, so this provider suspends a sandbox by tarring
  its workspace to **S3-compatible object storage** (the same bucket Lightdash already uses) and
  destroying the session, then restores it on the next turn. Object storage is therefore required —
  see [external object storage](/self-host/customize-deployment/configure-lightdash-to-use-external-object-storage).
</Note>
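
The suspend and restore cycle described in the note can be pictured with plain `tar`. The sketch below is illustrative only and is not the provider's implementation: a local directory stands in for the object storage bucket, and the function names, archive naming, and paths are all hypothetical.

```bash
# Stand-in for the suspend/restore cycle: a local directory plays the role of
# the object storage bucket. Function names and paths are hypothetical.
suspend_workspace() {
  local workspace="$1" store="$2" id="$3"
  mkdir -p "$store"
  # Archive the workspace contents, then remove it (the "destroyed session").
  tar -czf "$store/$id.tar.gz" -C "$workspace" . && rm -rf "$workspace"
}

restore_workspace() {
  local workspace="$1" store="$2" id="$3"
  mkdir -p "$workspace"
  # Unpack the saved archive into a fresh workspace for the next turn.
  tar -xzf "$store/$id.tar.gz" -C "$workspace"
}
```

Because the archive outlives the session, the conversation state survives even though no container is kept running between turns.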

### Prerequisites

Provision these with your own IaC, in the **same Azure subscription your Lightdash backend
runs in**:

* **Two session-pool container images** — one for data app generation and one for AI writeback
  (they bundle different toolchains). Build them from the Dockerfiles in the Lightdash repo
  (`sandboxes/data-apps/`, `sandboxes/ai-writeback/`, and the exec agent in
  `sandboxes/microvm-agent/`) for `linux/amd64`, and push them to a registry the pool can pull
  (e.g. Azure Container Registry).
* **A workload-profile Container Apps environment**, and **two custom-container dynamic
  session pools** in it (one per image), each with its **ingress target port set to `8080`**
  (the exec agent's port). Custom-container sessions require a workload-profile environment —
  a Consumption-only environment is rejected. Each pool exposes a **pool management endpoint**
  for the config below.
* **A managed identity** assigned to each pool, granted the built-in **Azure ContainerApps
  Session Executor** role on the pool. Your backend authenticates as this identity (via workload
  identity on AKS, or any credential `DefaultAzureCredential` resolves) to allocate and drive
  sessions — so no client secret is configured.

### Configure the provider

```bash
SANDBOX_PROVIDER=azure-container-sessions
ANTHROPIC_API_KEY=sk-ant-...   # the agent (Claude Code) runs inside the session

# The pool management endpoints for the two session pools above. Required.
AZURE_CONTAINER_SESSIONS_DATA_APP_POOL_ENDPOINT=https://<pool>.<env>.<region>.azurecontainerapps.io
AZURE_CONTAINER_SESSIONS_AI_WRITEBACK_POOL_ENDPOINT=https://<pool>.<env>.<region>.azurecontainerapps.io
```

The backend authenticates to the dynamic-sessions data plane with `DefaultAzureCredential`, so
no keys or secrets are set here — grant the backend's identity the **Azure ContainerApps Session
Executor** role on each pool instead. In AKS, this is a user-assigned managed identity federated
to the backend's service account via workload identity.

<Note>
  Both `AZURE_CONTAINER_SESSIONS_API_VERSION` (dynamic-sessions data-plane API version) and
  `AZURE_CONTAINER_SESSIONS_TOKEN_SCOPE` (the Entra token scope) have sensible defaults and
  rarely need to be set.
</Note>

### Networking

Constrain a session's outbound access on the **pool's own configuration** (its egress settings /
the environment's network), not in Lightdash. As with any sandbox provider, we recommend denying
outbound traffic by default and only allowing the destinations the agent needs (your dbt
repository host, the Anthropic / Azure OpenAI API, your container registry).

## Local Docker provider (development)

For local development you can run sandboxes as plain Docker containers on your own machine
— no E2B account required. This is the recommended way to work on or try the AI features
locally.

The images are built from the same sources as the E2B templates, but as plain local Docker
images. Two **separate** images are used (different toolchains), mirroring the two E2B templates:

| Image (default tag)            | Built from                | Used by             |
| ------------------------------ | ------------------------- | ------------------- |
| `lightdash-sandbox:local`      | `sandboxes/data-apps/`    | Data app generation |
| `lightdash-ai-writeback:local` | `sandboxes/ai-writeback/` | AI writeback        |

### Prerequisites

* Docker running locally, with the daemon reachable from the Lightdash backend.
* S3-compatible object storage configured (locally this is MinIO). Suspended-sandbox
  snapshots are tarred to object storage so a conversation survives the container being
  destroyed — see [external object storage](/self-host/customize-deployment/configure-lightdash-to-use-external-object-storage).
* An Anthropic API key (`ANTHROPIC_API_KEY`) for the agent.

### Setup

1. Build the local sandbox images (each builds from `sandboxes/<feature>/`):

   ```bash
   ./sandboxes/data-apps/build-local-image.sh        # -> lightdash-sandbox:local
   ./sandboxes/ai-writeback/build-local-image.sh     # -> lightdash-ai-writeback:local
   ```

   These are large (the writeback image bundles dbt, the Lightdash CLI and Claude Code) and
   only need rebuilding when the sandbox toolchain changes.

2. Point Lightdash at the Docker provider:

   ```bash
   SANDBOX_PROVIDER=docker
   # optional — these are the defaults:
   SANDBOX_DOCKER_IMAGE=lightdash-sandbox:local
   SANDBOX_AI_WRITEBACK_DOCKER_IMAGE=lightdash-ai-writeback:local
   ```

3. Restart the **backend and the scheduler** so both pick up the new environment. Data app
   generation runs in the scheduler worker, so a stale `SANDBOX_PROVIDER` there keeps it
   on E2B. (With PM2, a plain `restart` reuses the cached env. To reload the env file, delete
   and start the processes again, or restart with `--update-env`.)
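
Before restarting, it can help to confirm that both images from step 1 exist locally. The following is a minimal sketch, assuming the `docker` CLI is on `PATH`. `check_local_sandbox_images` is a hypothetical helper that falls back to the documented default tags when the image variables are unset.

```bash
# Hypothetical check that both local sandbox images exist before the backend
# and scheduler are restarted. Falls back to the documented default tags.
check_local_sandbox_images() {
  local missing=0 image
  for image in \
    "${SANDBOX_DOCKER_IMAGE:-lightdash-sandbox:local}" \
    "${SANDBOX_AI_WRITEBACK_DOCKER_IMAGE:-lightdash-ai-writeback:local}"
  do
    # `docker image inspect` exits non-zero when the image is not present locally.
    if docker image inspect "$image" >/dev/null 2>&1; then
      echo "found: $image"
    else
      echo "missing: $image" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A non-zero exit status means at least one image still needs to be built with its `build-local-image.sh` script.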

## Snapshot lifecycle

Every turn suspends its own sandbox, so in steady state nothing sits idle. For the
**Lambda MicroVMs** provider, two timers configure the AWS-side idle policy as a
backstop — when to auto-suspend a sandbox left running and when to auto-terminate a
suspended one:

```bash
SANDBOX_IDLE_TIMEOUT_MS=1800000        # auto-suspend a running-but-idle microVM (default 30 min)
SANDBOX_SNAPSHOT_RETENTION_MS=604800000 # auto-terminate a suspended microVM (default 7 days); also how long a thread stays resumable
```

Only the Lambda MicroVMs provider reads these. **E2B** manages idle sandboxes itself.
**Azure Container Apps dynamic sessions** are reclaimed by the session pool's own cooldown
lifecycle (configured on the pool); nothing is lost, because the snapshot lives in object
storage. The **Docker** dev provider has no idle handling.
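
Both timers are expressed in milliseconds, which is easy to get wrong by a factor of 1000. As a small worked example, the helpers below (hypothetical names, plain shell arithmetic) reproduce the documented defaults and can be reused to derive other values.

```bash
# Hypothetical helpers for deriving the millisecond values used by the two timers.
minutes_to_ms() { echo $(( $1 * 60 * 1000 )); }
days_to_ms()    { echo $(( $1 * 24 * 60 * 60 * 1000 )); }

minutes_to_ms 30   # prints 1800000, the documented SANDBOX_IDLE_TIMEOUT_MS default
days_to_ms 7       # prints 604800000, the documented SANDBOX_SNAPSHOT_RETENTION_MS default
```

For instance, `SANDBOX_IDLE_TIMEOUT_MS=$(minutes_to_ms 10)` would set a ten-minute idle timeout.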

## Environment variable reference

| Variable                                              | Default                               | Description                                                                                                                                                     |
| ----------------------------------------------------- | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `SANDBOX_PROVIDER`                                    | `e2b`                                 | Sandbox backend: `e2b`, `lambda-microvm`, `azure-container-sessions`, or `docker`.                                                                              |
| `ANTHROPIC_API_KEY`                                   | —                                     | API key for the Claude Code agent running inside the sandbox.                                                                                                   |
| `E2B_API_KEY`                                         | —                                     | E2B API key (required when `SANDBOX_PROVIDER=e2b`).                                                                                                             |
| `E2B_TEMPLATE_NAME`                                   | `lightdash/lightdash-data-app`        | E2B template for data app sandboxes.                                                                                                                            |
| `E2B_TEMPLATE_TAG`                                    | Lightdash version                     | Tag of the data app template to launch.                                                                                                                         |
| `E2B_AI_WRITEBACK_TEMPLATE_NAME`                      | `lightdash/lightdash-ai-writeback`    | E2B template for writeback sandboxes.                                                                                                                           |
| `E2B_AI_WRITEBACK_TEMPLATE_TAG`                       | Lightdash version                     | Tag of the writeback template to launch.                                                                                                                        |
| `LAMBDA_MICROVM_REGION`                               | `eu-west-1`                           | AWS region the microVMs run in (`lambda-microvm`).                                                                                                              |
| `LAMBDA_MICROVM_DATA_APP_IMAGE_ARN`                   | —                                     | Image ARN for data app microVMs (required when `SANDBOX_PROVIDER=lambda-microvm`).                                                                              |
| `LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN`               | —                                     | Image ARN for writeback microVMs (required when `SANDBOX_PROVIDER=lambda-microvm`).                                                                             |
| `LAMBDA_MICROVM_EXECUTION_ROLE_ARN`                   | —                                     | Optional IAM role the microVM assumes.                                                                                                                          |
| `LAMBDA_MICROVM_INGRESS_CONNECTOR_ARN`                | AWS-managed `ALL_INGRESS`             | Optional ingress network connector.                                                                                                                             |
| `LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN`                 | AWS-managed `INTERNET_EGRESS`         | Optional egress network connector.                                                                                                                              |
| `AZURE_CONTAINER_SESSIONS_DATA_APP_POOL_ENDPOINT`     | —                                     | Pool management endpoint for data app sessions (required when `SANDBOX_PROVIDER=azure-container-sessions`).                                                     |
| `AZURE_CONTAINER_SESSIONS_AI_WRITEBACK_POOL_ENDPOINT` | —                                     | Pool management endpoint for writeback sessions (required when `SANDBOX_PROVIDER=azure-container-sessions`).                                                    |
| `AZURE_CONTAINER_SESSIONS_API_VERSION`                | `2025-02-02-preview`                  | Dynamic-sessions data-plane API version.                                                                                                                        |
| `AZURE_CONTAINER_SESSIONS_TOKEN_SCOPE`                | `https://dynamicsessions.io/.default` | Microsoft Entra token scope for the dynamic-sessions data plane.                                                                                                |
| `SANDBOX_DOCKER_IMAGE`                                | `lightdash-sandbox:local`             | Local image for data app sandboxes (`docker` provider).                                                                                                         |
| `SANDBOX_AI_WRITEBACK_DOCKER_IMAGE`                   | `lightdash-ai-writeback:local`        | Local image for writeback sandboxes (`docker` provider).                                                                                                        |
| `SANDBOX_IDLE_TIMEOUT_MS`                             | `1800000` (30 min)                    | Lambda MicroVMs idle policy: auto-suspend a running-but-idle microVM. Ignored by `e2b`/`azure-container-sessions`/`docker`.                                     |
| `SANDBOX_SNAPSHOT_RETENTION_MS`                       | `604800000` (7 days)                  | Lambda MicroVMs idle policy: auto-terminate a suspended microVM (also how long a thread stays resumable). Ignored by `e2b`/`azure-container-sessions`/`docker`. |
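
To tie the reference table together, the sketch below lists the variables marked as required for the selected provider and reports any that are unset. It is a hedged illustration rather than Lightdash's own validation: the function names are hypothetical, `ANTHROPIC_API_KEY` is treated as required for every provider based on the provider sections above, and object storage settings are out of scope.

```bash
# Hypothetical preflight: names the variables this page marks as required for
# the selected provider. Object storage settings are not covered here.
required_sandbox_vars() {
  case "${SANDBOX_PROVIDER:-e2b}" in
    e2b) echo "ANTHROPIC_API_KEY E2B_API_KEY" ;;
    lambda-microvm) echo "ANTHROPIC_API_KEY LAMBDA_MICROVM_DATA_APP_IMAGE_ARN LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN" ;;
    azure-container-sessions) echo "ANTHROPIC_API_KEY AZURE_CONTAINER_SESSIONS_DATA_APP_POOL_ENDPOINT AZURE_CONTAINER_SESSIONS_AI_WRITEBACK_POOL_ENDPOINT" ;;
    docker) echo "ANTHROPIC_API_KEY" ;;
  esac
}

# Prints a space-separated list of required variables that are unset or empty.
missing_sandbox_vars() {
  local name value missing=""
  for name in $(required_sandbox_vars); do
    eval "value=\${$name:-}"        # indirect lookup that also works in POSIX sh
    [ -n "$value" ] || missing="$missing $name"
  done
  echo "${missing# }"               # empty output means nothing is missing
}
```

An entrypoint script could abort when `missing_sandbox_vars` prints anything, which turns a silent misconfiguration into an early, readable failure.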
