🛠 This page is for engineering teams self-hosting their own Lightdash instance. On Lightdash Cloud, sandboxes are fully managed for you — there’s nothing to configure.
What sandboxes are for
Some Lightdash features use an AI agent (Claude Code) that writes and runs code on your behalf. Today that’s:
- AI writeback: the agent edits your dbt project (e.g. adds a metric or dimension), runs lightdash compile to validate it, and opens a pull request.
- Data app generation: the agent generates and builds a small web app from a prompt.
Sandbox providers
The sandbox backend is pluggable. Lightdash talks to a provider-neutral interface, so the same feature code runs on whichever backend your deployment is configured for. You select the provider with the SANDBOX_PROVIDER environment variable.
| Provider | SANDBOX_PROVIDER | Use for | Isolation |
|---|---|---|---|
| E2B (default) | e2b | Production / managed | microVM |
| AWS Lambda MicroVMs | lambda-microvm | Production on AWS (self-hosted) | microVM |
| Azure Container Apps dynamic sessions | azure-container-sessions | Production on Azure (self-hosted) | Hyper-V-isolated container |
| Local Docker | docker | Local development only | container |
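The provider selection above, combined with the "required when" notes in the environment variable reference, can be sketched as a small preflight check. This is a hypothetical helper, not something Lightdash ships: the function name missing_sandbox_vars is invented for illustration, and it only covers variables the reference marks as required for a given provider.

```shell
# Hypothetical preflight helper; not part of Lightdash.
# Prints the name of every required variable that is unset or empty
# for the selected provider, one per line.
missing_sandbox_vars() {
  # An unset SANDBOX_PROVIDER falls back to the documented default, e2b.
  provider="${SANDBOX_PROVIDER:-e2b}"
  case "$provider" in
    e2b)
      required="E2B_API_KEY" ;;
    lambda-microvm)
      required="LAMBDA_MICROVM_DATA_APP_IMAGE_ARN LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN" ;;
    azure-container-sessions)
      required="AZURE_CONTAINER_SESSIONS_DATA_APP_POOL_ENDPOINT AZURE_CONTAINER_SESSIONS_AI_WRITEBACK_POOL_ENDPOINT" ;;
    docker)
      required="" ;;
    *)
      echo "unknown SANDBOX_PROVIDER: $provider" >&2
      return 2 ;;
  esac
  for name in $required; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}
```

Running missing_sandbox_vars in the shell that launches the backend prints nothing when the selected provider has everything it needs.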
E2B (production default)
E2B runs each sandbox as a Firecracker microVM in E2B’s cloud. It’s the default: if you don’t set SANDBOX_PROVIDER, Lightdash uses E2B.
To use it you need an E2B account and API key (E2B_API_KEY), and the agent needs an Anthropic API key (ANTHROPIC_API_KEY).
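A minimal sketch of the E2B configuration, written as env-file style assignments (which are also valid shell). The variable names come from the environment variable reference; the key values are placeholders to replace with your own.

```shell
# Optional: e2b is already the default when SANDBOX_PROVIDER is unset.
SANDBOX_PROVIDER=e2b
# Placeholder values; substitute your real keys.
E2B_API_KEY="your-e2b-api-key"
ANTHROPIC_API_KEY="your-anthropic-api-key"
```

The E2B template name and tag variables have defaults and are left out here.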
AWS Lambda MicroVMs (self-hosted production)
AWS Lambda MicroVMs run each sandbox as a Firecracker microVM inside your own AWS account, so untrusted agent code and your repository contents never leave your infrastructure. The microVMs have no public IP: your backend reaches each one through an AWS-managed endpoint that requires a short-lived per-microVM token, and you control their outbound network access (see Networking and IAM).
This is the recommended sandbox provider for customers deploying Lightdash on AWS: it keeps the sandbox boundary inside your existing AWS account and avoids sending agent workloads or repository contents to a third-party service.
Prerequisites
Provision these with your own IaC, in the same AWS account and region your Lightdash backend already runs in:
- Two MicroVM images, one for data app generation and one for AI writeback (they bundle different toolchains). Build them from the Dockerfiles in the Lightdash repo (sandboxes/data-apps/, sandboxes/ai-writeback/, and the exec agent in sandboxes/microvm-agent/), push them to ECR, and register each as a Lambda MicroVM image on the AWS-managed al2023 base: ARM_64, 4 GB memory, with the agent’s /ready hook on port 8080. Each registration returns an image ARN, which you pass to Lightdash as LAMBDA_MICROVM_DATA_APP_IMAGE_ARN or LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN.
- Control-plane permissions on your backend’s existing IAM role: add RunMicrovm, GetMicrovm, SuspendMicrovm, ResumeMicrovm, TerminateMicrovm, and CreateMicrovmAuthToken.
Registering an image uses AWS’s create-microvm-image, which needs a build role (trusting lambda.amazonaws.com) and an S3 location to stage the build context. These are build-time only: they are not the S3 bucket Lightdash already uses for results and snapshots, and the backend never touches them at runtime.
Configure the provider
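A minimal sketch of the Lambda MicroVMs configuration, as env-file style assignments. Variable names and the region default come from the environment variable reference; the ARN values are placeholders, since the real ones are returned when you register each image.

```shell
SANDBOX_PROVIDER=lambda-microvm
# Region the microVMs run in; eu-west-1 is the documented default.
LAMBDA_MICROVM_REGION=eu-west-1
# Placeholders: paste the image ARNs returned by each image registration.
LAMBDA_MICROVM_DATA_APP_IMAGE_ARN="your-data-app-image-arn"
LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN="your-ai-writeback-image-arn"
```

The agent’s ANTHROPIC_API_KEY, listed in the environment variable reference, is set separately.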
Networking and IAM
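By default the microVMs use the AWS-managed ALL_INGRESS and INTERNET_EGRESS network connectors. Set LAMBDA_MICROVM_INGRESS_CONNECTOR_ARN or LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN to tighten the network boundary, or LAMBDA_MICROVM_EXECUTION_ROLE_ARN to give the microVM an IAM role.
Azure Container Apps dynamic sessions (self-hosted production)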
Azure Container Apps dynamic sessions run each sandbox as a Hyper-V-isolated custom container inside your own Azure subscription, so untrusted agent code and your repository contents never leave your infrastructure. Sessions are allocated on demand from a warm pool and reached through the pool’s management endpoint. Your backend authenticates to it with a short-lived Microsoft Entra token, so no session ever has a public, unauthenticated ingress.
This is the recommended sandbox provider for customers deploying Lightdash on Azure: it keeps the sandbox boundary inside your existing Azure subscription and avoids sending agent workloads or repository contents to a third-party service. It pairs naturally with AKS via workload identity.
Dynamic sessions have no native memory snapshot, so this provider suspends a sandbox by tarring its workspace to S3-compatible object storage (the same bucket Lightdash already uses) and destroying the session, then restores it on the next turn. Object storage is therefore required (see external object storage).
Prerequisites
Provision these with your own IaC, in the same Azure subscription your Lightdash backend runs in:
- Two session-pool container images, one for data app generation and one for AI writeback (they bundle different toolchains). Build them from the Dockerfiles in the Lightdash repo (sandboxes/data-apps/, sandboxes/ai-writeback/, and the exec agent in sandboxes/microvm-agent/) for linux/amd64, and push them to a registry the pool can pull from (e.g. Azure Container Registry).
- A workload-profile Container Apps environment, and two custom-container dynamic session pools in it (one per image), each with its ingress target port set to 8080 (the exec agent’s port). Custom-container sessions require a workload-profile environment; a Consumption-only environment is rejected. Each pool exposes a pool management endpoint, which you pass to Lightdash as AZURE_CONTAINER_SESSIONS_DATA_APP_POOL_ENDPOINT or AZURE_CONTAINER_SESSIONS_AI_WRITEBACK_POOL_ENDPOINT.
- A managed identity assigned to each pool, granted the built-in Azure ContainerApps Session Executor role on the pool. Your backend authenticates as this identity (via workload identity on AKS, or any credential DefaultAzureCredential resolves) to allocate and drive sessions, so no client secret is configured.
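The role grant in the last prerequisite can be sketched with the Azure CLI. The helper name grant_session_executor and its two arguments are illustrative, not part of Lightdash or Azure. It assumes you are already logged in with az and have permission to create role assignments on the pool.

```shell
# Hypothetical helper; the function and argument names are illustrative.
#   $1: object (principal) ID of the identity the backend runs as
#   $2: full Azure resource ID of one session pool
grant_session_executor() {
  az role assignment create \
    --assignee "$1" \
    --role "Azure ContainerApps Session Executor" \
    --scope "$2"
}
```

Call it once per pool, so the identity holds the role on both the data app pool and the AI writeback pool.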
Configure the provider
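A minimal sketch of the Azure provider configuration, as env-file style assignments. Variable names come from the environment variable reference; the endpoint values are placeholders, so copy each pool’s real management endpoint from the pool’s properties.

```shell
SANDBOX_PROVIDER=azure-container-sessions
# Placeholders: one pool management endpoint per session pool.
AZURE_CONTAINER_SESSIONS_DATA_APP_POOL_ENDPOINT="https://your-data-app-pool-management-endpoint"
AZURE_CONTAINER_SESSIONS_AI_WRITEBACK_POOL_ENDPOINT="https://your-ai-writeback-pool-management-endpoint"
```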
The backend authenticates with DefaultAzureCredential, so no keys or secrets are set in the provider configuration. Instead, grant the backend’s identity the Azure ContainerApps Session Executor role on each pool. In AKS, this is a user-assigned managed identity federated to the backend’s service account via workload identity.
Both AZURE_CONTAINER_SESSIONS_API_VERSION (dynamic-sessions data-plane API version) and AZURE_CONTAINER_SESSIONS_TOKEN_SCOPE (the Entra token scope) have sensible defaults and rarely need to be set.
Networking
Constrain a session’s outbound access in the pool’s own configuration (its egress settings or the environment’s network), not in Lightdash. As with any sandbox provider, we recommend denying outbound traffic by default and only allowing the destinations the agent needs (your dbt repository host, the Anthropic / Azure OpenAI API, your container registry).
Local Docker provider (development)
For local development you can run sandboxes as plain Docker containers on your own machine, with no E2B account required. This is the recommended way to work on or try the AI features locally. It uses the same images E2B builds, but as plain local Docker images. Two separate images are used (different toolchains), mirroring the two E2B templates:
| Image (default tag) | Built from | Used by |
|---|---|---|
| lightdash-sandbox:local | sandboxes/data-apps/ | Data app generation |
| lightdash-ai-writeback:local | sandboxes/ai-writeback/ | AI writeback |
Prerequisites
- Docker running locally, with the daemon reachable from the Lightdash backend.
- S3-compatible object storage configured (locally this is MinIO). Suspended-sandbox snapshots are tarred to object storage so a conversation survives the container being destroyed (see external object storage).
- An Anthropic API key (ANTHROPIC_API_KEY) for the agent.
Setup
- Build the local sandbox images (each builds from sandboxes/<feature>/). These are large (the writeback image bundles dbt, the Lightdash CLI and Claude Code) and only need rebuilding when the sandbox toolchain changes.
- Point Lightdash at the Docker provider by setting SANDBOX_PROVIDER=docker.
- Restart the backend and the scheduler so both pick up the new environment. Data app generation runs in the scheduler worker, so a stale SANDBOX_PROVIDER there will keep it on E2B. (With PM2, a plain restart reuses the cached env; delete and re-start the processes, or restart with --update-env, to actually reload the env file.)
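The setup steps can be sketched in shell as follows. The helper name build_sandbox_images is hypothetical. The build commands assume you run them from the Lightdash repository root and that each sandboxes/ directory holds a Dockerfile and works as its own build context, so adjust the paths if your checkout differs. The image tags and variable names come from the tables in this section.

```shell
# Hypothetical helper: build both local sandbox images from the repository root.
# Assumes each directory holds a Dockerfile and can serve as its own build context.
build_sandbox_images() {
  docker build -t lightdash-sandbox:local sandboxes/data-apps/ &&
    docker build -t lightdash-ai-writeback:local sandboxes/ai-writeback/
}

# Settings for the environment of both the backend and the scheduler.
SANDBOX_PROVIDER=docker
# Optional: these two values match the documented default tags.
SANDBOX_DOCKER_IMAGE=lightdash-sandbox:local
SANDBOX_AI_WRITEBACK_DOCKER_IMAGE=lightdash-ai-writeback:local
```

Run build_sandbox_images once, then restart the backend and scheduler. Under PM2 that means restarting with --update-env rather than a plain restart; the process names depend on your setup.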
Snapshot lifecycle
Every turn suspends its own sandbox, so in steady state nothing sits idle. For the Lambda MicroVMs provider, two timers configure the AWS-side idle policy as a backstop: SANDBOX_IDLE_TIMEOUT_MS sets when to auto-suspend a sandbox left running (default 30 minutes), and SANDBOX_SNAPSHOT_RETENTION_MS sets when to auto-terminate a suspended one (default 7 days).
Environment variable reference
| Variable | Default | Description |
|---|---|---|
| SANDBOX_PROVIDER | e2b | Sandbox backend: e2b, lambda-microvm, azure-container-sessions, or docker. |
| ANTHROPIC_API_KEY | (none) | API key for the Claude Code agent running inside the sandbox. |
| E2B_API_KEY | (none) | E2B API key (required when SANDBOX_PROVIDER=e2b). |
| E2B_TEMPLATE_NAME | lightdash/lightdash-data-app | E2B template for data app sandboxes. |
| E2B_TEMPLATE_TAG | Lightdash version | Tag of the data app template to launch. |
| E2B_AI_WRITEBACK_TEMPLATE_NAME | lightdash/lightdash-ai-writeback | E2B template for writeback sandboxes. |
| E2B_AI_WRITEBACK_TEMPLATE_TAG | Lightdash version | Tag of the writeback template to launch. |
| LAMBDA_MICROVM_REGION | eu-west-1 | AWS region the microVMs run in (lambda-microvm). |
| LAMBDA_MICROVM_DATA_APP_IMAGE_ARN | (none) | Image ARN for data app microVMs (required when SANDBOX_PROVIDER=lambda-microvm). |
| LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN | (none) | Image ARN for writeback microVMs (required when SANDBOX_PROVIDER=lambda-microvm). |
| LAMBDA_MICROVM_EXECUTION_ROLE_ARN | (none) | Optional IAM role the microVM assumes. |
| LAMBDA_MICROVM_INGRESS_CONNECTOR_ARN | AWS-managed ALL_INGRESS | Optional ingress network connector. |
| LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN | AWS-managed INTERNET_EGRESS | Optional egress network connector. |
| AZURE_CONTAINER_SESSIONS_DATA_APP_POOL_ENDPOINT | (none) | Pool management endpoint for data app sessions (required when SANDBOX_PROVIDER=azure-container-sessions). |
| AZURE_CONTAINER_SESSIONS_AI_WRITEBACK_POOL_ENDPOINT | (none) | Pool management endpoint for writeback sessions (required when SANDBOX_PROVIDER=azure-container-sessions). |
| AZURE_CONTAINER_SESSIONS_API_VERSION | 2025-02-02-preview | Dynamic-sessions data-plane API version. |
| AZURE_CONTAINER_SESSIONS_TOKEN_SCOPE | https://dynamicsessions.io/.default | Microsoft Entra token scope for the dynamic-sessions data plane. |
| SANDBOX_DOCKER_IMAGE | lightdash-sandbox:local | Local image for data app sandboxes (docker provider). |
| SANDBOX_AI_WRITEBACK_DOCKER_IMAGE | lightdash-ai-writeback:local | Local image for writeback sandboxes (docker provider). |
| SANDBOX_IDLE_TIMEOUT_MS | 1800000 (30 min) | Lambda MicroVMs idle policy: auto-suspend a running-but-idle microVM. Ignored by e2b/azure-container-sessions/docker. |
| SANDBOX_SNAPSHOT_RETENTION_MS | 604800000 (7 days) | Lambda MicroVMs idle policy: auto-terminate a suspended microVM (also how long a thread stays resumable). Ignored by e2b/azure-container-sessions/docker. |
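As a worked example of the two millisecond-valued timers in the table, the fragment below tightens the Lambda MicroVMs idle policy. The specific durations are illustrative choices, not recommendations, and only the lambda-microvm provider reads these variables.

```shell
# Illustrative: auto-suspend after 10 minutes idle (10 * 60 * 1000 ms).
SANDBOX_IDLE_TIMEOUT_MS=600000
# Illustrative: auto-terminate suspended microVMs after 1 day (24 * 60 * 60 * 1000 ms).
# This also shortens how long a thread stays resumable.
SANDBOX_SNAPSHOT_RETENTION_MS=86400000
```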