Devcontainers & Runtime Images
Crewship used to ship a single monolithic ghcr.io/crewship-ai/agent-runtime image with every CLI, tool, and the sidecar baked in. That image has been retired (commit dd86356). Any glibc-based Linux base image now works: Crewship provisions the crew-specific tooling on top via devcontainer features and bind-mounts the sidecar from the host.
This guide explains the whole pipeline end-to-end, from setting a base image on a crew to running agents inside the cached per-crew image.
What you get
Bring Your Own Image (BYOI)
debian:bookworm-slim, ubuntu:24.04, mcr.microsoft.com/devcontainers/base:bookworm, or any other glibc Linux image. Musl/Alpine is unsupported (the Go sidecar binary is CGO-free but still glibc-linked).
Community Features
installsAfter).
mise Runtime Tools
mise.json block, no Dockerfile required.
Per-Crew Cached Image
crewship-cache:{hash[:12]}. Re-running agents skips the install phase entirely.
Mental model
Prerequisites
Build the sidecar on the host
internal/config/config.go searches:
- {binDir}/crewship-sidecar (next to the crewship binary)
- /usr/local/bin/crewship-sidecar

and for the entrypoint:
- {binDir}/entrypoint.sh
- {cwd}/scripts/entrypoint.sh
- {cwd}/entrypoint.sh
- /usr/local/share/crewship/entrypoint.sh

Override with CREWSHIP_SIDECAR_PATH / CREWSHIP_ENTRYPOINT_PATH.
Verify fail-fast is in effect
crewship start exits with:
Configuration shape
Every crew stores three optional fields in the crews table:
| Field | Type | Purpose |
|---|---|---|
| runtime_image | string | Base OCI reference; default debian:bookworm-slim. Seed demos use mcr.microsoft.com/devcontainers/base:bookworm. |
| devcontainer_config | JSON string | Subset of the devcontainer.json spec: image, features, postCreateCommand, containerEnv, remoteUser. |
| mise_config | JSON string | mise config.toml in JSON form: {"tools":{"node":"22","python":"3.12"}}. |
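As an illustration of how the three fields fit together, the sketch below builds them in Python and serializes the two JSON-string columns. The image, feature reference, and tool versions are taken from examples on this page; the dict used for the row and the empty feature options are assumptions made for the example, not Crewship's storage code.

```python
import json

# Subset of devcontainer.json, limited to fields the parser is documented to accept.
devcontainer_config = {
    "image": "mcr.microsoft.com/devcontainers/base:bookworm",
    "features": {
        # Feature reference -> options. Empty options are an assumption here.
        "ghcr.io/devcontainers/features/common-utils:2": {},
    },
    "remoteUser": "agent",
}

# mise config in JSON form, matching the example in the table above.
mise_config = {"tools": {"node": "22", "python": "3.12"}}

# Hypothetical row shape: both configs are stored as JSON strings.
crew_fields = {
    "runtime_image": "mcr.microsoft.com/devcontainers/base:bookworm",
    "devcontainer_config": json.dumps(devcontainer_config, sort_keys=True),
    "mise_config": json.dumps(mise_config, sort_keys=True),
}

print(crew_fields["mise_config"])
```

Serializing with sort_keys=True keeps the stored strings stable across runs, which matters if the stored text feeds into a cache key.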
Supported devcontainer.json fields
Parser lives in internal/devcontainer/config.go:
postCreateCommand accepts the polymorphic forms from the spec: a single string, an array of strings (run sequentially), or an object of named commands (keys ignored, values executed). See parsePostCreateCommand in internal/devcontainer/config.go.
mise config
mise is installed as the agent user (UID 1001), so tools land under /home/agent/.local/share/mise. This survives cache rebuilds because the cache image freezes the post-mise state.
Managing a crew’s runtime config
- UI (Crew Wizard)
- CLI
- REST API
- Base image dropdown (populated from /api/v1/runtimes/catalog)
- Feature picker (searchable, powered by /api/v1/features/catalog)
- mise tool picker (searchable, same /api/v1/runtimes/catalog payload)
- A side-by-side preview of the generated devcontainer.json and mise.json blobs
containers.dev scrape, TTL 24 h.
Triggering provisioning
Provisioning is asynchronous. The CLI command returns immediately after the backend enqueues a job; you then poll for status. Three triggers can put a crew into provisioning state:
- Explicit CLI/API trigger: crewship crew provision <slug> or POST /api/v1/crews/{crewId}/provision.
- Crew rebuild: crewship crew rebuild <slug> clears the cached image marker and re-runs provision.
- Auto-provision on first chat (PR #230): when an operator opens a chat with an agent in a crew that has a config but no cached image, the chat handler calls ProvisioningHandler.EnqueueForCrew directly. The chat surface renders an inline CrewProvisioningCard with the live progress checklist; once provisioning succeeds, the agent’s first message starts on the freshly cached image.
The EnqueueForCrew API is idempotent: if a job is already running for the crew, the second call returns the existing job’s ID rather than starting a duplicate. Rate limiting is enforced: ErrRateLimited is returned as RFC 7807 Problem Details if a workspace enqueues more than the per-minute cap.
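The idempotency and rate-limit behaviour described above can be sketched as a small in-memory queue. This is a Python illustration only, not the Go implementation: ProvisionQueue, RateLimited, finish, the job ID format, and the cap value are all hypothetical names and numbers chosen for the example.

```python
import itertools
import time
from collections import defaultdict, deque


class RateLimited(Exception):
    """Stand-in for ErrRateLimited; the real error lives in the Go backend."""


class ProvisionQueue:
    def __init__(self, per_minute_cap=5):  # the cap value is an assumption
        self.per_minute_cap = per_minute_cap
        self.running = {}                 # crew_id -> job_id of the in-flight job
        self.recent = defaultdict(deque)  # workspace_id -> enqueue timestamps
        self._ids = itertools.count(1)

    def enqueue_for_crew(self, workspace_id, crew_id, now=None):
        now = time.monotonic() if now is None else now
        # Idempotency: a crew with a running job gets that job's ID back.
        if crew_id in self.running:
            return self.running[crew_id]
        # Sliding one-minute window per workspace.
        window = self.recent[workspace_id]
        while window and now - window[0] >= 60:
            window.popleft()
        if len(window) >= self.per_minute_cap:
            raise RateLimited(f"workspace {workspace_id} is over the per-minute cap")
        window.append(now)
        job_id = f"job-{next(self._ids)}"
        self.running[crew_id] = job_id
        return job_id

    def finish(self, crew_id):
        # Called when provisioning ends, so a later enqueue starts a new job.
        self.running.pop(crew_id, None)
```

In this sketch a repeated call for a crew that is already provisioning does not count against the workspace's budget; whether the real handler counts it is not stated on this page.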
Trigger
POST /api/v1/crews/{crewId}/provision → ProvisioningHandler.ProvisionTrigger → spawns a background goroutine that calls provisioner.Provision(ctx, baseImage, cfg, miseConfig).
Poll status
The cache key is a hash over (baseImage, devcontainer_config, mise_config); see configHash in internal/devcontainer/provisioner.go. Any change invalidates the cache.
What Provision does, step by step
The entire pipeline is in internal/devcontainer/provisioner.go:Provision:
1. Compute config hash, check cache
2. Skip if nothing to do
If the config declares no features, no postCreateCommand, no containerEnv, and no mise tools, Provision returns CachedImage="" and the runtime launcher uses the bare RuntimeImage as-is.
3. Pull base image if missing
ensureImage runs ImageList then ImagePull if absent. The stream is drained to completion before proceeding; otherwise the next ContainerCreate fails with No such image.
4. Create a temporary container
Named crewship-provision-{hash[:8]}-{unixnano}, entrypoint ["sleep", "infinity"], runs as root so install.sh scripts can write everywhere.
5. Install devcontainer features
For each feature: {} entry:
- Resolve feature ID → OCI artifact (ghcr.io/devcontainers/features/common-utils:2).
- Fetch manifest, pull the single layer (media type application/vnd.devcontainers.layer.v1+tar, raw tar, not gzipped).
- Write the tar stream into /tmp/devcontainer-features/{featureId}/ inside the container via CopyToContainer.
- Exec install.sh with the feature options injected as env vars (e.g. USERNAME=agent USERUID=1001 /tmp/devcontainer-features/common-utils/install.sh).

Install order respects installsAfter (topological sort on feature IDs); see features.go:SortFeatures. Legacy wild-form [{id: string}] metadata is accepted alongside the spec-compliant []string.
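To make the installsAfter ordering concrete, here is a minimal Python sketch of a topological sort over feature IDs that also accepts the legacy [{id: string}] form. It is not a port of features.go:SortFeatures. The function names and the short feature IDs are illustrative, and ignoring references to features that are not being installed is an assumption based on installsAfter being an ordering hint in the devcontainer spec.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def normalize_installs_after(raw):
    """Accept both ["a", "b"] and the legacy [{"id": "a"}, {"id": "b"}]."""
    return [entry["id"] if isinstance(entry, dict) else entry for entry in raw]


def sort_features(features):
    """Order feature IDs so each one comes after the features it lists.

    features maps feature ID -> that feature's installsAfter metadata.
    """
    selected = set(features)
    graph = {}
    for feature_id in sorted(features):
        after = normalize_installs_after(features[feature_id])
        # Only hints that point at features being installed affect the order.
        graph[feature_id] = [dep for dep in after if dep in selected]
    # static_order() raises graphlib.CycleError if the hints are circular.
    return list(TopologicalSorter(graph).static_order())


order = sort_features({
    "github-cli": [{"id": "common-utils"}, {"id": "git"}],  # legacy form
    "claude-code": ["common-utils"],                         # spec form
    "common-utils": [],
})
print(order)
```

In the printed order common-utils comes before both other features, and git never appears because it was only referenced, not selected.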
6. Install mise tools (if declared)
Runs curl -fsSL https://mise.run | sh as the agent user, then mise install for each {tool: version} entry. Fails the provisioning job if any tool download 404s or the version is invalid.
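As a rough picture of the "mise install for each {tool: version} entry" step, the sketch below turns the JSON-form mise config into one command line per tool. The tool@version argument form is standard mise CLI syntax; the helper name and the idea of building command strings up front are assumptions, since this page does not show how the provisioner issues the commands.

```python
import json
import shlex


def mise_install_commands(mise_config_json):
    """Return one `mise install tool@version` command line per declared tool."""
    tools = json.loads(mise_config_json).get("tools", {})
    return [
        "mise install " + shlex.quote(f"{tool}@{version}")
        for tool, version in sorted(tools.items())
    ]


for command in mise_install_commands('{"tools":{"node":"22","python":"3.12"}}'):
    print(command)
```

Quoting each argument keeps an odd version string from being interpreted by the shell when the command is later passed to an exec.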
7. Run postCreateCommand
agent) via docker exec, with stdout/stderr streamed to the server log. Non-zero exit aborts the whole provision.
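The three postCreateCommand shapes this page lists (a single string, an array of strings run sequentially, an object whose values are executed) can be normalized into one flat command list. The Python sketch below follows this page's description rather than the Go parsePostCreateCommand, whose exact behaviour is not shown here; the function name and the handling of None are assumptions.

```python
def parse_post_create_command(value):
    """Normalize the documented postCreateCommand forms into a list of commands."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    if isinstance(value, list):
        # As described on this page: each array element runs in sequence.
        return [str(item) for item in value]
    if isinstance(value, dict):
        # Keys are labels only; values are what gets executed.
        commands = []
        for item in value.values():
            commands.extend(parse_post_create_command(item))
        return commands
    raise TypeError(f"unsupported postCreateCommand type: {type(value).__name__}")


print(parse_post_create_command({"deps": "npm ci", "lint": ["npm run lint"]}))
```

Each resulting command would then be run in order, with the first non-zero exit aborting the provision.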
8. Clean up, commit, tag
Running agents on the cached image
When a message arrives for a crew, chatbridge.Bridge calls container.EnsureCrewRuntime(ctx, team, ...) in internal/provider/docker/docker.go:
1. Resolve which image to boot:
   - If team.CachedImage != "" → use the cached image.
   - Else → use team.RuntimeImage or config default (debian:bookworm-slim).
2. Ensure Docker network crewship-agents exists (Internal: false, because containers need outbound HTTPS for LLM providers).
3. ContainerCreate with:
   - Entrypoint forced to /usr/local/bin/entrypoint.sh (bind-mounted).
   - HostConfig:
     - CapDrop: ALL, CapAdd: NET_RAW, no-new-privileges, ReadonlyRootfs: true.
     - Memory / CPU / PidsLimit: 200.
     - ExtraHosts: ["host.docker.internal:host-gateway"], which lets the sidecar reach crewshipd on Linux too.
     - Mounts: /workspace, /output, /crew, /secrets (rw binds), /home/agent + /opt/crew-tools (named volumes), plus the two read-only bind mounts for the sidecar binary + entrypoint.
4. ContainerStart.
5. Sanity-check the sidecar bind mount for BYOI images:
   Catches Alpine/musl base images that silently can’t run the glibc-linked sidecar.
6. From then on, agent execution uses docker exec (never docker run).
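The image selection in the first step and the hardening options in the ContainerCreate step can be summarized in a few lines of Python. This is a sketch for orientation, not the code in docker.go: resolve_boot_image and HOST_CONFIG are names made up for the example, and the dict uses Docker Engine API HostConfig field names to restate the options listed above.

```python
DEFAULT_RUNTIME_IMAGE = "debian:bookworm-slim"  # documented config default


def resolve_boot_image(cached_image, runtime_image, default=DEFAULT_RUNTIME_IMAGE):
    """Prefer the per-crew cached image, then the crew's base image, then the default."""
    if cached_image:
        return cached_image
    return runtime_image or default


# Hardening options from the list above. Memory and CPU limits are omitted
# because this page does not state their values.
HOST_CONFIG = {
    "CapDrop": ["ALL"],
    "CapAdd": ["NET_RAW"],
    "SecurityOpt": ["no-new-privileges"],
    "ReadonlyRootfs": True,
    "PidsLimit": 200,
    "ExtraHosts": ["host.docker.internal:host-gateway"],
}

print(resolve_boot_image("", "ubuntu:24.04"))
```

An empty cached image falls through to the crew's base image, which matches the case where Provision had nothing to install.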
Seed data: end-to-end demo
The crewship seed command seeds demo crews with sensible devcontainer + mise configs, then provisions all of them in parallel:
| Crew | Base image | Features | mise |
|---|---|---|---|
| Engineering | mcr.microsoft.com/devcontainers/base:bookworm | common-utils, claude-code, github-cli | node:22, python:3.12 |
| Quality | same | common-utils, claude-code, python:1 | — |
| DevOps | same | common-utils, claude-code, docker-in-docker, kubectl-helm-minikube | terraform:1.9 |
| Research | same | common-utils, claude-code, python:1 | python:3.12 |
Container actuals
devcontainer.json declares what the container should look like. After agents have run a session — installing packages with apt-get, pip, or npm — the container’s actual state usually drifts from the declared intent. PR #231 closes that gap with the container.snapshot journal entry.
After every successful agent exec, internal/containerstate.Snapshot runs four short probes inside the crew container:
| Probe | What it captures |
|---|---|
| dpkg-query -W -f='${Package} ${Version}\n' | All apt-installed packages with versions |
| pip freeze | Python packages globally and in venvs on $PATH |
| npm ls -g --json | Globally installed Node packages |
| cat /etc/os-release | OS family, version, and PRETTY_NAME |
Probes whose tool is absent (no pip in a Node-only image, no npm in a Python-only one) yield empty lists rather than errors. The snapshot is hashed (SHA-256 over the canonical-sorted package set + os details). The orchestrator emits a container.snapshot journal entry only when the hash changes, so quiet sessions that don’t mutate the container produce no churn at all.
A typical session that adds one Python dependency emits exactly one container.snapshot entry (the new pip line), which then survives compaction the same way other observability entries do — see Crew Journal — container.snapshot for the payload schema.
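The hash-then-emit-on-change behaviour can be sketched as follows. SHA-256 and the canonical sorting come from the description above; the JSON canonicalization, the SnapshotEmitter class, and the shape of the recorded entry are assumptions for illustration and do not reflect the real payload schema.

```python
import hashlib
import json


def snapshot_hash(packages, os_details):
    """SHA-256 over a canonical (key-sorted) view of packages plus OS details."""
    canonical = json.dumps(
        {"packages": packages, "os": os_details},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SnapshotEmitter:
    """Records a journal entry only when the snapshot hash changes."""

    def __init__(self):
        self.last_hash = None
        self.entries = []

    def observe(self, packages, os_details):
        digest = snapshot_hash(packages, os_details)
        if digest == self.last_hash:
            return False  # quiet session: nothing to record
        self.last_hash = digest
        # Hypothetical entry shape, not the documented schema.
        self.entries.append({"entry_type": "container.snapshot", "hash": digest})
        return True
```

Because the input is sorted before hashing, two probes that list the same packages in a different order produce the same digest and no new entry.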
Operators can use these entries to diff intent vs reality:
- “What did this agent install last week?” Query entry_type=container.snapshot on the crew, sort by ts.
- “What’s drift between today’s container and the cached image?” Compare the latest snapshot against the manifest baked at provision time.
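For the drift question above, a comparison between two package listings might look like the sketch below. It assumes each side has already been reduced to a name-to-version mapping; how the snapshot payload and the provision-time manifest are actually structured is not specified here, so diff_packages is purely illustrative.

```python
def diff_packages(baseline, current):
    """Compare two {package: version} mappings and report the drift."""
    added = {name: ver for name, ver in current.items() if name not in baseline}
    removed = {name: ver for name, ver in baseline.items() if name not in current}
    changed = {
        name: (baseline[name], current[name])
        for name in baseline.keys() & current.keys()
        if baseline[name] != current[name]
    }
    return {"added": added, "removed": removed, "changed": changed}


drift = diff_packages(
    {"requests": "2.31.0", "rich": "13.7.0"},
    {"requests": "2.32.0", "httpx": "0.27.0"},
)
print(drift)
```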
Variable expansion in mounts and env
PR #225 extends the devcontainer expansion vocabulary the spec defines:
- ${devcontainerId}: resolves to the cached image hash for this crew. Useful for binding host paths that are unique per provisioning run.
- ${VAR}: resolved against the base image’s environment (everything in the image’s ENV directives), not the host’s. So referencing ${HOME} resolves to whatever the image set it to (e.g. /home/agent), not the operator’s home directory.

Expansion applies to mounts[*].source, containerEnv, and forwardPorts strings. Unknown variables fail provisioning loudly rather than silently expanding to empty: ${TYPO} will surface as an error in the provisioning log instead of producing a busted mount.
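A minimal Python sketch of this expansion rule, covering only the two forms listed above, is shown next. The function and exception names are hypothetical, and other spec variables (such as the ${localEnv:...} family) are deliberately left untouched by the pattern.

```python
import re

# Matches ${NAME} where NAME is a plain identifier; forms with a colon are not matched.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


class UnknownVariableError(ValueError):
    """Raised instead of silently expanding an unknown variable to empty."""


def expand(value, image_env, devcontainer_id):
    """Expand ${devcontainerId} and ${VAR} against the image environment."""
    scope = dict(image_env)
    scope["devcontainerId"] = devcontainer_id

    def replace(match):
        name = match.group(1)
        if name not in scope:
            raise UnknownVariableError(f"unknown variable ${{{name}}} in {value!r}")
        return scope[name]

    return _VAR.sub(replace, value)


print(expand("${HOME}/.cache/${devcontainerId}", {"HOME": "/home/agent"}, "abc123"))
```

Raising on the first unknown name is what turns a typo into a provisioning error rather than an empty path segment.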
Runtime bind-mount semantics
- crewship-sidecar + entrypoint.sh are bind-mounted read-only from the host into every crew container. Host-side edits do not take effect in already-running containers; the mount is snapshotted at docker create time.
- To roll out a new sidecar or entrypoint across existing crews, rebuild the sidecar (make build:sidecar) and then remove each crew’s container: docker rm -f crewship-team-<slug>. The container is recreated transparently on the next agent message, picking up the updated bind mount.
- postStartCommand runs on every start, so it is the right hook for operations that must reflect host state at the moment of launch (e.g. refreshing a secret via vault read). Do not try to bake those into postCreateCommand; they’d freeze at provisioning time.
- Cached images (crewship-cache:{hash}) live on the Docker host indefinitely. Run crewship crew cache prune --older-than 30d as part of your maintenance rotation; cached images referenced by live crews are protected automatically.
Troubleshooting
sidecar binary not found; run 'make build:sidecar'
autodetectSidecarPaths in internal/config/config.go failed. Run make build:sidecar on the host, or set CREWSHIP_SIDECAR_PATH=/absolute/path/to/crewship-sidecar and CREWSHIP_ENTRYPOINT_PATH=/absolute/path/to/entrypoint.sh. For tests that never launch containers, set CREWSHIP_SKIP_SIDECAR=1.
sidecar bind mount sanity check failed (exit 126)
feature 'xyz' install.sh exits non-zero
The feature’s install.sh probably needs prerequisites that the base image lacks (curl, git, ca-certificates). debian:bookworm-slim is intentionally minimal. Either:
- Switch to mcr.microsoft.com/devcontainers/base:bookworm (pre-installed), or
- Declare ghcr.io/devcontainers/features/common-utils:2 as the first feature; it installs prerequisites.
cache image never refreshes
configHash is deterministic over (baseImage, devcontainer_config, mise_config). Upstream feature releases that re-use the same tag don’t change the hash. Force a rebuild with crewship crew rebuild <slug>.
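To see why a re-used upstream tag leaves the cache untouched, consider a sketch of a deterministic hash over the three inputs. The inputs and the crewship-cache:{hash[:12]} tag format come from this page; the choice of SHA-256, the JSON canonicalization, and the function names are assumptions, since the real configHash lives in internal/devcontainer/provisioner.go and is not reproduced here.

```python
import hashlib
import json


def config_hash(base_image, devcontainer_config, mise_config):
    """Deterministic digest over (baseImage, devcontainer_config, mise_config)."""
    payload = json.dumps(
        [base_image, devcontainer_config, mise_config],
        sort_keys=True,
        separators=(",", ":"),
    )
    # SHA-256 is an assumption; this page does not name the algorithm.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cache_tag(digest):
    """Tag format documented on this page: crewship-cache:{hash[:12]}."""
    return f"crewship-cache:{digest[:12]}"


features = {"features": {"ghcr.io/devcontainers/features/common-utils:2": {}}}
digest = config_hash("debian:bookworm-slim", features, {"tools": {"node": "22"}})
print(cache_tag(digest))
```

The feature reference is hashed as a string, so a new upstream release published under the same :2 tag yields the same digest and the same cached image.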
tmux: not found (noisy log on startup)
Commit 44da863: setupTmuxExec now pre-checks command -v tmux and falls back to stdbuf -oL <cmd> when absent. debian:bookworm-slim doesn’t ship tmux, so this is expected.
claude: not found inside agent container
The claude CLI is installed by the ghcr.io/devcontainers-extra/features/claude-code:2 feature (replaces our former Go EnsureClaudeCode helper). Make sure it’s in the crew’s devcontainer_config.features. Seed crews include it automatically.
Migrating from the legacy agent-runtime image
If you have a long-running deployment that still references ghcr.io/crewship-ai/agent-runtime:latest:
- Update Crewship to the current release (post-dd86356).
- make build:sidecar on the host.
- For each existing crew:
- Restart crewshipd. The fail-fast check will confirm your sidecar is discoverable.
agent-runtime.yml workflow deleted). Pulling its old tag still works but sits frozen; nothing in Crewship itself assumes it anymore.
Reference
- Configuration → Devcontainers: config fields + environment variables reference.
- Configuration → Environment Variables: CREWSHIP_SIDECAR_PATH, CREWSHIP_ENTRYPOINT_PATH, CREWSHIP_SKIP_SIDECAR.
- Security → Container Isolation: UID boundaries, capability drops, network policy.
- Code of record: internal/devcontainer/, internal/provider/docker/docker.go, cmd/crewship/seeddata/crews.go, cmd/crewship/cmd_crew_config.go, cmd/crewship/cmd_crew_provision.go.