Threat Model

Crewship’s security posture rests on four layers that compose into a defense-in-depth model. Each layer assumes the layer above it can fail. This page is the 1-page summary; deep-dive links for each layer are at the bottom.

Adversary

The adversary we design against is an agent that has been prompt-injected, jailbroken, or is otherwise misaligned: an attacker has supplied input the agent treats as instructions, and we assume the agent will execute that input verbatim against whatever tools it can reach. The attacker is not assumed to have compromised the host machine, the operator’s terminal, or the Crewship binary itself. In practice this means we trust:
  • The host kernel and container runtime.
  • The Crewship binary and sidecar binary on disk. The curl | sh installer verifies the SHA-256 against a signed checksums file and — when cosign is on PATH — also verifies the Sigstore signature against the project’s release workflow identity. Homebrew and Docker Compose installs rely on the upstream channel’s own integrity; operators in regulated environments should run cosign verification manually. See Install.
  • The operator running crewship.
We do not trust:
  • Anything the agent reads — system prompt, user prompt, tool output, fetched documents, MCP server responses.
  • The agent’s own output, until the scrubber has run over it.
  • Network destinations the agent wants to reach.
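
The checksum comparison that the curl | sh installer performs on the Crewship binary can be illustrated in Go. This is a sketch only: the real installer is a shell script, the function name verifySHA256 is hypothetical, and the checksums file is assumed to use the common sha256sum layout ("<hex digest>  <filename>" per line). Verifying the signature on the checksums file itself (the cosign step) is not shown.

```go
package main

import (
	"bufio"
	"bytes"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
	"strings"
)

// verifySHA256 checks payload against the digest recorded for name in a
// checksums listing. Assumed layout: "<hex digest>  <filename>" per line.
func verifySHA256(payload []byte, name string, checksums []byte) error {
	sum := sha256.Sum256(payload)
	sc := bufio.NewScanner(bytes.NewReader(checksums))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// sha256sum marks binary-mode entries with a leading "*".
		if len(fields) != 2 || strings.TrimPrefix(fields[1], "*") != name {
			continue
		}
		want, err := hex.DecodeString(fields[0])
		if err != nil {
			return fmt.Errorf("malformed digest for %s: %w", name, err)
		}
		if subtle.ConstantTimeCompare(sum[:], want) != 1 {
			return fmt.Errorf("checksum mismatch for %s", name)
		}
		return nil
	}
	return fmt.Errorf("no checksum entry for %s", name)
}

func main() {
	// Hypothetical artifact name and contents, for illustration only.
	payload := []byte("example release artifact")
	digest := sha256.Sum256(payload)
	listing := []byte(hex.EncodeToString(digest[:]) + "  example-artifact.tar.gz\n")

	fmt.Println(verifySHA256(payload, "example-artifact.tar.gz", listing))
	fmt.Println(verifySHA256([]byte("tampered"), "example-artifact.tar.gz", listing))
}
```

A missing entry is treated as a failure rather than a pass, so an attacker cannot skip the check by removing a line from the listing.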

Layer 1 — Container isolation

Every agent runs in its own Linux container. The defaults applied in internal/provider/docker/docker_container.go:
  • UID 1001 (agent) for the agent process; UID 1002 for the per-agent sidecar.
  • CapDrop: ["ALL"] as the starting capability set.
  • CapAdd: ["NET_RAW"] is added back so the agent can use ping and other ICMP-shaped probes inside its sandbox. This is the one capability that survives the drop.
  • no-new-privileges security opt and a read-only root filesystem.
  • PID limit, memory and CPU quotas, and a tmpfs /tmp with a 500 MB cap.
  • No Docker socket bind-mounted into the agent container. The agent cannot launch sibling containers, mount the host docker.sock, or enumerate peers.
  • Mounts: the workspace directory read/write and /output/<agent>/ for any persistent files; nothing else by default.
  • NetworkMode is instance-wide, set from the server config (p.cfg.Network) — not per-agent. Operators who need per-agent network isolation must build it at the host network layer (e.g. per-crew Docker networks) rather than rely on the runtime default.
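
The defaults listed above can be pictured as a small configuration value plus an audit check. The sketch below is not the code in docker_container.go: the agentContainer type, the defaultsFor and audit functions, and the /workspace container path are all hypothetical, and the PID, memory, and CPU quota values are omitted because this page does not state them.

```go
package main

import (
	"fmt"
	"strings"
)

// agentContainer is a hypothetical, simplified stand-in for the container
// settings described above. It is not the type Crewship uses.
type agentContainer struct {
	User           string
	CapDrop        []string
	CapAdd         []string
	SecurityOpt    []string
	ReadonlyRootfs bool
	Tmpfs          map[string]string
	Binds          []string
}

// defaultsFor returns the documented defaults for one agent. The host-side
// directories and the /workspace mount point are assumptions.
func defaultsFor(workspaceDir, outputDir, agent string) agentContainer {
	return agentContainer{
		User:           "1001",
		CapDrop:        []string{"ALL"},
		CapAdd:         []string{"NET_RAW"},
		SecurityOpt:    []string{"no-new-privileges"},
		ReadonlyRootfs: true,
		Tmpfs:          map[string]string{"/tmp": "size=500m"},
		Binds: []string{
			workspaceDir + ":/workspace:rw",
			outputDir + ":/output/" + agent + ":rw",
		},
	}
}

// audit returns one message per deviation from the documented defaults.
func audit(c agentContainer) []string {
	var problems []string
	if len(c.CapDrop) != 1 || c.CapDrop[0] != "ALL" {
		problems = append(problems, "CapDrop should be exactly [ALL]")
	}
	if !c.ReadonlyRootfs {
		problems = append(problems, "root filesystem should be read-only")
	}
	for _, b := range c.Binds {
		if strings.Contains(b, "docker.sock") {
			problems = append(problems, "Docker socket must not be mounted: "+b)
		}
	}
	return problems
}

func main() {
	c := defaultsFor("/srv/crew/workspace", "/srv/crew/output", "agent-a")
	fmt.Println(len(audit(c)), "problems with defaults")

	c.Binds = append(c.Binds, "/var/run/docker.sock:/var/run/docker.sock")
	fmt.Println(audit(c))
}
```

An operator-side check of this shape is one way to catch a profile that quietly reintroduces the Docker socket or a writable root.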

Privileged-mode escape hatch

A crew can be flagged Privileged: true in its security profile, which is needed for DinD or anything requiring SYS_ADMIN. When that flag is set, the runtime disables both no-new-privileges and ReadonlyRootfs because Docker rejects both under privileged mode. CapDrop: ALL still applies on paper, but with Privileged: true the container has full root-equivalent host access regardless. Treat a privileged crew as outside this threat model: its isolation guarantees are whatever the host runtime provides.
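
The effect of the flag can be summarized as a small decision function. This is a sketch under assumptions: the securityFlags type and flagsFor function are hypothetical names, and the real runtime sets these options on the Docker host configuration rather than on a struct like this one.

```go
package main

import "fmt"

// securityFlags is a hypothetical summary of the options that the
// Privileged flag changes. It is not a Crewship or Docker type.
type securityFlags struct {
	Privileged      bool
	NoNewPrivileges bool
	ReadonlyRootfs  bool
	CapDrop         []string
}

// flagsFor mirrors the behavior described above: privileged mode turns off
// no-new-privileges and the read-only root, while CapDrop stays set.
func flagsFor(privileged bool) securityFlags {
	f := securityFlags{
		NoNewPrivileges: true,
		ReadonlyRootfs:  true,
		CapDrop:         []string{"ALL"},
	}
	if privileged {
		f.Privileged = true
		f.NoNewPrivileges = false
		f.ReadonlyRootfs = false
		// CapDrop is left in place, but a privileged container has
		// root-equivalent host access regardless.
	}
	return f
}

func main() {
	fmt.Printf("%+v\n", flagsFor(false))
	fmt.Printf("%+v\n", flagsFor(true))
}
```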

Blast radius if the agent escapes

If a non-privileged agent breaks out of the container runtime, it lands in the host process namespace as UID 1001 with only the NET_RAW capability (CapDrop: ALL plus the NET_RAW add-back), no Docker socket, read-only root, and the quotas above. This is the standard failure mode for a container that runs as a non-root user; operators are expected to layer host hardening (SELinux, AppArmor, seccomp, kernel patching) on top. Detail: Container isolation.

Layer 2 — Sidecar credential broker

The agent never sees raw credentials. Instead:
  1. Credentials live in the host’s credential store (internal/sidecar/credstore), encrypted with AES-256-GCM using the operator’s ENCRYPTION_KEY.
  2. A per-agent sidecar runs alongside the agent container, also as a non-root user (UID 1002).
  3. The agent’s outbound HTTP traffic is proxied through the sidecar. When the agent calls https://api.openai.com/..., the sidecar:
    • Checks the destination against the per-agent allowlist.
    • Looks up the credential mapped to that destination.
    • Injects the Authorization header on the way out.
    • Forwards the response back.
  4. The sidecar’s credentials are loaded once at startup over stdin as a base64-encoded JSON payload. They are never written to the agent container’s environment.
The agent therefore cannot:
  • Read API keys via printenv, cat /proc/1/environ, or any other env-var introspection — credentials don’t live in the agent’s env.
  • Reach an HTTP destination not in its allowlist, as long as it uses the HTTP_PROXY / HTTPS_PROXY env vars that the orchestrator injects (http://127.0.0.1:9119). Well-behaved Go/Python/Node HTTP clients honor these; clients that ignore proxy env vars (or that open raw sockets) can bypass the allowlist. Hard egress filtering at the network layer is the operator’s responsibility; see Container isolation for hardening options.
  • Exfiltrate credentials by tricking the LLM into echoing them: the scrubber (Layer 3) redacts anything that looks like a key in the agent’s output before it lands in the journal or web UI.
Detail: Credentials guide for the operator flow, internal/sidecar/ for the source.
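
Steps 3 and 4 of the broker flow (payload decoding at startup, then allowlist check, credential lookup, and header injection per request) can be sketched as follows. Everything named here is hypothetical: the broker type, the host-to-secret payload schema, and the Bearer scheme are assumptions, not the code in internal/sidecar/. How the sidecar handles TLS for HTTPS destinations is not described on this page and is not shown.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// broker is a hypothetical in-memory stand-in for the sidecar's state.
type broker struct {
	allow map[string]bool   // per-agent destination allowlist, keyed by host
	creds map[string]string // host -> secret (assumed payload shape)
}

var errDenied = errors.New("destination not in allowlist")

// encodePayload builds the base64-encoded JSON that the parent process
// would write to the sidecar's stdin. The schema is an assumption.
func encodePayload(creds map[string]string) (string, error) {
	raw, err := json.Marshal(creds)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(raw), nil
}

// decodePayload is the sidecar side of the same exchange.
func decodePayload(payload string) (map[string]string, error) {
	raw, err := base64.StdEncoding.DecodeString(payload)
	if err != nil {
		return nil, fmt.Errorf("decode payload: %w", err)
	}
	var creds map[string]string
	if err := json.Unmarshal(raw, &creds); err != nil {
		return nil, fmt.Errorf("parse payload: %w", err)
	}
	return creds, nil
}

// outbound builds the upstream request for an agent call: allowlist check,
// credential lookup, then Authorization injection.
func (b *broker) outbound(method, rawURL string) (*http.Request, error) {
	req, err := http.NewRequest(method, rawURL, nil)
	if err != nil {
		return nil, err
	}
	host := req.URL.Hostname()
	if !b.allow[host] {
		return nil, fmt.Errorf("%w: %s", errDenied, host)
	}
	if secret, ok := b.creds[host]; ok {
		// The Bearer scheme is an assumption for this sketch.
		req.Header.Set("Authorization", "Bearer "+secret)
	}
	return req, nil
}

func main() {
	payload, err := encodePayload(map[string]string{"api.openai.com": "example-secret"})
	if err != nil {
		panic(err)
	}
	creds, err := decodePayload(payload)
	if err != nil {
		panic(err)
	}
	b := &broker{allow: map[string]bool{"api.openai.com": true}, creds: creds}

	req, err := b.outbound("GET", "https://api.openai.com/example")
	if err != nil {
		panic(err)
	}
	fmt.Println("injected header present:", req.Header.Get("Authorization") != "")

	_, err = b.outbound("GET", "https://attacker.example/upload")
	fmt.Println("denied:", errors.Is(err, errDenied))
}
```

The secret exists only inside the broker value; the caller hands over a URL and never holds the credential, which is the property the section relies on.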

Layer 3 — Output scrubber

Every byte the agent produces — chat messages, journal entries, tool input/output, status updates streamed over WebSocket — passes through internal/scrubber/ before it is persisted or shown to a human. The scrubber owns 17 credential-shaped patterns today:
  • OpenSSH private-key blocks
  • PEM-encoded private-key blocks (-----BEGIN [RSA|EC|DSA|ED25519] PRIVATE KEY-----)
  • Anthropic keys (sk-ant-...)
  • OpenAI keys (sk-..., sk-proj-..., sk-svcacct-...)
  • Google service-account / API keys (AIza...)
  • Cursor API keys (cur_...)
  • Factory tokens (fact_..., factory_...)
  • OpenRouter keys (sk-or-...)
  • xAI keys (xai-...)
  • Groq keys (gsk_...)
  • GitHub PATs (ghp_, gho_, ghs_, ghr_, github_pat_)
  • GitLab PATs (glpat-...)
  • Slack tokens (xoxb-, xoxp-, xoxa-, xoxr-)
  • AWS access keys (AKIA...)
  • JWT bearer tokens (Bearer eyJ...)
  • JSON password|secret|token|api_key|apikey|secret_key field values
  • PASSWORD|SECRET|API_KEY|... shell-style env assignments
The list grows as the maintainer encounters new shapes in the wild. A scrub hit replaces the matched substring with a [REDACTED:<type>] marker. The operator can search journal entries for the marker to investigate which agent tried to echo what.
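
The replace-with-marker behavior can be sketched with a few of the patterns above. This is not the code in internal/scrubber/: the regular expressions, minimum lengths, and type names inside the marker are assumptions chosen for illustration, and only four of the listed shapes are covered.

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern pairs a hypothetical type name with a simplified regex.
type pattern struct {
	kind string
	re   *regexp.Regexp
}

// Order matters: the more specific sk-ant- prefix must run before the
// broader sk- prefix, or Anthropic keys would be labelled as OpenAI keys.
// The \b anchors avoid matching inside words such as "task-...".
var patterns = []pattern{
	{"anthropic", regexp.MustCompile(`\bsk-ant-[A-Za-z0-9_-]{8,}`)},
	{"openai", regexp.MustCompile(`\bsk-[A-Za-z0-9_-]{8,}`)},
	{"github_pat", regexp.MustCompile(`\bgh[posr]_[A-Za-z0-9]{8,}`)},
	{"aws_access_key", regexp.MustCompile(`\bAKIA[0-9A-Z]{16}\b`)},
}

// scrub replaces every match with a [REDACTED:<type>] marker.
func scrub(s string) string {
	for _, p := range patterns {
		s = p.re.ReplaceAllString(s, "[REDACTED:"+p.kind+"]")
	}
	return s
}

func main() {
	fmt.Println(scrub("export KEY=sk-ant-abcdefgh1234"))
	fmt.Println(scrub("token ghp_abcdefgh12345678 was used"))
	fmt.Println(scrub("task-force-alpha-one is unaffected"))
}
```

Because the marker carries the pattern type, a plain text search for "[REDACTED:" in the journal is enough to find every hit.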

Layer 4 — Keeper policy layer

Even with credentials brokered and output scrubbed, the agent can still take legitimate-looking actions the operator wants to gate: opening a high-risk file, hitting a paid API endpoint, calling a tool that modifies production data. Keeper is the per-workspace policy engine that evaluates each tool call against a YAML ruleset before the call is dispatched. Rules can:
  • Allow — the call proceeds (default for everything not matched).
  • Deny — the call is blocked, the agent receives a structured error, the decision is journalled.
  • Require approval — the call pauses, a waitpoint lands in the operator’s Inbox, and the agent resumes (or is denied) on the operator’s choice.
  • Escalate — the call routes to a second-opinion agent (“gatekeeper LLM”) for a softer decision, with the final answer captured in the journal.
Keeper decisions are first-class journal entries (entry_type = keeper.decision) and are searchable in Recall, so a postmortem can reconstruct exactly what the agent tried and why the policy held the line. Detail: Keeper guide.
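
The four outcomes and the allow-by-default behavior can be sketched as a first-match rule evaluator. This is an illustration under assumptions, not Keeper’s implementation: the rule fields, the action string values, glob matching on the tool name, and first-match-wins ordering are all hypothetical, and the YAML loading step is left out.

```go
package main

import (
	"fmt"
	"path"
)

// action is one of the four outcomes described above. The string values
// are hypothetical.
type action string

const (
	allow           action = "allow"
	deny            action = "deny"
	requireApproval action = "require_approval"
	escalate        action = "escalate"
)

// rule matches a tool name against a glob and names the outcome.
type rule struct {
	Tool   string // glob pattern, e.g. "db.*" (hypothetical field)
	Action action
}

// decide returns the action of the first matching rule. Anything that no
// rule matches is allowed, so an empty ruleset allows every call.
func decide(rules []rule, tool string) action {
	for _, r := range rules {
		if ok, err := path.Match(r.Tool, tool); err == nil && ok {
			return r.Action
		}
	}
	return allow
}

func main() {
	rules := []rule{
		{Tool: "shell.exec", Action: deny},
		{Tool: "db.*", Action: requireApproval},
		{Tool: "billing.charge", Action: escalate},
	}
	for _, tool := range []string{"shell.exec", "db.delete", "billing.charge", "fs.read"} {
		fmt.Printf("%-16s -> %s\n", tool, decide(rules, tool))
	}
}
```

With a nil or empty ruleset, decide returns allow for every tool, which is why an unconfigured workspace gates nothing.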

What this composition does and does not protect against

| Threat | Layer that handles it |
| --- | --- |
| Agent tries to read API keys from env | Layer 2 (sidecar: keys aren’t there) |
| Agent tries to echo a stolen key in output | Layer 3 (scrubber redacts) |
| Agent tries to reach an arbitrary URL | Layer 2 (allowlist denies) + container network policy |
| Agent tries to call a sensitive internal API | Layer 4 (Keeper rule) |
| Agent breaks out of container runtime | Layer 1, then host hardening |
| Operator’s ENCRYPTION_KEY is exfiltrated | Out of scope: host compromise |
| Crewship binary is tampered with at install | Partially mitigated: curl \| sh checks SHA-256, and cosign signature when cosign is on PATH; Homebrew + Docker Compose paths rely on upstream channel integrity. See Install |
| Supply-chain attack on sentry-go / a dep | Multi-layer: go.sum + pnpm-lock.yaml pinning; Renovate / Dependabot review; govulncheck (Go source call-graph analysis on cmd/crewship, fails only on reachable vulns); osv-scanner (PR #441, cross-refs both lockfiles against the full OSV.dev DB, fails on anything in the dep graph regardless of reachability; catches npm-chain issues govulncheck never sees); CodeQL + Grype (PR #445) on every PR |
| Side-channel timing leak across agents | Not modelled: agents are not assumed co-tenant-isolated below the container-runtime layer |

Beta posture and known gaps

Crewship v0.1 is the first public beta. The threat model above describes the current implementation, with these gaps:
  • Egress is HTTP_PROXY-style, not network-enforced. Agents that ignore HTTP_PROXY or open raw sockets bypass the allowlist entirely. Operators who need hard egress control must layer Docker network policies (--internal, custom networks, egress firewall) on top.
  • NET_RAW is added back so the agent can ping. We accept this as a convenience trade-off; operators with stricter requirements can drop it in their crew security profile.
  • Privileged crews bypass most of Layer 1. Anything flagged Privileged: true is outside this threat model.
  • Container-runtime isolation is the host’s responsibility. The default Docker config is not rootless, and Apple Containers / Colima have different threat profiles than upstream Docker.
  • Keeper rulesets ship empty. Operators write the rules that matter for their environment; there is no “secure by default” policy yet.

Reporting issues

Security issues go to github.com/crewship-ai/crewship/security/advisories/new (private disclosure). For ambiguous reports, the regular issue tracker with the security label is also fine — the maintainer triages both.