The operational surface for OWNER and ADMIN roles: workspace stats, user management, GDPR subject-access and erasure, memory administration, the Keeper security audit log, and a set of cross-cutting system and metrics endpoints. Most endpoints require the OWNER role; the GDPR, memory-admin, and Keeper audit-log endpoints require ADMIN or OWNER (manage permission). The Keeper Phase 2 evaluators and three system endpoints have their own auth model, noted inline.
Unless a section says otherwise, every endpoint on this page requires the OWNER role. The GDPR, memory-admin, and Keeper-requests endpoints accept ADMIN or OWNER (manage permission).
Subject-access export and right-to-erasure across the cascadable tables. Both require ADMIN or OWNER (manage permission) and write a gdpr_actions audit row.
The DELETE endpoint cascade-purges every row referencing the user across the cascadable tables and cannot be undone. A reason is required for the audit trail.
GDPR Art. 15 (Right of Access): return every row referencing the given user across the cascadable tables in the current workspace. Writes a gdpr_actions audit row with action='export'.
Required role: ADMIN or OWNER (manage permission). MANAGER is intentionally not a SAR actor; the auditor framing is Compliance/Founder separation of duties.
Audit row in gdpr_actions for this SAR call — the operator’s defensible artefact.
peer_cards
array
Per-agent peer card rows referencing this user.
memory_versions
array
Memory rows with data_subject_id = userId. Content blob is not inlined — payload_ref points at the content-addressed store.
inbox_items
array
Inbox items addressed to or referencing this user.
lessons.md content is not scanned for user mentions — a known gap (see GDPR guide). Operators must manually review lessons after a SAR if any lesson body could carry user-attributable text.
GDPR Art. 17 (Right to Erasure): cascade purge every row referencing the user across the cascadable tables. Writes a gdpr_actions audit row with action='delete'. reason is required for the audit trail.
Required role: ADMIN or OWNER (manage permission).
202 Accepted on full success; 207 Multi-Status on partial success (some tables purged, others returned errors — the gdpr_actions row carries the full per-table summary).
On partial failure the same shape is returned with an additional "error": "<first-error>" field and HTTP 207.
| Field | Type | Description |
| --- | --- | --- |
| action_id | string | The audit row id. The same SAR re-run writes a second action row; it is idempotent in row counts (zero on second run), but the audit trail records every attempt. |
| rows_deleted | int | Sum across all cascadable tables. |
| scope | object | Per-table count summary (extensible: new keys appear additively as new tables join the cascade). |
| error | string? | Present only on HTTP 207. First error encountered; consult the audit row for the full per-table picture. |
Memory payload_ref content-addressed blobs on disk are not deleted by this endpoint — blobs are deduplicated across workspaces and require a separate sweep job (planned). The audit/index rows ARE purged so the SAR is honoured at the DB-visibility layer. See the GDPR guide for the operator workflow.
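The 202 / 207 split described above can be handled on the client roughly as follows. This is a minimal sketch, not the platform's own client: the Go type and function names are hypothetical, the sample payload values are made up, and treating scope as a flat map of integer counts is an assumption based on the "per-table count summary" description.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// ErasureResult mirrors the documented response fields. The Go names are
// illustrative; only the JSON keys come from the field table.
type ErasureResult struct {
	ActionID    string         `json:"action_id"`
	RowsDeleted int            `json:"rows_deleted"`
	Scope       map[string]int `json:"scope"`           // assumed: table name -> row count
	Error       *string        `json:"error,omitempty"` // present only on HTTP 207
}

// parseErasure classifies a response by status code: 202 is a full purge,
// 207 a partial one. Any other status is reported as an error.
func parseErasure(status int, body []byte) (res ErasureResult, partial bool, err error) {
	if status != http.StatusAccepted && status != http.StatusMultiStatus {
		return res, false, fmt.Errorf("unexpected status %d", status)
	}
	if err := json.Unmarshal(body, &res); err != nil {
		return res, false, fmt.Errorf("decode erasure response: %w", err)
	}
	return res, status == http.StatusMultiStatus, nil
}

func main() {
	// Made-up payload in the documented shape.
	body := []byte(`{"action_id":"act-1","rows_deleted":3,"scope":{"peer_cards":1,"memory_versions":2},"error":"inbox_items: locked"}`)
	res, partial, err := parseErasure(http.StatusMultiStatus, body)
	if err != nil {
		panic(err)
	}
	fmt.Println(res.ActionID, res.RowsDeleted, partial)
}
```

On a partial result the error field carries only the first failure, so a caller would normally follow up by reading the gdpr_actions row identified by action_id.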
Inspect and tune the memory subsystem: aggregate stats, row-level version drill-down, raw blob retrieval, and the per-workspace retention config. All require ADMIN or OWNER (manage permission).
Returns aggregate statistics for the memory subsystem within the current workspace: totals, per-tier rollups, and per-agent rollups derived from memory_versions.
Required role: ADMIN or OWNER (manage permission)
Row-level drill-down into memory_versions. Pairs with the stats endpoint above: stats answers “how much memory does this workspace have?”, versions answers “which rows specifically?”. Results are ordered newest-first by written_at DESC, id DESC and paginated via an opaque keyset cursor.
Required role: ADMIN or OWNER (manage permission)
Opaque cursor for the next page; null on last page
limit
integer
The resolved page size
filters_applied
object
Echo of the normalised filters that produced this response
The cursor is a base64url-encoded v1:<rfc3339nano>|<id> tuple pinning (written_at, id). Offset pagination would duplicate or skip rows because the audit watcher writes continuously; keyset pagination pins the boundary so concurrent inserts above the cursor land on the next refresh naturally.
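To make the cursor layout concrete, here is a minimal round-trip sketch of the v1:<rfc3339nano>|<id> tuple. Function names are hypothetical, the use of unpadded base64url and a string id are assumptions, and API clients should still treat the cursor as opaque and pass it back unchanged.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
	"time"
)

const cursorPrefix = "v1:"

// encodeCursor packs (written_at, id) into "v1:<rfc3339nano>|<id>".
// Unpadded base64url is an assumption; the source only says "base64url".
func encodeCursor(writtenAt time.Time, id string) string {
	raw := cursorPrefix + writtenAt.UTC().Format(time.RFC3339Nano) + "|" + id
	return base64.RawURLEncoding.EncodeToString([]byte(raw))
}

// decodeCursor reverses encodeCursor and rejects unknown versions.
func decodeCursor(cursor string) (time.Time, string, error) {
	b, err := base64.RawURLEncoding.DecodeString(cursor)
	if err != nil {
		return time.Time{}, "", fmt.Errorf("cursor is not base64url: %w", err)
	}
	s := string(b)
	if !strings.HasPrefix(s, cursorPrefix) {
		return time.Time{}, "", errors.New("unsupported cursor version")
	}
	ts, id, ok := strings.Cut(strings.TrimPrefix(s, cursorPrefix), "|")
	if !ok || id == "" {
		return time.Time{}, "", errors.New("cursor is missing the id component")
	}
	writtenAt, err := time.Parse(time.RFC3339Nano, ts)
	if err != nil {
		return time.Time{}, "", fmt.Errorf("bad written_at in cursor: %w", err)
	}
	return writtenAt, id, nil
}

// exampleCursor pins an arbitrary boundary row for the demo below.
var exampleCursor = encodeCursor(time.Date(2025, 1, 15, 10, 30, 0, 123456789, time.UTC), "42")

func main() {
	writtenAt, id, err := decodeCursor(exampleCursor)
	if err != nil {
		panic(err)
	}
	// A keyset query would then select rows strictly after this boundary,
	// e.g. WHERE (written_at, id) < (?, ?) under the DESC ordering.
	fmt.Println(writtenAt.Format(time.RFC3339Nano), id)
}
```

Because the boundary is a value pair rather than an offset, rows inserted above it do not shift the next page.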
Returns the raw blob bytes for a single memory_versions row. Used by the dashboard’s row-detail view and by compliance auditors who need the literal content (not just the metadata), for example to confirm a PII scrubber fired on the offending payload.
Required role: ADMIN or OWNER (manage permission)
The body is the raw blob bytes (NOT JSON-wrapped). Content-Type is text/markdown; charset=utf-8 for paths ending in .md, otherwise application/octet-stream so the client cannot auto-render untrusted bytes as HTML.
Audit metadata travels alongside the body via response headers:
| Header | Description |
| --- | --- |
| X-Memory-Sha256 | sha256 from the memory_versions row |
| X-Memory-Bytes | Length from the row (matches the row’s recorded size) |
| X-Memory-Tier | Tier (agent / crew / workspace / pins / learned) |
| X-Memory-Path | Canonical memory path |
| X-Memory-Written-At | RFC3339 timestamp |
| X-Memory-Written-By | Writer identifier (omitted when NULL) |
| Cache-Control | private, max-age=31536000, immutable (blobs are content-addressed) |
Missing workspace context OR missing id path segment
403
MEMBER role
404
Unknown id OR cross-workspace probe (no existence leak)
410
Row exists but the blob file is missing on disk (retention sweep, restore-from-backup race)
413
Blob exceeds the 10 MB cap (DB-claimed size OR on-disk size)
500
sha mismatch (blob tampered after recording) OR payload_ref resolves outside the configured blob root
503
Memory versioning is not configured (lite-mode deployment without blob root)
The handler refuses to follow symlinks under payload_ref. The on-disk layout is fixed at blobRoot/<sha[:2]>/<sha>; filepath.EvalSymlinks is used to verify the resolved path stays inside the blob root, defending against path-traversal vectors in a corrupted or malicious payload_ref.
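The containment check can be sketched as below. This is an illustration under assumptions, not the handler's code: resolveBlob, errOutsideRoot, and demo are hypothetical names, and the sketch only verifies that the symlink-resolved path stays under the resolved blob root.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

var errOutsideRoot = errors.New("payload_ref resolves outside blob root")

// resolveBlob maps a sha to blobRoot/<sha[:2]>/<sha>, resolves symlinks on
// both the root and the candidate, and rejects anything that escapes the root.
func resolveBlob(blobRoot, sha string) (string, error) {
	if len(sha) < 2 {
		return "", fmt.Errorf("sha %q is too short", sha)
	}
	root, err := filepath.EvalSymlinks(blobRoot)
	if err != nil {
		return "", err
	}
	resolved, err := filepath.EvalSymlinks(filepath.Join(root, sha[:2], sha))
	if err != nil {
		return "", err // also covers a blob file that is missing on disk
	}
	rel, err := filepath.Rel(root, resolved)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errOutsideRoot
	}
	return resolved, nil
}

// demo builds a throwaway blob root holding one real blob and one symlink
// that points outside the root, then resolves both.
func demo() (insideErr, escapeErr error) {
	base, err := os.MkdirTemp("", "blobdemo")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(base)
	root := filepath.Join(base, "blobs")
	secret := filepath.Join(base, "secret.txt")
	steps := []error{
		os.MkdirAll(filepath.Join(root, "ab"), 0o755),
		os.MkdirAll(filepath.Join(root, "cd"), 0o755),
		os.WriteFile(filepath.Join(root, "ab", "abc123"), []byte("blob"), 0o600),
		os.WriteFile(secret, []byte("outside"), 0o600),
		os.Symlink(secret, filepath.Join(root, "cd", "cdef42")),
	}
	for _, e := range steps {
		if e != nil {
			return e, e
		}
	}
	_, insideErr = resolveBlob(root, "abc123")
	_, escapeErr = resolveBlob(root, "cdef42")
	return insideErr, escapeErr
}

func main() {
	inside, escape := demo()
	fmt.Println("inside:", inside)
	fmt.Println("escape:", escape)
}
```

Resolving the root as well as the candidate matters on systems where the temp or data directory is itself a symlink; otherwise a legitimate blob could look like an escape.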
Returns the per-workspace memory configuration. Drives the retention sweep (versions_retention_days) and is the operator’s read surface for inspecting drift between “what’s stored on the row” and “what’s effective”.
Required role: ADMIN or OWNER (manage permission)
Partial-merge update of the per-workspace memory configuration. Merges the request body’s keys into the existing JSON document; unspecified keys are preserved.
Required role: ADMIN or OWNER (manage permission)
Returns the post-merge config in the same shape as the GET response above. A PATCH that produces no diff (e.g. resetting to the same value) returns 200 with the current shape and emits NO journal entry — the audit trail tracks actual change, not request count.
Missing workspace context, malformed JSON, trailing garbage after the JSON value, empty body, or versions_retention_days outside [1, 3650] / not a positive integer
403
MEMBER role
413
Request body exceeds 16 KB
500
UPDATE / commit failure
The read-merge-write runs inside a SQLite BEGIN IMMEDIATE (serializable) transaction, so concurrent PATCHes touching different keys are serialised instead of racing under last-write-wins. Each real diff emits a memory.config_updated journal entry (Notice severity, ActorUser) with payload {workspace_id, changes: {field: {from, to}}}. If the stored JSON is corrupt, PATCH still succeeds (it treats the existing document as empty) so operators can fix the row without resorting to manual SQL.
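The merge-and-diff step can be illustrated in isolation, without the transaction or the journal write. This is a sketch under assumptions: mergeConfig and change are hypothetical names, and the validation covers only the versions_retention_days rule stated in the error table.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"reflect"
)

// change matches the {from, to} pair carried in the journal payload.
type change struct {
	From any `json:"from"`
	To   any `json:"to"`
}

// mergeConfig overlays patch onto stored and reports only real differences.
// A corrupt stored document is treated as empty.
func mergeConfig(stored, patch []byte) (map[string]any, map[string]change, error) {
	merged := map[string]any{}
	if err := json.Unmarshal(stored, &merged); err != nil || merged == nil {
		merged = map[string]any{}
	}
	var p map[string]any
	if err := json.Unmarshal(patch, &p); err != nil {
		return nil, nil, fmt.Errorf("malformed patch: %w", err)
	}
	if v, ok := p["versions_retention_days"]; ok {
		f, isNum := v.(float64)
		if !isNum || f != math.Trunc(f) || f < 1 || f > 3650 {
			return nil, nil, fmt.Errorf("versions_retention_days must be an integer in [1, 3650]")
		}
	}
	changes := map[string]change{}
	for k, v := range p {
		old, existed := merged[k]
		if !existed || !reflect.DeepEqual(old, v) {
			changes[k] = change{From: old, To: v}
			merged[k] = v
		}
	}
	return merged, changes, nil
}

func main() {
	merged, changes, err := mergeConfig(
		[]byte(`{"versions_retention_days":30,"other":"keep"}`),
		[]byte(`{"versions_retention_days":90}`),
	)
	if err != nil {
		panic(err)
	}
	out, _ := json.Marshal(changes)
	fmt.Println(len(merged), string(out))
}
```

An empty changes map corresponds to the no-diff case, where the endpoint returns 200 and no journal entry is emitted.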
The operator-facing Keeper audit log, plus the Phase 2 F4 evaluator routes. The audit-log read requires ADMIN or OWNER (manage permission); the Phase 2 evaluator routes are internal-auth (see below).
Returns the Keeper access request audit log: every credential access and command execution request evaluated by the Keeper.
Required role: ADMIN or OWNER (manage permission)
[ { "id": "request-uuid", "agent_id": "agent-uuid", "agent_name": "Viktor", "crew_id": "crew-uuid", "credential_id": "cred-uuid", "credential_name": "Anthropic Production", "intent": "Need to call the Claude API to analyze code", "request_type": "credential", "command": null, "decision": "ALLOW", "reason": "Low-risk API call with clear intent", "risk_score": 2, "exit_code": null, "ollama_prompt": "...", "ollama_raw_response": "...", "created_at": "2025-01-15T10:30:00Z", "decided_at": "2025-01-15T10:30:01Z" }]
| Field | Type | Description |
| --- | --- | --- |
| id | string | Request ID |
| agent_id | string | Requesting agent ID |
| agent_name | string | Agent display name |
| crew_id | string | Crew the agent belongs to |
| credential_id | string | Credential being accessed |
| credential_name | string | Credential display name |
| intent | string | Agent’s stated intent for access |
| request_type | string | One of credential / execute / skill_review / behavior / memory_health / negative_learning (see Keeper Phase 2 below) |
| command | string? | Shell command (for execute requests); omitted when null |
| decision | string? | ALLOW, DENY, or ESCALATE |
| reason | string? | LLM-generated explanation |
| risk_score | integer? | Risk assessment (1-10 scale) |
| exit_code | integer? | Command exit code (for execute requests); omitted when null |
| ollama_prompt | string? | Full prompt sent to the Keeper LLM; omitted when null |
| ollama_raw_response | string? | Raw LLM response text; omitted when null |
| created_at | string | ISO 8601 request timestamp |
| decided_at | string? | ISO 8601 decision timestamp |
Phase 2 request types (skill_review, behavior, memory_health, negative_learning) populate the same audit log surface as Phase 1 (credential, execute) with the same shape. The intent column carries the F4 evaluator’s structured summary instead of a free-form access reason; the ollama_prompt + ollama_raw_response capture the LLM evaluation round-trip. Filter by request_type to slice the log into per-evaluator views.
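Since Phase 1 and Phase 2 rows share one shape, a client can decode the log once and slice it by request_type. The sketch below is illustrative only: the struct covers a subset of the documented fields, the Go names are hypothetical, and the sample rows are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// KeeperRequest mirrors a subset of the documented fields. Nullable columns
// are pointers so a null or omitted value stays distinct from a zero value.
type KeeperRequest struct {
	ID          string  `json:"id"`
	AgentName   string  `json:"agent_name"`
	RequestType string  `json:"request_type"`
	Intent      string  `json:"intent"`
	Command     *string `json:"command"`
	Decision    *string `json:"decision"`
	RiskScore   *int    `json:"risk_score"`
	ExitCode    *int    `json:"exit_code"`
}

// groupByType decodes the audit log array and buckets rows per evaluator.
func groupByType(body []byte) (map[string][]KeeperRequest, error) {
	var rows []KeeperRequest
	if err := json.Unmarshal(body, &rows); err != nil {
		return nil, fmt.Errorf("decode keeper requests: %w", err)
	}
	out := make(map[string][]KeeperRequest)
	for _, r := range rows {
		out[r.RequestType] = append(out[r.RequestType], r)
	}
	return out, nil
}

func main() {
	// Made-up rows in the documented shape.
	body := []byte(`[
	  {"id":"r1","agent_name":"Viktor","request_type":"credential","intent":"call API","command":null,"decision":"ALLOW","risk_score":2},
	  {"id":"r2","agent_name":"Viktor","request_type":"skill_review","intent":"structured summary","decision":"DENY","risk_score":6}
	]`)
	views, err := groupByType(body)
	if err != nil {
		panic(err)
	}
	for kind, rows := range views {
		fmt.Println(kind, len(rows))
	}
}
```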
Phase 2 endpoints are internal-auth (X-Internal-Token) — they’re invoked by the platform itself (scheduler routines + the post-tool-call hook), not by operators directly. The admin-facing surface is the /api/v1/admin/keeper/requests log above, plus the per-type filters in the admin UI’s “Keeper P2 reviews” panel (PR-F2).
If an operator needs to trigger an evaluator ad-hoc (debugging, manual re-evaluation), the routes are reachable via the internal token. The expected production path is automated: routines fire on cron, the behavior hook fires on tool-call sampling.
F4.1: periodic skill audit. Cron-fires daily 03:00 UTC per Scheduler.RegisterPlatformRoutine. The evaluator reads each skills row + skill_invocations history, asks the F3 Curator aux model whether the skill should stay active, transition to stale (no recent invocations), or be archived (failures dominating). DENY decisions write a blocking inbox row; ALLOW updates skills.lifecycle_state in place.
Request body:
MUST match the request context workspace_id (see Internal IPC).
crew_id
yes
Used for per-crew policy resolution.
skill_id
yes
The skill being reviewed.
skill_name, skill_description, lifecycle_state
yes
Current state snapshot the LLM evaluates.
last_used_at
optional
RFC3339; empty = “never”.
assignments, assigned_agents
optional
Fan-out context for ESCALATE → inbox routing.
stats
optional
30-day invocation + failure counts.
failure_snippets
optional
Up to 3 most-recent failure excerpts (truncated).
Response:
{ "request_id": "kpr_skr_a3f8e2…", "decision": "ALLOW", "reason": "Skill still actively used (4 invocations in 30d, 0 failures)", "risk_score": 1, "verify_after_decide": true, "unverify_after_decide": false, "proposed_lifecycle": "active"}
DENY routes a blocking inbox_items row to the assigned agents’ workspace (per assigned_agents fan-out). ESCALATE routes a MANAGER-targeted blocking row.
F4.2: post-tool-call behavior monitor. Fires from behaviorhook.MaybeEvaluate (orchestrator EventPostToolCall event), sampled at the per-crew rate (default 1-in-5). The evaluator reads the (tool_name, tool_args_snippet, current crew behavior_mode) triple and returns ALLOW / DENY / ESCALATE.
behavior_mode=warn (default): DENY → non-blocking inbox; agent’s NEXT tool call proceeds.
behavior_mode=block: DENY → blocking inbox + ShouldBlock=true in the response so the orchestrator interrupts the agent’s next call.
The forbidden combination (autonomy=full + behavior_mode=block) is rejected at the API and DB layers.
Request body:
{ "request_id": "kpr_bhv_b71e9c…", "decision": "DENY", "reason": "Destructive command without scoped path; recommend restricting to a specific directory", "risk_score": 7, "should_block": true, "policy_decision": "block_inbox"}
policy_decision is the resolved per-crew policy verdict (e.g. inbox_approve, auto_log_inbox, block_inbox) that drives whether an inbox row is written and whether it blocks.
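The behavior_mode rules for a DENY decision reduce to a small mapping. The sketch below is one way to read them; all names are hypothetical, treating an empty mode as warn is an assumption based on warn being the default, and ESCALATE handling is left out because the text does not specify it.

```go
package main

import (
	"errors"
	"fmt"
)

// denyOutcome says how a DENY from the behavior evaluator is surfaced.
type denyOutcome struct {
	BlockingInbox bool // inbox row blocks until handled
	ShouldBlock   bool // orchestrator interrupts the agent's next call
}

// validateCrewPolicy rejects the documented forbidden combination.
func validateCrewPolicy(autonomy, behaviorMode string) error {
	if autonomy == "full" && behaviorMode == "block" {
		return errors.New("autonomy=full cannot be combined with behavior_mode=block")
	}
	return nil
}

// onDeny maps behavior_mode to the DENY outcome.
func onDeny(behaviorMode string) (denyOutcome, error) {
	switch behaviorMode {
	case "", "warn": // assumed: empty falls back to the default, warn
		return denyOutcome{}, nil
	case "block":
		return denyOutcome{BlockingInbox: true, ShouldBlock: true}, nil
	default:
		return denyOutcome{}, fmt.Errorf("unknown behavior_mode %q", behaviorMode)
	}
}

func main() {
	out, err := onDeny("block")
	fmt.Println(out.BlockingInbox, out.ShouldBlock, err)
	fmt.Println(validateCrewPolicy("full", "block"))
}
```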
F4.3: periodic memory consolidation review. Cron-fires daily 03:30 UTC. The evaluator consumes a consolidate.HealthSnapshot (the 5-metric health score: Freshness / Coverage / Coherence / Efficiency / Reachability, each in [0, 100], plus the weighted Overall) and decides whether to auto-trigger a consolidation routine. Staleness and contradiction counts travel as separate top-level body fields, not inside snapshot.
Request body:
snapshot is the consolidate.HealthSnapshot Go struct serialised without JSON tags, so its wire keys are PascalCase (Freshness, Reachability, Overall, …) and each metric is on a 0–100 scale — not the reachability_pct / 0–1 shape an external caller might guess. stalest_entry_days and contradiction_count are top-level fields alongside snapshot, not nested inside it.
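The PascalCase keys follow directly from how encoding/json treats untagged struct fields. The sketch below uses a stand-in for consolidate.HealthSnapshot: the field names come from the text above, while the float64 type, the wrapper struct, and the sample values are assumptions, and any other body fields are omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HealthSnapshot stands in for consolidate.HealthSnapshot. With no JSON tags,
// encoding/json uses the Go field names as wire keys.
type HealthSnapshot struct {
	Freshness, Coverage, Coherence, Efficiency, Reachability, Overall float64
}

// memoryHealthBody is a hypothetical wrapper showing where the counters sit:
// at the top level, alongside snapshot.
type memoryHealthBody struct {
	WorkspaceID        string         `json:"workspace_id"`
	Snapshot           HealthSnapshot `json:"snapshot"`
	StalestEntryDays   int            `json:"stalest_entry_days"`
	ContradictionCount int            `json:"contradiction_count"`
}

// encodeBody serialises the body exactly as encoding/json would send it.
func encodeBody(b memoryHealthBody) (string, error) {
	out, err := json.Marshal(b)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	body, err := encodeBody(memoryHealthBody{
		WorkspaceID:        "ws-1",
		Snapshot:           HealthSnapshot{Freshness: 82, Coverage: 70, Coherence: 91, Efficiency: 64, Reachability: 88, Overall: 79},
		StalestEntryDays:   12,
		ContradictionCount: 3,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

A caller in another language has to reproduce these exact key names, since the Go decoder on the receiving side matches untagged fields by name.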
F4.4: failure-event lesson capture. Fires after a guardrail trip, run failure, or explicit operator “log this lesson” action. The evaluator decides whether the failure is worth a kind=negative lesson in the agent’s lessons.md. ALLOW writes through consolidate.WriteLesson (PR-Z Z.7), which enforces YAML schema + idempotency by ID + flock + atomic-rename.
Self-learning gate: ALLOW auto-applies only when the agent has self_learning_enabled = 1 (migration v106). With self_learning_enabled = 0 (default), ALLOW queues a blocking inbox row with the full lesson proposal in payload_json and the marker "self_learning_gate": "off" so the UI can distinguish the gate-demoted path. See Autonomy + self-learning.
Trigger kinds: run_failed, guardrail_warn, guardrail_error, keeper_execute_deny.
Request body:
When self_learning_enabled = 0 the response still says write_lesson: true (the evaluator’s intent), but no lesson lands on disk; the operator must approve via inbox. To audit these decisions, query the Keeper log with GET /api/v1/admin/keeper/requests?request_type=negative_learning.
How the four evaluators enforce workspace isolation
All four endpoints assert body.workspace_id == ctx.workspace_id via assertBodyWorkspaceMatchesCtx before any evaluator runs. Asymmetric forgery (query=A, body=B) returns 400 Bad Request. Empty ctx workspace also returns 400 — the gate refuses to operate without the middleware that’s supposed to set it. Symmetric forgery (caller picks one workspace consistently) requires PR-F24 token-to-workspace binding to close fully. See Internal IPC — Tenant isolation.
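The gate itself amounts to two comparisons. The function below borrows the name assertBodyWorkspaceMatchesCtx from the text but is only a stand-in: its signature, the error values, and the plain string parameters are assumptions, and both failures would map to 400 Bad Request.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errNoCtxWorkspace    = errors.New("request context carries no workspace_id")
	errWorkspaceMismatch = errors.New("body workspace_id does not match request context")
)

// assertBodyWorkspaceMatchesCtx is a stand-in for the real helper. It refuses
// to run without a context workspace and rejects any body/context mismatch.
func assertBodyWorkspaceMatchesCtx(ctxWorkspaceID, bodyWorkspaceID string) error {
	if ctxWorkspaceID == "" {
		return errNoCtxWorkspace
	}
	if bodyWorkspaceID != ctxWorkspaceID {
		return errWorkspaceMismatch
	}
	return nil
}

func main() {
	fmt.Println(assertBodyWorkspaceMatchesCtx("ws-a", "ws-a")) // consistent caller
	fmt.Println(assertBodyWorkspaceMatchesCtx("ws-a", "ws-b")) // asymmetric forgery
	fmt.Println(assertBodyWorkspaceMatchesCtx("", "ws-a"))     // middleware missing
}
```

Note that a caller who consistently names the same foreign workspace in both places passes this check, which is the symmetric-forgery gap mentioned above.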
Small cross-cutting endpoints that don’t belong to any single domain handler. They surface install state, telemetry consent, the running binary’s version, and dashboard time-series metrics.
setup-status and telemetry are intentionally unauthenticated because the login page needs to read them before any session exists; version and metrics/timeseries require an authenticated user (metrics/timeseries also needs workspace context).
First-run gate. Returns whether the install needs to be bootstrapped (empty users table) and whether public signup is enabled. The login page calls this on every page paint; when needs_bootstrap is true, the browser routes to /bootstrap instead of /login.
Auth: none, because the answer is what tells the browser which page to render.
Read-only consent gate for the frontend’s Sentry client. The Next.js sentry.client.config.ts fetches this before calling Sentry.init and bails out if enabled=false. Consent is flipped via the CLI (crewship telemetry on/off), never over HTTP: making this endpoint mutating would create a CSRF vector that flips the bit on every cross-site navigation.
Auth: none, because the login page must boot crash reporting before any session exists.
Reports the running binary’s version and (cache-permitting) the latest release from GitHub. The web UI uses this to render an “update available” banner.
Auth: required (any authenticated user, no workspace role needed).
The running binary’s version (SetVersion-injected from cmd_start).
latest
string?
Latest release tag from GitHub. null on a cold cache + GitHub timeout.
newer
boolean
true when latest > current by semver.
url
string?
Release page URL. null whenever latest is null.
The handler imposes a 4 s upper bound on top of the update package’s 5 s internal HTTP timeout — a cold cache + slow network still returns “no info” rather than blocking the UI render.
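The outer bound can be pictured as a context deadline wrapped around the lookup. This is a sketch of the pattern, not the handler's code: latestWithDeadline and demo are hypothetical, the lookup is simulated, and "v9.9.9" is a placeholder tag.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// latestWithDeadline runs lookup under an outer deadline and returns nil
// ("no info") if the lookup fails or the deadline passes first.
func latestWithDeadline(parent context.Context, limit time.Duration, lookup func(context.Context) (string, error)) *string {
	ctx, cancel := context.WithTimeout(parent, limit)
	defer cancel()

	type result struct {
		tag string
		err error
	}
	ch := make(chan result, 1) // buffered so a late lookup never blocks on send
	go func() {
		tag, err := lookup(ctx)
		ch <- result{tag, err}
	}()

	select {
	case r := <-ch:
		if r.err != nil {
			return nil
		}
		return &r.tag
	case <-ctx.Done():
		return nil
	}
}

// demo simulates a lookup that takes lookupMs under a limit of limitMs.
func demo(limitMs, lookupMs int) *string {
	slow := func(ctx context.Context) (string, error) {
		select {
		case <-time.After(time.Duration(lookupMs) * time.Millisecond):
			return "v9.9.9", nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
	return latestWithDeadline(context.Background(), time.Duration(limitMs)*time.Millisecond, slow)
}

func main() {
	fmt.Println(demo(200, 1) != nil)  // fast lookup: a tag is returned
	fmt.Println(demo(10, 500) != nil) // slow lookup: caller gets "no info"
}
```

In the real handler the limit would be the 4 s bound, with the lookup's own 5 s HTTP timeout acting as the inner backstop.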
Bucketed time-series metrics for the dashboard charts. Returns zero-filled bucket sequences so the client never has to patch visual gaps. Reads workspace_id from the request context, never from a query param.
Auth: required + workspace context.
Bucket-start in UTC, aligned to wall-clock boundaries (15m → :00/:15/:30/:45, 1h → top of hour, 1d → UTC midnight).
buckets[].series
object
Map of series key to numeric value. Always floats on the wire so cost and counts share the JSON shape.
series_labels
object
Map of series key to display label (e.g. crew name, model name, status). For group_by=none always contains {"total": "Total"}.
| Status | Condition |
| --- | --- |
| 400 | Unknown metric / window / bucket / group_by; bucket larger than window; combination produces >200 buckets; group_by=model for a metric other than cost_usd; group_by=crew for a metric other than issues_closed/runs_count; group_by=status for a metric other than issues_closed/active_missions. |
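Two of the rules above, wall-clock bucket alignment and the 200-bucket cap, can be sketched with the standard time package. Names are hypothetical, only the three bucket sizes mentioned in the text are modelled, and rounding the bucket count up is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const maxBuckets = 200

// bucketSizes lists the bucket widths named in the text.
var bucketSizes = map[string]time.Duration{
	"15m": 15 * time.Minute,
	"1h":  time.Hour,
	"1d":  24 * time.Hour,
}

// alignRFC3339 returns the UTC bucket start for ts. Truncate works on absolute
// time since the zero time (a UTC midnight), so these widths land on
// :00/:15/:30/:45, the top of the hour, and UTC midnight.
func alignRFC3339(ts, bucket string) (string, error) {
	d, ok := bucketSizes[bucket]
	if !ok {
		return "", fmt.Errorf("unknown bucket %q", bucket)
	}
	t, err := time.Parse(time.RFC3339, ts)
	if err != nil {
		return "", err
	}
	return t.UTC().Truncate(d).Format(time.RFC3339), nil
}

// checkBuckets applies the "bucket larger than window" and ">200 buckets" rules.
func checkBuckets(window, bucket time.Duration) error {
	if window <= 0 || bucket <= 0 {
		return errors.New("window and bucket must be positive")
	}
	if bucket > window {
		return errors.New("bucket larger than window")
	}
	if n := int((window + bucket - 1) / bucket); n > maxBuckets {
		return fmt.Errorf("combination produces %d buckets (max %d)", n, maxBuckets)
	}
	return nil
}

func main() {
	start, err := alignRFC3339("2025-01-15T10:37:12Z", "15m")
	fmt.Println(start, err)
	fmt.Println(checkBuckets(7*24*time.Hour, 15*time.Minute))
}
```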
Licenses are verified using Ed25519 digital signatures:
1. Signed format: A license file contains a JSON object with payload (the claims as a JSON string) and signature (base64-encoded Ed25519 signature).
2. Public key embedding: The Ed25519 public key is embedded into the binary at build time via ldflags. This prevents license tampering by tying verification to the specific build.
3. Signature verification: On startup, Crewship decodes the public key and signature from base64, then verifies the payload using ed25519.Verify().
4. Expiration check: If the license has an expires_at timestamp and it is in the past, the license is rejected and community defaults apply.
5. Fallback: If no license file exists, verification fails, or the license is expired, Crewship runs with community edition defaults.
The public key variable is set at build time. Without a valid public key embedded in the binary, license loading will fail with “no public key embedded in binary” and community defaults will apply.