The Crew Journal is an append-only event stream backed by the journal_entries table (migration 52). Every observable action in Crewship lands here as one row: peer conversations, mission transitions, keeper decisions, LLM calls, approvals, checkpoints, hook fires, and exec/network/file events.

Downstream features (Paymaster, Watch Roster, Episodic Memory, Cartographer) are read-models or middleware over this one stream. If it happened in the platform and nobody can see it in the journal, it didn’t happen.
CREATE TABLE journal_entries (
  id           TEXT PRIMARY KEY,                -- j_<16-hex>
  workspace_id TEXT NOT NULL,
  crew_id      TEXT,
  agent_id     TEXT,
  mission_id   TEXT,
  ts           TEXT NOT NULL,                   -- RFC3339Nano, milli precision
  entry_type   TEXT NOT NULL,                   -- see catalog below
  severity     TEXT NOT NULL DEFAULT 'info',
  priority     TEXT NOT NULL DEFAULT 'normal',  -- normal|high|pin|permanent
  actor_type   TEXT NOT NULL,
  actor_id     TEXT,
  summary      TEXT NOT NULL,                   -- one-line human string
  payload      TEXT NOT NULL DEFAULT '{}',      -- typed JSON by entry_type
  refs         TEXT NOT NULL DEFAULT '{}',      -- parent_entry_id, approval_id, ...
  trace_id     TEXT,
  span_id      TEXT,
  expires_at   TEXT
);
A contentless FTS5 virtual table journal_entries_fts (migration 55) mirrors summary and payload, with insert/update/delete triggers that keep it in sync with the base table.

Entries are immutable: a correction is a new entry that points at the original via refs.parent_entry_id. IDs are 64 random bits rendered as 16 hex characters behind a j_ prefix (for example j_a1b2c3d4e5f60718).
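As an illustration of the ID format described above, here is a minimal sketch. The function name newEntryID is hypothetical, and the use of crypto/rand is an assumption; the source only states that IDs are 64-bit random hex with a j_ prefix.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newEntryID returns an ID shaped like j_<16-hex>: 8 random bytes (64 bits)
// rendered as lowercase hex. The name and the randomness source are
// assumptions made for this sketch.
func newEntryID() (string, error) {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	return "j_" + hex.EncodeToString(b[:]), nil
}

func main() {
	id, err := newEntryID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id) // random on every run, always 18 characters long
}
```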
Since PR #234 (migration 61) the legacy agent_runs table no longer exists; a “run” is reconstructed by grouping journal entries on trace_id (which equals the run id). Five typed entries cover every transition:
| Entry type | When |
| --- | --- |
| run.started | Orchestrator picked up the assignment and exec’d the CLI. |
| run.completed | CLI returned exit 0 within budget. |
| run.failed | Non-zero exit, panic, or upstream API error. |
| run.cancelled | Operator or hook called stop. |
| run.timeout | Run TTL elapsed without a terminal entry. |
Trace-id propagation is explicit: emit sites attach the id to context with journal.WithRunID(ctx, runID), and downstream emitters read it back via RunIDFromContext(ctx). The noop emitter loudly errors on run.* types so a misconfigured wiring fails immediately rather than silently dropping observability.

journal.ListRuns, journal.RunStats and journal.RunInsights reconstruct the row shape, KPIs and windowed operations aggregate that used to come from agent_runs. The /journal UI’s Runs tab (a fleet operations overview, see below) reads all three.
Emitted by internal/containerstate after every successful agent exec. The package probes dpkg-query -W, pip freeze, npm ls -g --json, and /etc/os-release; the result is hashed; the entry is emitted only when the hash changes — so a quiet session is free. Missing probes (e.g. no pip in a Node-only image) soft-fail to empty lists. The payload structure:
This is what the container actually has after agents ran apt-get install / pip install / npm install during a session. Compare with declared intent in devcontainer.json.

Severity is one of info, notice, warn, error. Actor types: agent, user, system, keeper, sidecar, orchestrator.
Priority is an operator-facing importance marker orthogonal to Severity.
Severity answers “how alarming?”; Priority answers “how long do we
remember it and how prominently should it surface at recall time?”.
| Marker | Effect |
| --- | --- |
| normal (default) | No special treatment. |
| high | Recall importance floors at 0.85; subject to normal compaction. |
| pin | Snapshot to /crew/shared/.memory/pins.md at the next consolidator run; recall importance floors at 0.80. |
| permanent | Never compacted. Extracted to learned-*.md on the next consolidator run, skipping the 10-entry threshold (extraction happens on the cadence, not the instant the marker is set). Recall importance floors at 0.95. |
Markers are set via the HTTP endpoint or CLI — agents cannot mark their
own outputs (keeps automation from unilaterally promoting its own
memory). OWNER and ADMIN roles only.
The consolidator (internal/consolidate) honours permanent as a fast-path signal: a single permanent entry triggers rule extraction on the next run regardless of volume. The compactor (internal/consolidate/compact.go) excludes permanent from the 30-day rollup via a WHERE priority != 'permanent' clause, so deliberately-pinned knowledge survives the life of the DB.

The three-level priority scheme (permanent / high / pin) is deliberately small so operator intent is unambiguous and the consolidator can act on a single field.
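To make the priority effects listed above concrete, here is a small sketch of how a recall scorer and the compactor could consult the marker. The function names and the idea of a numeric score being raised to a floor are assumptions; only the floor values and the permanent exclusion come from this page.

```go
package main

import "fmt"

// recallFloor holds the minimum recall importance per marker, as listed above.
// "normal" has no floor and is therefore absent.
var recallFloor = map[string]float64{
	"high":      0.85,
	"pin":       0.80,
	"permanent": 0.95,
}

// effectiveImportance raises score to the marker's floor when it falls below it.
// Hypothetical helper, not the real recall code.
func effectiveImportance(score float64, priority string) float64 {
	if floor, ok := recallFloor[priority]; ok && score < floor {
		return floor
	}
	return score
}

// compactable mirrors the WHERE priority != 'permanent' clause in Go form.
func compactable(priority string) bool {
	return priority != "permanent"
}

func main() {
	for _, p := range []string{"normal", "high", "pin", "permanent"} {
		fmt.Printf("%-9s importance=%.2f compactable=%v\n",
			p, effectiveImportance(0.30, p), compactable(p))
	}
}
```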
type Emitter interface {
	Emit(ctx context.Context, e Entry) (string, error)
	Flush(ctx context.Context) error
}
The production journal.Writer is constructed once at server start and reached via Router.Journal(). Handlers emit without nil checking (noopEmitter is the default):
Emit is asynchronous: entries are queued (buffer 1024), a background goroutine batches up to 64 rows or flushes every 100ms. When the queue is saturated the call falls back to a synchronous write — durability over latency. Call Flush before tearing down a long-running test to guarantee all prior emits are on disk.
GET /api/v1/journal — paginated list, newest first. See the API reference.
GET /api/v1/journal/stream — SSE live tail; seeds with the most recent 50 entries, polls every 1s, emits event: entry frames, and heartbeats every 15s. Reconnect with Last-Event-ID to skip already-seen rows.
GET /api/v1/journal/count — total matching count for a filter set; ignores cursor/limit so the badge stays honest.
GET /api/v1/journal/{id} — single entry (scoped to workspace; cross-tenant IDs return 404).
POST /api/v1/journal/{id}/priority — annotate an entry with normal/high/pin/permanent. OWNER or ADMIN only; emits a memory.priority_changed audit row.
See crewship journal for the full flag reference. The CLI implements live tail via SSE (--follow) with bounded reconnect backoff and Last-Event-ID resume.
Migration 55 adds a contentless FTS5 virtual table journal_entries_fts mirroring summary and payload, with insert / update / delete triggers that keep it in sync with the base table. The ?q= query parameter (CLI: --query / -q) compiles to a phrase-wrapped MATCH against this index:
GET /api/v1/journal?q=ratelimit&since=2026-04-29T00:00:00Z
GET /api/v1/journal/stream?q=approval   # live tail filtered by FTS
The same parameter works on the SSE stream — the seed slice and every subsequent poll apply the FTS filter, so a “watch for OOM” tab stays cheap. q is bounded in length (rejected if absurd) and merges with structural filters (crew_id, severity, …) via AND.
Returns a workspace-scoped map of crews, agents, and missions so the UI can render entry cards with palette-coloured chips and lucide icons without joining on the streaming path. The lookup payload is fetched once on page mount (the React JournalLookupProvider caches it) and invalidated by realtime events (new crew, renamed agent, …). useJournalLookup is the consumer hook; backend handler is internal/api/journal_lookup.go.

The endpoint returns:
Entries themselves never carry display strings; the journal stores stable IDs only. Renaming a crew updates the lookup on the next fetch, so historical journal rows are rendered with the new name from then on.
Since PR #234 the standalone /runs page is folded into /journal as a preset tab. The /runs URL serves a redirect to /journal?tab=runs. The tab strip is Timeline | Runs | Stats:
Timeline — chronological event stream with FTS, severity, type, and crew filters.
Runs — a fleet operations overview (not just a list). Because it reads the run superset — every run in the workspace, including ad-hoc agent/chat/delegation runs that never touch a routine — it surfaces breakdowns the routine-scoped Routines → Insights view structurally can’t. See below.
Stats — workspace KPIs (calls/day, top entry types, top error types).
All three share the same SSE stream and FTS index — selecting a tab does not retrigger a fetch, only a client-side filter shift. Audit-tab navigation is dropped (it was always a redirect placeholder); the /audit page remains in the sidebar for security review.
The Runs tab has four sections. The KPI row and breakdowns (2-3) are scoped by a
24h / 7d / 30d window selector; the live pulse and recent-runs table (1, 4) are not:
1. Live pulse: every currently-running execution across the fleet, with live-ticking elapsed time. Sourced from /api/v1/runs?status=RUNNING.
2. KPI row: total runs in the window with an outcome split-bar (succeeded vs failed), success rate, failure count, and median / p95 duration.
3. Breakdowns: by trigger (schedule / agent / user / webhook / system), top crews (volume + fail rate), and by model (e.g. Opus vs Sonnet, the resolved model recorded on each run).
4. Recent runs table: a filterable list (status + trigger); each row deep-links to that run’s trace in the Timeline. Includes the resolved Model column.
Sections 2 and 3 are backed by a dedicated aggregate:
GET /api/v1/runs/insights?window=24h # 24h | 7d | 30d
Response: { window, totals{total,succeeded,failed,running}, duration{p50_ms,p95_ms}, by_trigger[], by_model[], by_crew[], top_agents[], truncated }.

The handler reconstructs runs by grouping journal_entries on trace_id over the window (journal.RunInsights) and folds the outcome / duration / breakdown counters in Go; the API layer then resolves agent_id → crew + display names. Aggregation is bounded to the most-recent maxInsightRows runs in the window; when that cap is hit, truncated is true and the UI says so rather than presenting a partial total as complete.

CLI parity (drive it the same way an agent would):
crewship run insights              # last 24h
crewship run insights --window 7d
crewship run insights -o json      # machine-readable for scripting
Workspace isolation is enforced at the store level (journal.List/Get/Count take a workspace filter and refuse to run without one). The handler additionally pulls workspace_id from the session context, so there is no way for a caller to pass a foreign workspace id through a query parameter.

Cross-tenant existence is never leaked: unknown IDs return 404 with the same shape as “not in your workspace”. The shared crewBelongsToWorkspace / missionBelongsToWorkspace helpers (defined in internal/api/paymaster_handler.go and reused across every read handler) enforce the same contract; see the Paymaster API reference for the endpoints that exercise them.
expires_at is an optional TTL; compaction skips rows past it.
The daily Compactor (see Consolidate) rolls up info/notice rows older than 30 days into one system.compaction entry and deletes the originals. warn/error rows are kept indefinitely.
exec.output_chunk, container.metrics, and network.* are never embedded into Episodic memory because they would drown the signal.
TS precision. Writes serialise as 2006-01-02T15:04:05.000Z (milli). Reads also accept RFC3339Nano and second-precision strings so backfilled rows don’t fail to parse.
Empty trace/span. The trace_id/span_id columns are populated by the tracing package’s SetTraceResolver. If OpenTelemetry is not initialised the columns stay NULL — this is fine, not a bug.
Do not rename entry types. A rename breaks every existing row. Add a new type and dual-write during the transition instead.