# Crew Journal

> Append-only event stream that is the single source of truth for every observable action in the platform.

The Crew Journal is an append-only event stream backed by the `journal_entries` table (migration 52). Every observable action in Crewship lands here as one row: peer conversations, mission transitions, keeper decisions, LLM calls, approvals, checkpoints, hook fires, exec/network/file events.

Downstream features -- [Paymaster](/guides/paymaster), [Watch Roster](/guides/watch-roster), [Episodic Memory](/guides/episodic-memory), [Cartographer](/guides/cartographer) -- are read-models or middleware over this one stream. If it happened in the platform and nobody can see it in the journal, it didn't happen.

## Schema

```sql theme={null}
CREATE TABLE journal_entries (
  id           TEXT PRIMARY KEY,          -- j_<16-hex>
  workspace_id TEXT NOT NULL,
  crew_id      TEXT,
  agent_id     TEXT,
  mission_id   TEXT,
  ts           TEXT NOT NULL,             -- RFC3339 UTC, millisecond precision
  entry_type   TEXT NOT NULL,             -- see catalog below
  severity     TEXT NOT NULL DEFAULT 'info',
  priority     TEXT NOT NULL DEFAULT 'normal', -- normal|high|pin|permanent
  actor_type   TEXT NOT NULL,
  actor_id     TEXT,
  summary      TEXT NOT NULL,             -- one-line human string
  payload      TEXT NOT NULL DEFAULT '{}',-- typed JSON by entry_type
  refs         TEXT NOT NULL DEFAULT '{}',-- parent_entry_id, approval_id, ...
  trace_id     TEXT,
  span_id      TEXT,
  expires_at   TEXT
);
```

A contentless FTS5 virtual table `journal_entries_fts` (migration 55) mirrors `summary` and `payload`, with insert/update/delete triggers that keep it in sync with the base table.

Entries are immutable -- corrections are new entries with `refs.parent_entry_id`. IDs are 64-bit random hex (`j_a1b2c3d4e5f60718`).
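The ID shape above (`j_` plus 16 hex characters, i.e. 8 random bytes) can be sketched in Go as follows. This is a minimal illustration, not the platform's generator: `newEntryID` is a hypothetical name, and the use of `crypto/rand` is an assumption, since the source only says the IDs are random.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newEntryID returns "j_" followed by 16 lowercase hex characters
// (8 random bytes = 64 bits). Hypothetical helper for illustration only.
func newEntryID() (string, error) {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	return "j_" + hex.EncodeToString(b[:]), nil
}

func main() {
	id, err := newEntryID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id) // different on every run, same shape as j_a1b2c3d4e5f60718
}
```

With 64 random bits, collisions are unlikely but not impossible at very high volume, which is one reason the column is also a `PRIMARY KEY`: a duplicate would fail the insert rather than overwrite a row.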

### Entry type catalog

Defined in `internal/journal/types.go`. Every type is a stable string -- renames require a backfill migration.

<Accordion title="Full entry-type catalog by bucket">
  | Bucket                   | Entry types                                                                                                                                                                 |
  | ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
  | Communication            | `peer.conversation`, `peer.escalation`, `message.broadcast`, `agent.mentioned`                                                                                              |
  | Mission                  | `mission.status_change`, `mission.comment`, `assignment.created/running/completed/failed`, `crew.action`, `task.delegated`                                                  |
  | **Runs** (since PR #234) | `run.started`, `run.completed`, `run.failed`, `run.cancelled`, `run.timeout`                                                                                                |
  | Security                 | `keeper.request`, `keeper.decision`, `keeper.rule_auto_tuned`, `guardrail.input_blocked`, `guardrail.output_blocked`, `approval.requested/granted/denied/timeout/cancelled` |
  | Cost                     | `llm.call`, `llm.cache_hit`, `cost.incurred`, `budget.exceeded`, `budget.warning`                                                                                           |
  | Memory                   | `memory.updated`, `memory.consolidated`, `memory.priority_changed`, `summary.generated`                                                                                     |
  | Observability            | `exec.command`, `exec.output_chunk`, `network.port_opened/closed`, `network.egress`, `file.written`, `container.metrics`, `container.snapshot`                              |
  | Presence                 | `agent.status_change`                                                                                                                                                       |
  | Checkpointing            | `checkpoint.created`, `checkpoint.restored`, `fork.created`                                                                                                                 |
  | Eval                     | `eval.run_started`, `eval.metric`, `eval.regression_detected`                                                                                                               |
  | Hooks                    | `hook.fired`, `hook.blocked`                                                                                                                                                |
  | Pipeline                 | `pipeline.run.started/completed/failed`, `pipeline.step.started/completed/failed/validation_failed`, `pipeline.dry_run`                                                     |
  | Credentials              | `credential.auto_assign_failed`, `credential.auto_assign_empty`                                                                                                             |
  | Skills                   | `skill.imported`, `skill.deleted`, `skill.assigned`, `skill.unassigned`, `skill.invoked`                                                                                    |
  | Audit                    | `audit.entity_created/updated/deleted/restored` (mirrors `audit_logs`)                                                                                                      |
  | Provisioning             | `provisioning.queued/building/complete/failed` (crew runtime container build)                                                                                               |
  | Chat                     | `chat.user_message`, `chat.agent_response` (user↔agent turns)                                                                                                               |
  | Agent error              | `agent.error` (panic recoveries + stream-handling failures)                                                                                                                 |
  | System                   | `system.compaction`, `system.migration`, `system.hook_toggled`, `system.consolidation_triggered/completed`                                                                  |
</Accordion>

#### `run.*` — agent run lifecycle

Since PR #234 (migration 61) the legacy `agent_runs` table no longer exists; a "run" is reconstructed by grouping journal entries on `trace_id` (which equals the run id). Five typed entries cover every transition:

| Entry type      | When                                                      |
| --------------- | --------------------------------------------------------- |
| `run.started`   | Orchestrator picked up the assignment and exec'd the CLI. |
| `run.completed` | CLI returned exit 0 within budget.                        |
| `run.failed`    | Non-zero exit, panic, or upstream API error.              |
| `run.cancelled` | Operator or hook called stop.                             |
| `run.timeout`   | Run TTL elapsed without a terminal entry.                 |

Trace-id propagation is explicit: emit sites attach the id to context with `journal.WithRunID(ctx, runID)`, and downstream emitters read it back via `RunIDFromContext(ctx)`. The `noop` emitter loudly errors on `run.*` types so a misconfigured wiring fails immediately rather than silently dropping observability.
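The propagation pattern described above is the standard Go context-value idiom. The sketch below reuses the names `WithRunID` and `RunIDFromContext` from the text, but their bodies, the `(string, bool)` return shape, and the `startRun` helper are assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// runIDKey is an unexported type so no other package can collide with the key.
type runIDKey struct{}

// WithRunID returns a child context carrying runID.
func WithRunID(ctx context.Context, runID string) context.Context {
	return context.WithValue(ctx, runIDKey{}, runID)
}

// RunIDFromContext returns the run id and whether a non-empty one was attached.
// The (string, bool) shape is an assumption, not the documented signature.
func RunIDFromContext(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(runIDKey{}).(string)
	return id, ok && id != ""
}

// startRun stands in for an emit site that begins a run (hypothetical).
func startRun(runID string) context.Context {
	return WithRunID(context.Background(), runID)
}

func main() {
	ctx := startRun("run_123")
	id, ok := RunIDFromContext(ctx)
	fmt.Println(id, ok) // run_123 true

	_, ok = RunIDFromContext(context.Background())
	fmt.Println(ok) // false
}
```

A downstream emitter would copy the recovered id into the entry's `trace_id` before writing, which is what lets the run be regrouped later.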

`journal.ListRuns`, `journal.RunStats` and `journal.RunInsights` reconstruct the row shape, KPIs and windowed operations aggregate that used to come from `agent_runs`. The `/journal` UI's `Runs` tab (a fleet operations overview — see [below](#runs-fleet-operations-overview)) reads all three.

#### `container.snapshot` — container actuals

Emitted by `internal/containerstate` after every successful agent exec. The package probes `dpkg-query -W`, `pip freeze`, `npm ls -g --json`, and `/etc/os-release`; the result is hashed; the entry is emitted **only** when the hash changes — so a quiet session is free. Missing probes (e.g. no `pip` in a Node-only image) soft-fail to empty lists. The payload structure:

```json theme={null}
{
  "os_name": "Debian GNU/Linux 12 (bookworm)",
  "packages": {
    "apt": [{"name": "curl", "version": "7.88.1-10+deb12u5"}, ...],
    "pip": [{"name": "requests", "version": "2.31.0"}, ...],
    "npm": [{"name": "typescript", "version": "5.4.5"}, ...]
  },
  "hash": "sha256:..."
}
```

This is **what the container actually has** after agents ran `apt-get install` / `pip install` / `npm install` during a session. Compare with declared intent in [`devcontainer.json`](/guides/devcontainers#container-actuals).
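The emit-only-on-change behaviour can be sketched as a hash comparison against the previously seen snapshot. The source does not say which fields are hashed or how they are canonicalised, so the sorted-list-plus-JSON approach below is an assumption, and `Snapshot`, `hashSnapshot`, and `tracker` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"sort"
)

type Package struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

type Snapshot struct {
	OSName   string               `json:"os_name"`
	Packages map[string][]Package `json:"packages"` // keys: apt, pip, npm
}

// hashSnapshot sorts each package list in place (so probe ordering does not
// change the result) and hashes the JSON encoding. encoding/json writes map
// keys in sorted order, which keeps the encoding stable.
func hashSnapshot(s Snapshot) (string, error) {
	for _, list := range s.Packages {
		sort.Slice(list, func(i, j int) bool { return list[i].Name < list[j].Name })
	}
	raw, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return "sha256:" + hex.EncodeToString(sum[:]), nil
}

// tracker remembers the last hash seen for one container.
type tracker struct{ last string }

// shouldEmit reports whether s differs from the previously seen snapshot.
func (t *tracker) shouldEmit(s Snapshot) (bool, error) {
	h, err := hashSnapshot(s)
	if err != nil {
		return false, err
	}
	if h == t.last {
		return false, nil
	}
	t.last = h
	return true, nil
}

func main() {
	tr := &tracker{}
	s := Snapshot{
		OSName:   "Debian GNU/Linux 12 (bookworm)",
		Packages: map[string][]Package{"apt": {{Name: "curl", Version: "7.88.1-10+deb12u5"}}},
	}
	first, _ := tr.shouldEmit(s)
	second, _ := tr.shouldEmit(s) // nothing changed
	s.Packages["pip"] = []Package{{Name: "requests", Version: "2.31.0"}}
	third, _ := tr.shouldEmit(s) // agent ran pip install
	fmt.Println(first, second, third) // true false true
}
```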

For every entry type, `severity` is one of `info`, `notice`, `warn`, `error`, and `actor_type` is one of `agent`, `user`, `system`, `keeper`, `sidecar`, `orchestrator`.

## Priority markers

Priority is an operator-facing importance marker orthogonal to Severity.
Severity answers *"how alarming?"*; Priority answers *"how long do we
remember it and how prominently should it surface at recall time?"*.

| Marker             | Effect                                                                                                                                                                                                               |
| ------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `normal` (default) | No special treatment.                                                                                                                                                                                                |
| `high`             | Recall importance floors at 0.85; subject to normal compaction.                                                                                                                                                      |
| `pin`              | Snapshot to `/crew/shared/.memory/pins.md` at next consolidate run; recall importance floors at 0.80.                                                                                                                |
| `permanent`        | Never compacted. Extracted to `learned-*.md` on the next consolidator run, skipping the 10-entry threshold (extraction happens on the cadence, not the instant the marker is set). Recall importance floors at 0.95. |

<Note>
  Markers are set via the HTTP endpoint or CLI — agents cannot mark their
  own outputs (keeps automation from unilaterally promoting its own
  memory). OWNER and ADMIN roles only.
</Note>

```bash theme={null}
# HTTP
curl -X POST https://host/api/v1/journal/j_abc/priority \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"priority":"permanent","reason":"FX compliance constraint"}'

# CLI
crewship journal priority j_abc --mark permanent --reason "FX compliance constraint"
```

The consolidator (`internal/consolidate`) honours `permanent` as a
fast-path signal: a single permanent entry triggers rule extraction
on the next run regardless of volume. The compactor
(`internal/consolidate/compact.go`) excludes `permanent` from the
30-day rollup via a `WHERE priority != 'permanent'` clause, so
knowledge deliberately marked `permanent` survives the life of the DB.

The priority scheme is deliberately small (three markers above the
`normal` default: `high`, `pin`, `permanent`) so operator intent is
unambiguous and the consolidator can act on a single field.
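The recall floors in the marker table amount to a lower bound on whatever importance score recall computes. A minimal sketch, assuming the score is a float in `[0, 1]`; `recallFloor` and `recallImportance` are hypothetical names, and only the three floor values come from the table.

```go
package main

import "fmt"

// recallFloor holds the floors from the marker table; "normal" has none.
var recallFloor = map[string]float64{
	"high":      0.85,
	"pin":       0.80,
	"permanent": 0.95,
}

// recallImportance raises score to the marker's floor when it falls below it.
func recallImportance(priority string, score float64) float64 {
	if floor, ok := recallFloor[priority]; ok && score < floor {
		return floor
	}
	return score
}

func main() {
	fmt.Println(recallImportance("normal", 0.30))    // 0.3
	fmt.Println(recallImportance("permanent", 0.30)) // 0.95
	fmt.Println(recallImportance("high", 0.90))      // 0.9
}
```

A floor only lifts low scores; an entry that already scores above its floor is left alone.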

## Writing

The write surface is `journal.Emitter`:

```go theme={null}
type Emitter interface {
    Emit(ctx context.Context, e Entry) (string, error)
    Flush(ctx context.Context) error
}
```

The production `journal.Writer` is constructed once at server start and reached via `Router.Journal()`. Handlers can emit without a nil check because `noopEmitter` is the default:

```go theme={null}
_, _ = h.journal.Emit(ctx, journal.Entry{
    WorkspaceID: ws, CrewID: crew, AgentID: agent,
    Type:        journal.EntryKeeperDecision,
    Severity:    journal.SeverityWarn,
    ActorType:   journal.ActorKeeper,
    Summary:     "keeper denied production SSH",
    Payload:     map[string]any{"credential_id": credID, "risk_score": 8},
})
```

Emit is asynchronous: entries are queued (buffer 1024), and a background goroutine writes them in batches of up to 64 rows, flushing at least every 100ms. When the queue is saturated the call falls back to a synchronous write, trading latency for durability. Call `Flush` before tearing down a long-running test to guarantee all prior emits are on disk.
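The queue, batch, and fallback behaviour can be sketched with a buffered channel and a ticker. This is an illustration under simplifying assumptions, not the production `journal.Writer`: the signatures drop `context.Context` and the returned id, an in-memory slice stands in for the table, and the background goroutine is never shut down.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Entry is trimmed to one field; the real journal.Entry carries many more.
type Entry struct{ Summary string }

// Writer is a hypothetical stand-in for the production writer.
type Writer struct {
	queue chan Entry         // buffered queue, capacity 1024
	flush chan chan struct{} // Flush requests, acknowledged by closing the inner channel
	mu    sync.Mutex
	rows  []Entry // stands in for the journal_entries table
}

func NewWriter() *Writer {
	w := &Writer{queue: make(chan Entry, 1024), flush: make(chan chan struct{})}
	go w.loop() // runs for the life of the process in this sketch
	return w
}

// write appends a batch to the fake table under the lock.
func (w *Writer) write(batch []Entry) {
	w.mu.Lock()
	w.rows = append(w.rows, batch...)
	w.mu.Unlock()
}

// Emit enqueues without blocking; a full queue falls back to a synchronous write.
func (w *Writer) Emit(e Entry) {
	select {
	case w.queue <- e:
	default:
		w.write([]Entry{e})
	}
}

func (w *Writer) loop() {
	tick := time.NewTicker(100 * time.Millisecond)
	batch := make([]Entry, 0, 64)
	commit := func() {
		w.write(batch)
		batch = batch[:0]
	}
	for {
		select {
		case e := <-w.queue:
			batch = append(batch, e)
			if len(batch) >= 64 {
				commit()
			}
		case <-tick.C:
			commit()
		case done := <-w.flush:
			// Drain whatever is still queued, then write it in one go.
			for drained := false; !drained; {
				select {
				case e := <-w.queue:
					batch = append(batch, e)
				default:
					drained = true
				}
			}
			commit()
			close(done)
		}
	}
}

// Flush blocks until everything emitted before the call has been written.
func (w *Writer) Flush() {
	done := make(chan struct{})
	w.flush <- done
	<-done
}

// Len reports how many rows have been written so far.
func (w *Writer) Len() int {
	w.mu.Lock()
	defer w.mu.Unlock()
	return len(w.rows)
}

func main() {
	w := NewWriter()
	for i := 0; i < 3000; i++ {
		w.Emit(Entry{Summary: fmt.Sprintf("entry %d", i)})
	}
	w.Flush()
	fmt.Println(w.Len()) // 3000
}
```

One consequence of the fallback in this sketch is that a synchronously written entry can land ahead of entries still waiting in the queue, so readers should order by `ts` rather than by insertion order.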

## Reading

### HTTP

* `GET /api/v1/journal` -- paginated list, newest first. See the [API reference](/api-reference/journal).
* `GET /api/v1/journal/stream` -- SSE live tail; seeds with the most recent 50 entries, polls every 1s, emits `event: entry` frames, and heartbeats every 15s. Reconnect with `Last-Event-ID` to skip already-seen rows.
* `GET /api/v1/journal/count` -- total matching count for a filter set; ignores cursor/limit so the badge stays honest.
* `GET /api/v1/journal/{id}` -- single entry (scoped to workspace; cross-tenant IDs return 404).
* `POST /api/v1/journal/{id}/priority` -- annotate an entry with `normal`/`high`/`pin`/`permanent`. OWNER or ADMIN only; emits a `memory.priority_changed` audit row.
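A client of the stream endpoint has to parse Server-Sent Events frames and remember the last id it saw. The sketch below parses a captured stream; the `event: entry` name comes from the text, while the use of the `id:` field for the entry id, the JSON shape in `data:`, and the comment-line heartbeat are assumptions about the wire format. `Frame`, `readFrames`, and `parse` are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// Frame is one dispatched SSE event.
type Frame struct{ ID, Event, Data string }

// readFrames parses an SSE stream and returns the frames plus the last id
// seen, which a client would send back as Last-Event-ID when reconnecting.
func readFrames(r io.Reader) ([]Frame, string, error) {
	var (
		frames []Frame
		cur    Frame
		data   []string
		lastID string
	)
	sc := bufio.NewScanner(r) // default 64 KiB line limit; large payloads need sc.Buffer
	for sc.Scan() {
		line := sc.Text()
		switch {
		case line == "": // a blank line dispatches the pending frame
			if len(data) > 0 {
				cur.Data = strings.Join(data, "\n")
				frames = append(frames, cur)
				if cur.ID != "" {
					lastID = cur.ID
				}
			}
			cur, data = Frame{}, nil
		case strings.HasPrefix(line, ":"): // comment line, assumed heartbeat format
		default:
			field, value, _ := strings.Cut(line, ":")
			value = strings.TrimPrefix(value, " ")
			switch field {
			case "id":
				cur.ID = value
			case "event":
				cur.Event = value
			case "data":
				data = append(data, value)
			}
		}
	}
	return frames, lastID, sc.Err()
}

// sample is an invented capture used only to exercise the parser.
const sample = "id: j_a1b2c3d4e5f60718\nevent: entry\ndata: {\"summary\":\"keeper denied production SSH\"}\n\n" +
	": heartbeat\n\n" +
	"id: j_0011223344556677\nevent: entry\ndata: {\"summary\":\"run started\"}\n\n"

// parse is a convenience wrapper for string input.
func parse(raw string) ([]Frame, string, error) {
	return readFrames(strings.NewReader(raw))
}

func main() {
	frames, lastID, err := parse(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(frames), lastID) // 2 j_0011223344556677
}
```

On reconnect, the client would set the `Last-Event-ID` request header to the returned id so the server can skip rows it has already delivered.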

### CLI

```bash theme={null}
crewship journal                                 # last 50 entries
crewship journal --crew backend-team --since 24h
crewship journal --severity warn,error
crewship journal --type peer.escalation,keeper.decision --lines 100
crewship journal --mission MIS-42 --format json
crewship journal --follow                        # live tail via SSE
crewship journal get j_a1b2c3d4                  # single entry
crewship journal count --severity error --since 24h
crewship journal priority j_a1b2c3d4 --mark permanent --reason "FX rule"
```

See [`crewship journal`](/cli/journal) for the full flag reference. The CLI implements live tail via SSE (`--follow`) with bounded reconnect backoff and `Last-Event-ID` resume.

### Filtering

Every filter is AND-combined and indexed at the DB level:

| Query param          | CLI flag                         | SQL predicate                                                  |
| -------------------- | -------------------------------- | -------------------------------------------------------------- |
| `crew_id`            | `--crew`                         | `crew_id = ?`                                                  |
| `agent_id`           | `--agent`                        | `agent_id = ?`                                                 |
| `mission_id`         | `--mission`                      | `mission_id = ?`                                               |
| `trace_id`           | `--trace-id`                     | `trace_id = ?` (one run's spans)                               |
| `crew_ids`           | --                               | CSV `crew_id IN (?,?,...)` (takes precedence over `crew_id`)   |
| `agent_ids`          | --                               | CSV `agent_id IN (?,?,...)` (takes precedence over `agent_id`) |
| `entry_type`         | `--type`                         | CSV `entry_type IN (?,?,...)`                                  |
| `exclude_entry_type` | `--exclude-type`                 | CSV `entry_type NOT IN (?,?,...)`                              |
| `severity`           | `--severity`                     | CSV `IN (?,?,...)`                                             |
| `actor_type`         | `--actor-type`                   | CSV `IN (?,?,...)`                                             |
| `priority`           | `--priority`                     | CSV `IN (?,?,...)`                                             |
| `since`, `until`     | `--since` (`--until` on `count`) | `ts >= ?` / `ts <= ?`                                          |
| `q`                  | `--query` / `-q`                 | FTS5 MATCH on `summary` + `payload` (see below)                |
| `cursor`             | --                               | keyset `(ts, id) <` prior page                                 |

Pagination is keyset (compound `ts, id`), not offset, so deep paging stays O(log n). `limit` is 1-500, default 100 for list, 50 for the SSE seed.
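To make the filter table and the keyset cursor concrete, here is a sketch of how a handler might assemble the query. It is not the store's actual SQL: `Filter`, `inClause`, and `buildQuery` are hypothetical, only two of the CSV filters are shown, and falling back to the default when `limit` is out of range is an assumption (the source gives the range but not the out-of-range behaviour).

```go
package main

import (
	"fmt"
	"strings"
)

// Filter carries a subset of the documented query parameters.
type Filter struct {
	WorkspaceID string
	EntryTypes  string // CSV, e.g. "run.failed,run.timeout"
	Severity    string // CSV
	CursorTS    string // ts of the last row on the previous page; "" for page one
	CursorID    string // id of that same row
	Limit       int
}

// inClause expands a CSV value into "col IN (?,?,...)" plus its arguments.
func inClause(col, csv string) (string, []any) {
	parts := strings.Split(csv, ",")
	args := make([]any, len(parts))
	for i, p := range parts {
		args[i] = strings.TrimSpace(p)
	}
	marks := strings.TrimSuffix(strings.Repeat("?,", len(parts)), ",")
	return col + " IN (" + marks + ")", args
}

// buildQuery AND-combines the filters and appends the keyset predicate.
func buildQuery(f Filter) (string, []any) {
	where := []string{"workspace_id = ?"}
	args := []any{f.WorkspaceID}
	add := func(col, csv string) {
		if csv == "" {
			return
		}
		clause, a := inClause(col, csv)
		where = append(where, clause)
		args = append(args, a...)
	}
	add("entry_type", f.EntryTypes)
	add("severity", f.Severity)
	if f.CursorTS != "" {
		// Rows strictly older than the cursor, with id as the tiebreaker.
		where = append(where, "(ts < ? OR (ts = ? AND id < ?))")
		args = append(args, f.CursorTS, f.CursorTS, f.CursorID)
	}
	limit := f.Limit
	if limit < 1 || limit > 500 {
		limit = 100 // assumption: out-of-range values fall back to the default
	}
	args = append(args, limit)
	return "SELECT * FROM journal_entries WHERE " + strings.Join(where, " AND ") +
		" ORDER BY ts DESC, id DESC LIMIT ?", args
}

func main() {
	q, args := buildQuery(Filter{
		WorkspaceID: "ws_1",
		EntryTypes:  "run.failed,run.timeout",
		CursorTS:    "2026-04-29T00:00:00.000Z",
		CursorID:    "j_a1b2c3d4e5f60718",
	})
	fmt.Println(q)
	fmt.Println(len(args)) // 7
}
```

Comparing `ts` as text only orders correctly because writes use a fixed-width UTC format; the cursor for the next page is simply the `(ts, id)` of the last row returned.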

### Full-text search

Migration 55 adds a contentless FTS5 virtual table `journal_entries_fts` mirroring `summary` and `payload`, with insert / update / delete triggers that keep it in sync with the base table. The `?q=` query parameter (CLI: `--query` / `-q`) compiles to a phrase-wrapped `MATCH` against this index:

```bash theme={null}
crewship journal --query "OOM" --since 24h
crewship journal --query "google.com" --crew backend-team
```

```http theme={null}
GET /api/v1/journal?q=ratelimit&since=2026-04-29T00:00:00Z
GET /api/v1/journal/stream?q=approval         # live tail filtered by FTS
```

The same parameter works on the SSE stream: the seed slice and every subsequent poll apply the FTS filter, so a "watch for `OOM`" tab stays cheap. `q` is length-bounded (over-long queries are rejected) and is AND-combined with the structural filters (`crew_id`, `severity`, ...).
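Phrase-wrapping matters because raw input such as `google.com` contains characters that FTS5 would otherwise treat as query syntax. A minimal sketch of the wrapping step, using FTS5's rule that a string is enclosed in double quotes with embedded quotes doubled; `phraseQuery` is a hypothetical name and the handler's actual implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// phraseQuery turns user input into a single FTS5 phrase: the text is wrapped
// in double quotes and any embedded double quote is doubled.
func phraseQuery(q string) string {
	return `"` + strings.ReplaceAll(q, `"`, `""`) + `"`
}

func main() {
	fmt.Println(phraseQuery("google.com")) // "google.com"
	fmt.Println(phraseQuery(`say "hi"`))   // "say ""hi"""
}
```

The wrapped string is then bound as the parameter of a `MATCH` predicate, so user input never reaches the FTS query parser unquoted.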

### Lookup table (card enrichment)

```http theme={null}
GET /api/v1/journal/lookup
```

Returns a workspace-scoped map of `crews`, `agents`, and `missions` so the UI can render entry cards with palette-coloured chips and lucide icons without joining on the streaming path. The lookup payload is fetched once on page mount (the React `JournalLookupProvider` caches it) and invalidated by realtime events (new crew, renamed agent, ...). `useJournalLookup` is the consumer hook; backend handler is `internal/api/journal_lookup.go`.

The endpoint returns:

```json theme={null}
{
  "crews":    [{"id":"crw_…","slug":"backend","name":"Backend","icon":"server","color":"emerald"}, ...],
  "agents":   [{"id":"agt_…","slug":"viktor","name":"Viktor","crew_id":"crw_…","avatar":"…"}, ...],
  "missions": [{"id":"MIS-42","title":"Migrate auth","status":"in_progress","crew_id":"crw_…"}, ...]
}
```

Entries themselves never carry display strings; the journal stores stable IDs only. Renaming a crew updates the lookup on the next fetch, so historical journal rows are then rendered with the new name.

## Unified runs surface

Since PR #234 the standalone `/runs` page is folded into `/journal` as a preset tab. The `/runs` URL serves a redirect to `/journal?tab=runs`. The tab strip is `Timeline | Runs | Stats`:

* **Timeline** — chronological event stream with FTS, severity, type, and crew filters.
* **Runs** — a **fleet operations overview** (not just a list). Because it reads the run superset — *every* run in the workspace, including ad-hoc agent/chat/delegation runs that never touch a routine — it surfaces breakdowns the routine-scoped [Routines → Insights](/guides/routines) view structurally can't. See below.
* **Stats** — workspace KPIs (calls/day, top entry types, top error types).

All three share the same SSE stream and FTS index — selecting a tab does not retrigger a fetch, only a client-side filter shift. Audit-tab navigation is dropped (it was always a redirect placeholder); the `/audit` page remains in the sidebar for security review.

### Runs — fleet operations overview

The Runs tab has four sections. The KPI row and breakdowns (2-3) are scoped by a
`24h / 7d / 30d` window selector; the live pulse and recent-runs table (1, 4) are not:

1. **Live pulse** — every currently-running execution across the fleet, with live-ticking elapsed time. Sourced from `/api/v1/runs?status=RUNNING`.
2. **KPI row** — total runs in the window with an outcome split-bar (succeeded vs failed), success rate, failure count, and median / p95 duration.
3. **Breakdowns** — by **trigger** (schedule / agent / user / webhook / system), **top crews** (volume + fail rate), and by **model** (e.g. Opus vs Sonnet — the resolved model recorded on each run).
4. **Recent runs table** — a filterable list (status + trigger); each row deep-links to that run's trace in the Timeline. Includes the resolved **Model** column.

Sections 2 and 3 are backed by a dedicated aggregate:

```http theme={null}
GET /api/v1/runs/insights?window=24h    # 24h | 7d | 30d
```

Response: `{ window, totals{total,succeeded,failed,running}, duration{p50_ms,p95_ms},
by_trigger[], by_model[], by_crew[], top_agents[], truncated }`. It reconstructs runs
by grouping `journal_entries` on `trace_id` over the window (`journal.RunInsights`),
folds the outcome / duration / breakdown counters in Go, and the API layer resolves
`agent_id` → crew + display names. Aggregation is bounded to the most-recent
`maxInsightRows` runs in the window; when that cap is hit, `truncated` is `true` and the
UI says so rather than presenting a partial total as complete.
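The grouping step can be sketched as follows. This is an illustration of folding `run.*` entries by `trace_id`, not `journal.RunInsights` itself: the trimmed `Entry` type, the nearest-rank percentile, and counting `run.cancelled` and `run.timeout` as failures are all assumptions.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Entry is trimmed to the fields the fold needs; TSMilli is Unix milliseconds.
type Entry struct {
	TraceID string
	Type    string
	TSMilli int64
}

type Insights struct {
	Total, Succeeded, Failed, Running int
	P50ms, P95ms                      int64
}

// percentile uses the nearest-rank method on an ascending slice.
func percentile(sorted []int64, p float64) int64 {
	if len(sorted) == 0 {
		return 0
	}
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// runInsights groups run.* entries by trace id and folds the counters.
func runInsights(entries []Entry) Insights {
	type run struct {
		start, end int64
		outcome    string
	}
	runs := map[string]*run{}
	for _, e := range entries {
		switch e.Type {
		case "run.started", "run.completed", "run.failed", "run.cancelled", "run.timeout":
		default:
			continue // llm.call and other entries share the trace id but are not transitions
		}
		r := runs[e.TraceID]
		if r == nil {
			r = &run{}
			runs[e.TraceID] = r
		}
		if e.Type == "run.started" {
			r.start = e.TSMilli
		} else {
			r.end, r.outcome = e.TSMilli, e.Type
		}
	}
	var out Insights
	var durations []int64
	for _, r := range runs {
		out.Total++
		switch r.outcome {
		case "":
			out.Running++
			continue // no terminal entry yet, so no duration
		case "run.completed":
			out.Succeeded++
		default:
			out.Failed++ // assumption: cancelled and timeout count as failed
		}
		durations = append(durations, r.end-r.start)
	}
	sort.Slice(durations, func(i, j int) bool { return durations[i] < durations[j] })
	out.P50ms, out.P95ms = percentile(durations, 50), percentile(durations, 95)
	return out
}

func main() {
	in := runInsights([]Entry{
		{"a", "run.started", 0}, {"a", "llm.call", 200}, {"a", "run.completed", 1000},
		{"b", "run.started", 0}, {"b", "run.failed", 3000},
		{"c", "run.started", 500}, {"c", "run.completed", 2500},
		{"d", "run.started", 100},
	})
	fmt.Printf("%+v\n", in)
}
```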

CLI parity (drive it the same way an agent would):

```bash theme={null}
crewship run insights                # last 24h
crewship run insights --window 7d
crewship run insights -o json        # machine-readable for scripting
```

## Tenancy

Workspace isolation is enforced at the store level (`journal.List`/`Get`/`Count` take a workspace filter and refuse to run without one). The handler additionally pulls `workspace_id` from the session context -- there is no way for a caller to pass a foreign workspace id through a query parameter.

Cross-tenant existence is never leaked: unknown IDs return 404 with the same shape as "not in your workspace". The shared `crewBelongsToWorkspace` / `missionBelongsToWorkspace` helpers (defined in `internal/api/paymaster_handler.go` and reused across every read handler) enforce the same contract — see the [Paymaster API reference](/api-reference/paymaster) for the endpoints that exercise them.

## Retention

* `expires_at` is an optional TTL; compaction skips rows past it.
* The daily **Compactor** (see [Consolidate](/guides/consolidate)) rolls up `info`/`notice` rows older than 30 days into one `system.compaction` entry and deletes the originals. `warn`/`error` rows are kept indefinitely.
* `exec.output_chunk`, `container.metrics`, and `network.*` are never embedded into [Episodic memory](/guides/episodic-memory) because they would drown the signal.
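The compaction rule above, combined with the `permanent` exclusion from the priority section, reduces to a small predicate. A sketch only: `compactable` is a hypothetical name, the entry is represented by its age rather than its `ts`, and `expires_at` handling is left out because the source does not spell it out.

```go
package main

import (
	"fmt"
	"time"
)

// compactAfter is the 30-day rollup threshold.
const compactAfter = 30 * 24 * time.Hour

// compactable reports whether a row is eligible for the daily rollup:
// info or notice severity, older than 30 days, and not marked permanent.
func compactable(severity, priority string, age time.Duration) bool {
	if priority == "permanent" {
		return false
	}
	if severity != "info" && severity != "notice" {
		return false // warn and error rows are kept indefinitely
	}
	return age > compactAfter
}

func main() {
	old := 45 * 24 * time.Hour
	fmt.Println(compactable("info", "normal", old))         // true
	fmt.Println(compactable("error", "normal", old))        // false
	fmt.Println(compactable("info", "permanent", old))      // false
	fmt.Println(compactable("notice", "normal", time.Hour)) // false
}
```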

## Gotchas

* **Timestamp precision.** Writes serialise as `2006-01-02T15:04:05.000Z` (millisecond precision). Reads also accept RFC3339Nano and second-precision strings so backfilled rows don't fail to parse.
* **Empty trace/span.** The `trace_id`/`span_id` columns are populated by the [tracing](/guides/tracing) package's `SetTraceResolver`. If OpenTelemetry is not initialised the columns stay NULL -- this is fine, not a bug.
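The write/read asymmetry in the timestamp gotcha maps directly onto Go's `time` package: formatting uses the fixed millisecond layout, while parsing with `time.RFC3339Nano` accepts any fractional-second length, including none. `formatTS` and `parseTS` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"time"
)

// writeLayout is the millisecond layout quoted in the gotcha above.
const writeLayout = "2006-01-02T15:04:05.000Z"

// formatTS converts to UTC first, because the layout's trailing Z is a literal.
func formatTS(t time.Time) string {
	return t.UTC().Format(writeLayout)
}

// parseTS accepts second, millisecond, and nanosecond precision alike.
func parseTS(s string) (time.Time, error) {
	return time.Parse(time.RFC3339Nano, s)
}

func main() {
	for _, in := range []string{
		"2026-04-29T10:00:00Z",           // second precision (backfilled row)
		"2026-04-29T10:00:00.123456789Z", // nanosecond precision
		"2026-04-29T12:00:00.5+02:00",    // non-UTC offset
	} {
		t, err := parseTS(in)
		if err != nil {
			panic(err)
		}
		fmt.Println(formatTS(t))
	}
	// 2026-04-29T10:00:00.000Z
	// 2026-04-29T10:00:00.123Z
	// 2026-04-29T10:00:00.500Z
}
```

Formatting truncates rather than rounds, so a nanosecond-precision input loses its sub-millisecond digits on the way back out.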

<Warning>
  **Do not rename entry types.** A rename breaks every existing row. Add a new type and dual-write during the transition instead.
</Warning>

## Related

* [Paymaster](/guides/paymaster) -- reads `llm.call` and `cost.incurred`.
* [Watch Roster](/guides/watch-roster) -- emits `agent.status_change` on transitions.
* [Cartographer](/guides/cartographer) -- anchors checkpoints to the journal cursor.
* [Episodic memory](/guides/episodic-memory) -- selectively embeds high-signal entry types.
* [`crewship journal`](/cli/journal) CLI, [Journal API reference](/api-reference/journal).
