> ## Documentation Index
> Fetch the complete documentation index at: https://docs.crewship.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Episodic Memory

> Vector-similarity recall over the Crew Journal with selective embedding and SQLite BLOB brute-force cosine scan.

# Episodic Memory

<Note>
  **Auto-indexing is wired at boot.** When an embedder is configured (`KEEPER_OLLAMA_URL` pointing at an Ollama host serving `nomic-embed-text`), the server starts the background indexer sweeper automatically at boot — new embeddable journal entries become vector-searchable within one 30-second sweep, no manual indexing required.

  Without an embedder, episodic recall runs in **sparse-only mode** (keyword/FTS, no vector similarity). That degraded state is never silent: the server logs a WARN at boot, `GET /healthz` reports `"episodic": "sparse-only"` (vs `"vector"`), and `crewship doctor` surfaces it as a WARN with the fix.

  *Roadmap (v0.2):* the synchronous embed-on-write hook (`Indexer.IndexOne` directly from the journal writer hot path) is not yet wired — until then a freshly written entry can lag recall by up to one sweep interval.
</Note>

Episodic is a narrow vector-recall layer over the [Crew Journal](/guides/crew-journal). It embeds a **selective** subset of journal entries and serves top-K similarity queries to agents planning new work.

The "selective" part is load-bearing. Per 2025-2026 multi-agent memory research, indexing every event causes catastrophic drift: high-volume low-signal types (`exec.output_chunk`, `container.metrics`, `network.*`) drown the embedding space and dilute recall. Episodic refuses to embed those types and ingests only escalations, summaries, terminal mission status, denied keeper calls, eval regressions, and operator-tagged entries.

## Embedding criteria

```go theme={null}
// internal/episodic/types.go
func shouldEmbed(entryType string, severity string) bool {
    switch entryType {
    case "peer.escalation",
         "summary.generated",
         "memory.consolidated",
         "approval.denied",
         "eval.regression_detected":
        return true
    case "keeper.decision":
        return severity == "warn" || severity == "error"
    case "mission.status_change":
        return severity == "warn" || severity == "error"
    }
    return false
}
```

| Entry type                                            | When embedded                                                                                                                                                                                                                                                   |
| ----------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `peer.escalation`                                     | Always                                                                                                                                                                                                                                                          |
| `summary.generated`                                   | Always                                                                                                                                                                                                                                                          |
| `memory.consolidated`                                 | Always                                                                                                                                                                                                                                                          |
| `approval.denied`                                     | Always                                                                                                                                                                                                                                                          |
| `eval.regression_detected`                            | Always                                                                                                                                                                                                                                                          |
| `keeper.decision`                                     | Only at severity warn/error (denied or risky)                                                                                                                                                                                                                   |
| `mission.status_change`                               | Only at severity warn/error (failed or problematic)                                                                                                                                                                                                             |
| `peer.conversation`                                   | Listed in `EmbeddableEntryTypes` but `shouldEmbed` always returns false — plain Q\&A is too high-volume to embed. The indexer is expected to apply its own escalation-aware filter and force it via the override below when the question ends in an escalation. |
| `exec.output_chunk`, `container.metrics`, `network.*` | **Never**                                                                                                                                                                                                                                                       |

<Tip>
  To force an embedding on an otherwise-filtered entry, tag it with `refs.episodic = true` in the journal payload. The indexer checks this flag before applying the type/severity filter.
</Tip>

## Storage

`journal_embeddings` (added in migration 52 alongside `journal_entries`):

```sql theme={null}
CREATE TABLE journal_embeddings (
  entry_id           TEXT PRIMARY KEY REFERENCES journal_entries(id) ON DELETE CASCADE,
  vector             BLOB NOT NULL,     -- raw float32 bytes
  dim                INTEGER NOT NULL,
  model              TEXT NOT NULL,     -- "nomic-embed-text", etc.
  indexed_at         TEXT NOT NULL,
  importance_score   REAL DEFAULT 0.5,  -- fed into recall sort (see "Importance scoring")
  reference_count    INTEGER DEFAULT 0, -- bumped when a hit lands in a prompt
  last_referenced_at TEXT               -- stamped at the same time
);
```

Vectors are stored as little-endian float32 BLOBs -- native Go slice serialisation. SQLite has no `pgvector`, so recall is a **brute-force cosine scan** over the scope-filtered rows. For the expected scale (\~1% of entries embedded, low thousands per agent) the scan finishes in low milliseconds. If this outgrows SQLite the right move is an external vector store, not a SQLite extension -- so storage sits behind an interface.

### Migration 55 additions

Added in PR #212 alongside the memory uplift.

<Accordion title="Migration 55 schema — FTS5 mirror, archive sink, relation graph, health snapshots">
  ```sql theme={null}
  -- FTS5 contentless mirror of journal_entries.summary + payload.
  -- Insert/update/delete triggers keep it in lockstep with the base table.
  -- Powers the ?q= parameter on /api/v1/journal and the BM25 leg of HybridRecall.
  CREATE VIRTUAL TABLE journal_entries_fts USING fts5(
    summary, payload, content='journal_entries', content_rowid='rowid'
  );

  -- Compaction sink. Aged rows from journal_entries land here with payload
  -- truncated to 400 chars before being deleted from the base table.
  CREATE TABLE journal_entries_archived (
    id           TEXT PRIMARY KEY,
    workspace_id TEXT NOT NULL,
    ts           TEXT NOT NULL,
    entry_type   TEXT NOT NULL,
    severity     TEXT NOT NULL,
    summary      TEXT NOT NULL,
    payload      TEXT NOT NULL,                 -- 400-char truncation
    archived_at  TEXT NOT NULL
  );

  -- embedding relation graph: similarity edges linking related embeddings.
  CREATE TABLE memory_relations (
    src_entry_id TEXT NOT NULL,
    dst_entry_id TEXT NOT NULL,
    relation     TEXT NOT NULL,                 -- 'similar' | 'supports' | 'contradicts'
    weight       REAL NOT NULL DEFAULT 0.0,     -- cosine for 'similar', 1.0 for hand-asserted
    created_at   TEXT NOT NULL,
    PRIMARY KEY (src_entry_id, dst_entry_id, relation)
  );

  -- Daily 5-metric health score per workspace (and optionally per crew).
  CREATE TABLE memory_health_snapshots (
    workspace_id   TEXT NOT NULL,
    crew_id        TEXT,                        -- NULL = workspace-aggregate
    snapshot_date  TEXT NOT NULL,               -- YYYY-MM-DD UTC
    freshness      REAL NOT NULL,
    coverage       REAL NOT NULL,
    coherence      REAL NOT NULL,
    efficiency     REAL NOT NULL,
    reachability   REAL NOT NULL,
    score          REAL NOT NULL,               -- weighted sum (see below)
    PRIMARY KEY (workspace_id, crew_id, snapshot_date)
  );
  ```
</Accordion>

## Scopes

```go theme={null}
type Scope string
const (
    ScopeOwn        Scope = "own"         // agent's own past only
    ScopeCrewShared Scope = "crew_shared" // own + high-value crew entries
)
```

`ScopeForRole(role)` maps role strings to scopes:

| Role              | Scope         |
| ----------------- | ------------- |
| `AGENT` (default) | `own`         |
| `LEAD`            | `crew_shared` |

<Note>
  **Workspace isolation is always enforced** at the query boundary -- a cross-workspace recall is impossible even with a misconfigured scope.
</Note>

## Hybrid retrieval

Pure cosine recall misses entries that are lexically obvious — searching for `"OOM in checkout"` does not necessarily match an entry whose summary literally says `OOM`, because `nomic-embed-text` clusters by semantics, not surface form. PR #212 adds **hybrid retrieval**: dense (cosine) and sparse (BM25 over the FTS5 index) results fused via Reciprocal Rank Fusion.

```go theme={null}
hits, err := episodic.HybridRecall(ctx, db, embedder, episodic.Query{
    WorkspaceID: ws,
    AgentID:     agent,
    Scope:       episodic.ScopeOwn,
    QueryText:   "OOM in checkout",
    K:           5,
})
```

The fusion formula:

```
RRF_score(entry) = Σ over rankings r of:  1 / (k_RRF + rank_r(entry))
```

with `k_RRF = 60` (the value used in the original RRF paper; further tuning had marginal returns). Each ranking contributes the inverse of the entry's position; entries ranked highly in either leg float to the top, while entries that appear in both pump up most. There is no weighting parameter — both legs are treated equal, and the constant 60 dominates ranking noise at low ranks where it matters most.

`HybridRecall` is the recommended entry point for new code. The pure-cosine `Recall` is preserved for the orchestrator's prompt-injection path (where rank stability across small query perturbations matters more than recall on lexical hits) and for tests that want a deterministic similarity score.

## Memory relations (embedding relation graph)

Embeddings live in isolation by default; `memory_relations` lets us treat them as a graph. Two relation types are populated automatically:

* **`similar`** — `LinkSimilarOnIndex` runs at insert time: for the freshly-indexed entry, query the existing embedding pool, take the **top-3 cosine matches above 0.80**, write `(src=new, dst=match, relation='similar', weight=cosine)` rows. Edges are unidirectional (the new entry points to its predecessors); the inverse direction is implied at read time.
* **`supports`** — `LinkSupports` is called by the consolidator when it derives a rule from one or more journal entries. The rule entry points to every supporting evidence entry: `relation='supports', weight=1.0`. This is what makes consolidated rules traceable back to their evidence.

A future `contradicts` relation is reserved in the schema but not populated today.

Read-side use: when injecting recalled hits into a prompt, the orchestrator can optionally walk one hop along `similar` edges to deepen context without paying for another embedding round-trip. Disabled by default; enable per-agent by setting `agent.memory_walk_depth = 1`.

## Recall API

```go theme={null}
hits, err := episodic.Recall(ctx, db, embedder, episodic.Query{
    WorkspaceID: ws,
    CrewID:      crew,
    AgentID:     agent,
    Scope:       episodic.ScopeOwn,
    QueryText:   "production deployment keeper denied",
    K:           5,
})
```

* `WorkspaceID` is required.
* `ScopeOwn` requires `AgentID`; `ScopeCrewShared` requires `CrewID`.
* `K` caps results (1-50, default 5).

Returns `[]Hit`:

```go theme={null}
type Hit struct {
    EntryID   string
    Score     float64       // cosine similarity in [0,1], 1=identical
    Age       time.Duration // since original entry's ts
    Summary   string
    EntryType string
    AgentID   string
    Payload   map[string]any
}
```

Consumers typically weight fresher hits higher when injecting into prompts. `episodic.RenderInjection(hits, maxChars)` does this and returns a prompt-ready string, truncated to fit a char budget -- used by the orchestrator's `episodicRecallAdapter` before every agent run.

## Importance scoring

Each `journal_embeddings` row carries three additional fields that feed
recall ranking and a nightly decay job:

| Column                    | Purpose                                                     |
| ------------------------- | ----------------------------------------------------------- |
| `importance_score REAL`   | Multiplied into cosine at `Recall` sort time (range \[0,1]) |
| `reference_count INTEGER` | Incremented whenever a hit lands in a prompt                |
| `last_referenced_at TEXT` | Stamped at the same time                                    |

**Baseline** comes from `BaseImportance(entry_type, severity, priority)`
(see `internal/episodic/importance.go`): peer escalations and eval
regressions seed high; info-level routine events seed 0.5; the
operator-applied priority marker floors the value at 0.80 (pin) /
0.85 (high) / 0.95 (permanent).

**Nightly decay** — `episodic.DecayAndReinforce` recomputes every
row as:

```
importance = BASE × RecencyFactor × (1 + ReferenceBoost/8)
```

* `RecencyFactor(indexed_at, now) = max(0.1, 1 - days/180)` — a 0.1
  floor keeps old-but-critical memories from going to literal zero
* `ReferenceBoost(refs) = log₂(refs + 1)` — frequently-recalled
  entries lift, but the `/8` divisor prevents runaway loops from
  dominating over rare-but-critical ones

**Recall-time use** — `Recall` sorts candidates by
`cosine × importance` (not cosine alone), then top-K is returned.
Every returned hit gets `MarkReferenced` which increments the
reference counter so the next DecayAndReinforce lifts frequently-
hit memories.

The formula combines base value, recency decay, and reference-count
reinforcement to keep durable signal ranking well even as it ages.
Effect: a six-month-old peer.escalation keeps ranking well after
one recall per week, while a stale low-value info entry falls off.

## Untrusted-hints wrapper

`RenderInjection` wraps its output in a `<recalled-memory>…</recalled-memory>`
block with an explicit "UNTRUSTED HINTS" preamble. The wrapper is
load-bearing: recalled entries may contain text authored by peers,
tools, or agent output — a past `peer.escalation` could carry an
`"IGNORE PREVIOUS INSTRUCTIONS"` payload without anyone realising.
The wrapper instructs the model to treat everything inside as hints
the current task can override, not as authoritative instructions.

Same treatment applies to the orchestrator's `buildMemoryContext`
blocks (AGENT.md / CREW\.md) — both surfaces are agent-authored and
therefore both should be read by the model as hints.

The wrapper follows the "treat all recalled content as untrusted
hints" pattern: anything inside a `<recalled-memory>` block is
guidance the current task can override, never authoritative
instruction.

## Embedder

The `Embedder` interface is provider-neutral:

```go theme={null}
type Embedder interface {
    Embed(ctx context.Context, text string) ([]float32, error)
    Dim() int
    ModelID() string
}
```

Production wiring uses the Ollama adapter (`embedder.go`) against `nomic-embed-text` (768-dimensional vectors). The embedder shares the Keeper's Ollama base URL — `KEEPER_OLLAMA_URL`, which defaults to `http://localhost:11434` (the config falls back to that value when the env var is unset, so the embedder is configured by default). The embedder ends up nil only when that Ollama endpoint is unreachable at startup, which disables episodic recall — see the next paragraph.

If the embedder is nil (Ollama unreachable at startup), `Recall` returns an empty slice silently -- agent runs don't fail on embedding outages. This is configured in `server/orchestrator_adapters.go:newEpisodicRecallAdapter`.

## Health scoring

`memory_health_snapshots` captures one row per workspace per day (and optionally per crew). The score is a weighted sum of five metrics, each in `[0, 100]`:

| Metric           | Weight | Question it answers                                                                                                                     |
| ---------------- | ------ | --------------------------------------------------------------------------------------------------------------------------------------- |
| **Freshness**    | 25%    | How recent is the indexed content? Decays with median entry age.                                                                        |
| **Coverage**     | 25%    | What fraction of *embeddable* journal entries are actually indexed?                                                                     |
| **Coherence**    | 20%    | How tight is the embedding cluster structure? Higher = entries cleanly group around shared themes.                                      |
| **Efficiency**   | 15%    | Average cosine of the top-K hit distribution — high means recalls return distinct, relevant entries; low means the index returns noise. |
| **Reachability** | 15%    | Fraction of embeddings reachable via at least one `memory_relations` edge.                                                              |

The composite `score` is reported with three colour bands:

| Band   | Range   | Meaning                                                                                                                                                  |
| ------ | ------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Red    | `< 50`  | Memory is degraded; likely missing recent entries, low embedding count, or weak clustering. Operator action: trigger consolidation, check Ollama health. |
| Yellow | `50–75` | Functional but with at least one weak metric. Watch the dashboard.                                                                                       |
| Green  | `≥ 75`  | All metrics in healthy range.                                                                                                                            |

Health is exposed by `internal/api/memory_health_handler.go` over scores derived in `internal/episodic/`. The daily consolidator job recomputes it; operators can call it on demand via:

* HTTP — `GET /api/v1/memory/health[?crew_id=<slug>]`. See [Memory API](/api-reference/memory).
* CLI — `crewship memory health [--crew <slug>]`. See [`crewship memory health`](/cli/memory#health).

The endpoint returns the **latest snapshot** in the table — it does not recompute on read. Scores are refreshed by the daily consolidator; there is no on-demand recompute flag (`crewship memory health` accepts only `--crew`). To force a fresh score sooner, trigger consolidation (`crewship consolidate run`).

## In-session memory nudge

When an agent run accumulates **60 or more new journal entries** since its last `memory.updated` entry, the orchestrator injects a short reminder into the next system prompt:

> You have 72 new journal entries since your last memory update. Consider appending any recurring pattern you've noticed to `~/.memory/AGENT.md` before the session ends — the consolidator won't replace your personal observations.

The threshold (`nudgeThreshold` in `internal/orchestrator/memory.go`) was raised from 30 to 60 because at 30 the nudge fired on essentially every session after a memory write. The nudge only fires once per session — once the agent emits a new `memory.updated`, the counter resets.

The nudge is advisory — agents can ignore it, and it is stripped from the context before the LLM call records its response payload.

## Indexer

The `Indexer` (`internal/episodic/indexer.go`) runs a background sweeper loop on a **30-second poll** (`NewIndexer(..., poll)`), processing up to 64 unindexed embeddable entries per `sweepOnce`.

The server starts this sweeper automatically at boot when an embedder is configured (`startEpisodicIndexer` in `internal/server/server_lifecycle.go`, gated on `KEEPER_OLLAMA_URL`). When no embedder is configured the server instead logs one WARN at boot and reports `"episodic": "sparse-only"` on `/healthz`; `crewship doctor` reads that field and warns with the enable hint.

Hot-path callers that want an embedding ready before the next recall call `Indexer.IndexOne(ctx, entry)` directly — typically right after writing a `summary.generated` entry:

```go theme={null}
err := x.IndexOne(ctx, entry)
```

Only entries whose type survives the coarse SQL filter (the `EmbeddableEntryTypes` slice) are candidates; the Go-side `shouldEmbed` then applies the severity-aware refinement.

## Gotchas

<Warning>
  **Do not add `exec.output_chunk` to the embeddable list.** Doing so will flood the embedding space with low-signal log chatter and kill recall quality. The comment in `types.go` is not a suggestion.
</Warning>

* **Ollama dependency.** No Ollama = sparse-only recall (no vector similarity). Set `KEEPER_OLLAMA_URL`, plus `OLLAMA_MODELS="/Volumes/SSD 990 PRO/ollama-models"` (external SSD) and `ollama serve` before `./dev.sh start` when testing locally. Check the active mode via `GET /healthz` (`episodic` field) or `crewship doctor`.
* **Embedding is best-effort.** An embedder error during a `sweepOnce` pass logs and skips; the entry is retried next sweep.
* **Cosine on small sets is fine.** Don't pre-optimise for a vector DB until the per-agent scan latency exceeds 50ms.

## Related

* [Crew Journal](/guides/crew-journal) -- source of entries.
* [Consolidate](/guides/consolidate) -- the nightly workers that produce many of the embeddable entry types (`summary.generated`, `memory.consolidated`).
