> ## Documentation Index
> Fetch the complete documentation index at: https://docs.crewship.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Consolidate

> Nightly workers that compact old journal entries and extract semantic rules into crew memory.

# Consolidate

Two background workers run against the [Crew Journal](/guides/crew-journal). Both start automatically at server boot (`consolidate.StartBackground`) on the cadences below; you can also trigger a consolidation run on demand via the API/CLI.

1. **Consolidator** — runs every 6h. Walks recent journal entries per crew and produces a set of learned rules appended to `.memory/topics/learned-YYYY-MM-DD.md`. Emits `memory.consolidated`.
2. **Compactor** — runs daily at 03:00 UTC. Rolls up low-signal entries older than 30 days into a single `system.compaction` entry and deletes the originals. `warn`/`error` rows are preserved indefinitely.

Each degrades gracefully when its inputs are absent: no summarizer (auxiliary LLM) configured → the consolidator loop still runs but produces no rules; the compactor always runs its retention sweep.

<Note>
  The "supervisor crew" orchestration pattern (a dedicated crew that reviews and curates consolidated rules) is on the roadmap and not part of the current automatic workers.
</Note>

## Candidate entry types

The consolidator considers these entry types for rule extraction:

```go theme={null}
var candidateTypes = []journal.EntryType{
    journal.EntryPeerEscalation,
    journal.EntrySummaryGenerated,
    journal.EntryKeeperDecision,
    journal.EntryMissionStatus,
    journal.EntryEvalRegression,
}
```

Everything else (exec chunks, metrics, broadcast noise) is explicitly excluded so the LLM prompt stays dense and the rules grounded in decisions rather than activity.

## Rules output

Rules land in `{memoryRoot}/{crewSlug}/topics/learned-YYYY-MM-DD.md`. Format:

```markdown theme={null}
# Learned Rules 2026-04-17

## Rule: prefer staged deployments for schema changes
**Source:** peer.escalation j_a1b2..., mission.status_change j_xyz...
**Confidence:** 0.8

When a migration adds a non-nullable column, roll it out in two steps
(add nullable, backfill, alter to NOT NULL) to avoid production downtime.

## Rule: ...
```

The [Lead](/guides/first-crew)'s crew memory search (FTS5) picks these up automatically -- see [Memory System](/guides/agent-memory).

## Summarizer

`SummarizerClient` is an interface:

```go theme={null}
type SummarizerClient interface {
    Summarize(ctx context.Context, req SummarizeRequest) (SummarizeResponse, error)
}
```

Production wiring uses the Ollama adapter against a chat model (typically `llama3.1:8b` or similar). If no summarizer is configured:

* Background runner skips every crew silently.
* Manual trigger returns **202 Accepted** with `note: "no summarizer configured, skipping"` and emits `system.consolidation_triggered` + `system.consolidation_completed` (`status=skipped-no-summarizer`) so the audit trail captures the no-op.

## Compactor

Compaction target: `info` and `notice` entries older than 30 days that are NOT type `summary.generated`, `memory.consolidated`, or any checkpoint/fork/approval/regression. Rolls up into one `system.compaction` entry per (crew, day) bucket with a count and kind breakdown in the payload, then **archives** the originals to `journal_entries_archived` (with the payload truncated to 400 chars) before deleting from the base table. Done in a single transaction so neither side can drift.

Preserved indefinitely (never compacted):

* `warn` and `error` severity
* `summary.generated`, `memory.consolidated`
* Everything under `approval.*`, `checkpoint.*`, `fork.*`, `eval.regression_detected`
* Anything tagged with `priority='permanent'` (see [Crew Journal — Priority markers](/guides/crew-journal#priority-markers))

### Archive layer (PR #212)

Migration 55 added `journal_entries_archived` so compaction is no longer destructive. The aged row is copied verbatim except for `payload`, which is truncated to 400 characters with a trailing `…[truncated]` marker. The archive table has the same workspace/ts indexes as the base, so historical forensic queries still work — they just have to include the archive table:

```sql theme={null}
SELECT id, ts, entry_type, summary
FROM (
  SELECT id, ts, entry_type, summary FROM journal_entries
  UNION ALL
  SELECT id, ts, entry_type, summary FROM journal_entries_archived
) WHERE workspace_id = ?
ORDER BY ts DESC
LIMIT 100;
```

The frontend `/journal` view does **not** read from the archive by default — operators have to explicitly opt in via `?include_archive=true`. This keeps the live UI fast and the archive as a forensic-only surface.

If you want to tune journal compaction retention, edit `RunnerOptions.CompactionOlderThan` and restart. Per-workspace retention for `memory_versions` (a separate sweep, run by the Iter 4 coordinator) is configurable at runtime via `PATCH /admin/memory/config` — see [Memory Observability — Per-workspace retention](/guides/memory-observability#per-workspace-retention).

## Manual trigger

```
POST /api/v1/consolidate/run
Body: {"crew_id": "...", "since": "24h"}  // both optional
```

`OWNER` or `ADMIN` only. Returns **202 Accepted** immediately; the actual run happens in a 10-minute background goroutine.

Response shapes:

| Condition                               | Response                                                               |
| --------------------------------------- | ---------------------------------------------------------------------- |
| Run started                             | `{"triggered": true, "worker_id": "csd_..."}`                          |
| No summarizer configured                | `202 {"accepted": true, "note": "no summarizer configured, skipping"}` |
| Consolidator not wired (dev/test build) | `503 {"error": "consolidator not configured"}`                         |
| Same workspace already running          | `409 {"error": "already running"}`                                     |
| Invalid / cross-tenant crew\_id         | `404 {"error": "crew not found"}`                                      |

`since` accepts Go durations (`90m`, `24h`) plus shorthand `d` (days) and `w` (weeks). Default 24h if unset or unparseable. Soft-deleted crews are treated as not found -- a deleted crew can't grow fresh memory artifacts via explicit ID.

## CLI

```bash theme={null}
crewship consolidate run
crewship consolidate run --crew cmo2pe4dj0005ba0a129f
crewship consolidate run --since 24h
crewship consolidate run --since 7d
```

Full reference: [`crewship consolidate`](/cli/consolidate).

## Journal entries emitted

| Entry                            | When                                                                                                                                             |
| -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| `system.consolidation_triggered` | On `Run()` entry. Payload includes `worker_id` and `reason` (`manual` or `scheduled`).                                                           |
| `system.consolidation_completed` | On `Run()` exit. Payload includes `status` (`ok`, `skipped-no-summarizer`, `crew-not-found`, `enumerate-failed`), `crews_run`, `rules_appended`. |
| `memory.consolidated`            | Per crew, per successful cycle. Payload lists `rules_count` and paths to the output file.                                                        |
| `system.compaction`              | Per (crew, day) bucket after compaction. Payload lists kinds and row counts rolled up.                                                           |

## Gotchas

* **Ollama must be reachable.** Both the consolidator (via `SummarizerClient`) and the episodic indexer depend on Ollama. Set `OLLAMA_MODELS="/Volumes/SSD 990 PRO/ollama-models"` and `ollama serve` before starting the server locally.
* **The triggered journal entry is always emitted.** Even the "no summarizer" skip path emits both triggered + completed so operators see the full round-trip. If you filter for `memory.consolidated` only, you'll miss skipped runs.
* **Per-workspace in-flight guard.** A second manual trigger for the same workspace while the first is running returns 409. Track `worker_id` if you need to wait for completion.
* **Compactor is aggressive on info/notice.** If you rely on `exec.command` entries older than 30 days for forensics, export them out-of-band before the Compactor rolls them up.

## Related

* [Crew Journal](/guides/crew-journal) -- input stream.
* [Episodic memory](/guides/episodic-memory) -- the vector index built from the entry types these workers emit.
* [Memory System](/guides/agent-memory) -- consumes the generated `topics/learned-*.md` files.
* [`crewship consolidate`](/cli/consolidate), [Consolidate API](/api-reference/consolidate).
