Memory System
Overview
Crewship has a 3-tier memory hierarchy that gives agents persistent knowledge across sessions and shared context within crews. Each agent has its own private memory the LLM can read and write, every crew has a lead-owned shared memory all members read, and an ephemeral session tier lives only in the live context window. The three tiers compose into a single retrieval call at agent-run time — the orchestrator picks chunks from whichever tier fits within the per-prompt budget, ranks them by relevance, and stitches them into the system prompt before the LLM ever sees the user’s message. Memory is local-first with zero external dependencies. No embedding APIs, no vector databases, no cloud services. Retrieval is powered by SQLite FTS5 (BM25 ranking) running sub-millisecond on thousands of chunks, with the hybrid retrieval uplift from PRs #211/#212 layering dense scoring on top via RRF (reciprocal rank fusion) when an in-process embedding model is configured. The whole memory subsystem fits ininternal/memory/ plus the consolidation worker in internal/consolidate/, both running inside crewshipd — no extra container, no extra port.
The point of three tiers (instead of one big pile) is to keep the who-owns-what model clear. An agent can freely write to its own tier without affecting peers. The crew lead is the single writer for crew-shared memory, which means the FTS5 index has zero write contention and crew-wide context evolves under one editorial voice. And session memory is intentionally not persisted — context that’s irrelevant the next time the agent is invoked stays out of the index, keeping retrieval fast as the deployment ages.
When to use it
Turn memory on for any agent or crew whose value compounds across sessions. The system is opt-in (memory_enabled=false by default) precisely because the budget cost in the system prompt is real — only pay it where it earns its keep:
- An agent has long-running user-specific preferences. Dark-mode preference, preferred PR style, favourite framework, “use PostgreSQL 16 for new DBs” — anything the user shouldn’t have to repeat on every new chat. Persist to
AGENT.mdonce, retrieve forever. - A crew needs a single source of truth for conventions and decisions. Architecture decisions, deployment rules, security policies, “we always use Tailwind, never inline styles” — the lead curates
CREW.mdand every member-agent reads it. Beats stuffing the same instructions into each agent’s system prompt by hand. - An agent is accumulating project context. A code-review agent reading the same repo every day shouldn’t relearn the module layout each session. Daily logs (
daily/2026-05-14.md) capture session observations; FTS5 search surfaces them when the same module comes up again. - A single-agent crew that should specialise over time. Without crew peers the budget is 100% agent-tier — the agent essentially gets a persistent notebook. Useful for “my personal research assistant” or “the on-call bot that remembers last week’s incidents.”
- You’re domain-mining knowledge with
topics/. When the crew has bounded subject areas (deployment, security, billing), populatetopics/<name>.mdfiles. Hybrid search ranks them above the daily noise on relevant queries.
Key concepts
| Term | What it means here |
|---|---|
| Memory tier | One of agent, crew, session. Each is a separate FTS5 index and a separate slice of the system-prompt budget. agent is per-agent private, crew is lead-owned shared, session is the live LLM context window (not persisted). |
AGENT.md | The canonical long-term knowledge file for one agent. Lives at /crew/agents/{slug}/.memory/AGENT.md inside the container, ~/.memory/AGENT.md from the agent’s POV. The agent is instructed to append rather than overwrite, and to only remove entries that are explicitly outdated. |
CREW.md | The crew equivalent of AGENT.md. Lives at /crew/shared/.memory/CREW.md. Only the lead writes; every agent in the crew reads. Curated, single-voice — not a free-for-all. |
daily/ | Per-day session notes (daily/2026-05-14.md). Captures what was done, decisions, observations. Both agent and crew tiers have a daily/ folder; the lead writes the crew one. Searchable via FTS5; consolidated into the evergreen *.md over time by the Consolidation worker. |
topics/ | Crew-only. Domain-specific knowledge files (topics/deployment.md, topics/security.md) that accrete bounded subject expertise. Hybrid retrieval ranks them above daily noise on on-topic queries. |
| FTS5 index | SQLite’s full-text-search-5 extension. Each tier has its own index.sqlite file managed by the sidecar. BM25 ranks results — relevance scoring used by virtually every modern search system since Elasticsearch. |
| Hybrid retrieval (RRF) | When an in-process embedding model is configured, dense vector scoring runs alongside BM25 and the two ranked lists are fused via reciprocal rank fusion. From PR #212; sub-millisecond on tens of thousands of chunks because nothing leaves the process. |
| Budget | The character allowance for memory content in the system prompt. Default 15,000 chars. The crew tier is capped at 40% of total; the agent tier reclaims the remainder. Empty tiers reclaim their entire share. |
| System-prompt injection | The orchestrator’s BuildMemoryBlock reads the top-K chunks at session start and prepends them between [AGENT MEMORY] / [END AGENT MEMORY] and [CREW SHARED MEMORY] / [END CREW SHARED MEMORY] markers. The LLM sees memory as part of its system prompt; the agent never has to ask for it. |
| Sidecar memory engine | Each agent container’s sidecar process (localhost:9119) owns the agent’s FTS5 index and proxies search to the lead’s sidecar for crew-tier queries. The lead sidecar reindexes the crew tier every 60 seconds. |
memory_enabled | Per-agent flag. false by default — memory tiers are not read, written, or indexed for that agent. Flip via crewship agent update --memory. |
| Consolidation | The nightly worker in internal/consolidate/ that promotes recurring observations from daily/*.md into AGENT.md / CREW.md and archives the original log. Keeps the evergreen tier from drifting stale and the daily tier from growing unboundedly. |
Usage
The end-to-end loop from “memory off” to “agent recalls last week’s decision” is five steps. Each step links down to the deeper detail section below.Enable memory on the agent
Memory is The next time the agent runs in a session, its sidecar provisions
memory_enabled=false by default. Flip it on at create or update time:~/.memory/AGENT.md (and the crew tier if the agent belongs to one) and starts the FTS5 index.Let the agent write to its own memory
No special API needed — the agent writes by running shell or file-tool commands inside its container. The two common patterns:The crew lead writes to
/crew/shared/.memory/CREW.md and /crew/shared/.memory/topics/<name>.md the same way — see File Structure for the full layout.Subsequent sessions retrieve memory automatically
When the orchestrator builds the system prompt for the agent’s next session,
BuildMemoryBlock selects the top-K chunks within budget and injects them between [AGENT MEMORY] / [CREW SHARED MEMORY] markers. The agent doesn’t have to ask — it reads the markers as part of normal context. See Reading Memory for the exact prompt shape.Search explicitly when you need to
For mid-session lookups outside the initial prompt budget, the agent calls its sidecar’s search endpoint:
scope is one of agent, crew, both. With both, BM25 (and dense scores, when an embedding model is configured) are RRF-merged across the two engines and each result carries its origin.Let consolidation tend the index
The Consolidation worker runs nightly: recurring observations from
daily/*.md get promoted into the evergreen AGENT.md / CREW.md, and the daily logs get archived. You don’t run it by hand for normal operation — the workspace operator can trigger one off-cycle via crewship consolidate run --crew=<crew> (optionally scoped with --since) if a tier is visibly drifting.For development without a running server, the crewship memory CLI lets you reindex and search on-disk memory directly — useful for poking at a backup or a frozen container.Examples
A research-assistant agent that remembers user preferences
You spin upresearcher, ask it five PR reviews. Over the course of a week the agent picks up that you write tight commit messages and prefer the changelog updated in the same PR rather than later. On day one it appends:
BuildMemoryBlock injects AGENT.md into the system prompt before the LLM ever sees your message, so the agent’s first review comment already says “I noticed the CHANGELOG entry is missing from this PR” without having to be told.
A crew lead recording an architectural decision once
Yourbackend crew decides Postgres 16 is the new baseline. The lead writes it once:
migrations, api, tests) reads that line on their next session via the crew-tier system-prompt block — the lead doesn’t have to update three system prompts, and a new agent joining the crew picks up the same context for free.
Bounded subject expertise via topics/
A support crew handles deployment, billing, and security questions. Rather than letting CREW.md grow into a sprawling FAQ, the lead carves it up:
topics/deployment.md chunks at the top — the agent gets the right slice and not all 12kB at once. CREW.md stays small and editorial; the long-tail expertise lives in dedicated topic files that grow without bloating the budget.
Memory Tiers
A workspace-scope memory tier (org-level strategy, cross-crew knowledge) is on the v0.2 roadmap.
Enabling Memory
Setmemory_enabled to true when creating or updating an agent:
Memory is disabled by default. Each agent has its own isolated memory store — agents cannot read each other’s personal memory.
File Structure
Agent Memory
Lives in~/.memory/ inside the agent’s container, mapped to /crew/agents/{slug}/.memory/ on the filesystem.
Crew Shared Memory
Lives at/crew/shared/.memory/ inside every agent container in the crew. The Lead agent writes crew-wide knowledge; all agents can read it.
The Lead’s sidecar owns the crew FTS5 index and reindexes it every 60 seconds automatically. Non-lead agents search crew memory via the sidecar HTTP API.
How It Works at Runtime
1. Writing Memory
Agents write memory by executing file operations inside their container — shell commands, file tools, or editors. No special API needed.memory.write tool the sidecar exposes over MCP. It is durable and fail-closed:
- A successful
memory.writeisfsync’d and atomically renamed into place before it returns — a write the tool reports as done is on stable storage, never merely in the page cache. - If the persist cannot complete, the tool returns an
isErrorresult (the model sees the failure and reports it) instead of a false “done”. Success therefore means durability. - The tool is health-gated:
memory.write/memory.readare only advertised to the model when the memory sidecar is actually reachable. If the sink is down, the model doesn’t see a memory tool at all and degrades explicitly (“I couldn’t persist that”) rather than calling a dead endpoint and getting a phantom success. Because sidecars are container-scoped and persistent, any warm crew container keeps the tools available across runs.
Recall (the
[AGENT MEMORY] injection below) reads the file directly and does not depend on the sidecar, so an agent always reads back its durable memory even on a path that skips the sidecar. Writing mid-session is what needs the sink — see Memory observability for the authoritative way to verify a write landed (and why memory hybrid can lag it).2. Reading Memory (System Prompt Injection)
When a session starts, the orchestrator reads memory files and injects them into the system prompt:2a. Operator model (per-user)
Alongside the three memory tiers, Crewship distils a small operator model for each person who works with a crew. It captures how an operator likes to work — communication style, timezone, language, level of detail — as a hint, not as a fact about who they are or what they want. The model is keyed on (operator, workspace), independent of which agent is answering. Every agent in a crew reads the same model, stored crew-shared at:user_slug is a one-way hash (sha256(user_id ‖ workspace_id)[:16]), so the
filename never carries the operator’s identity into a directory listing or a
stack trace. The same hash gives each workspace its own isolated model for the
same person — a user who works in two workspaces gets two independent models.
When a session opens, the orchestrator injects only the session opener’s
model — never another operator’s, even if one exists on disk — as an
[OPERATOR MODEL] block, placed before the per-agent [PEER CONTEXT] block
so the broad working-style hint frames the narrower per-agent relationship hint:
3. Searching Memory
Agents search memory via the sidecar HTTP API with ascope parameter:
Dynamic Budget Allocation
The orchestrator allocates a character budget (default 15,000 chars) across memory tiers. Empty tiers reclaim their budget for lower tiers.| Agent Role | Crew (max) | Agent (remainder) |
|---|---|---|
| Lead (in crew) | 40% | rest |
| Agent (in crew) | 40% | rest |
| Single-agent crew | — | 100% |
Budget examples
Budget examples
Lead with small crew memory:
- Crew: 2,000 chars (actual content, under 40% cap)
- Agent: 13,000 chars (reclaimed from crew)
- Agent: 15,000 chars (full budget)
API reference
The memory subsystem is reachable on the agent’s sidecar atlocalhost:9119 inside the container. There is no public REST endpoint — these calls are intentionally container-local so an agent can only ever query its own tiers (and, via the lead’s sidecar proxy, the crew tier it’s a member of). The schemas below are the source of truth; for the database-side view of the underlying tables see /api-reference/journal (journal-backed retrieval) and the SQLite migration log under internal/database/migrate.go (v54/v55 introduced the importance + FTS5 columns these endpoints read).
POST /memory/search
Search indexed memory with scope control. Request:| Field | Type | Required | Default | Description |
|---|---|---|---|---|
query | string | Yes | — | Search terms (FTS5 syntax) |
limit | int | No | 10 | Max results (capped at 50) |
scope | string | No | "agent" | "agent", "crew", or "both" |
scope=both, results from both engines are merged and sorted by BM25 score. Each result includes a source field indicating its origin.
| Status | Condition |
|---|---|
| 400 | Missing query, invalid JSON, or invalid scope |
| 503 | Requested memory engine not available |
| 500 | Search execution failed (all scopes) |
GET /memory/status?scope=agent|crew
Check the state of a memory index. Response (200):POST /memory/reindex?scope=agent|crew
Trigger a full reindex. Context-aware — responds to client disconnect and SIGTERM. Response (200): Returns status object after reindexing.CLI (Development & Debugging)
Thecrewship memory command provides direct filesystem access to memory indexes without a running server.
| Flag | Short | Description |
|---|---|---|
--path | -p | Base path (meaning depends on scope) |
--scope | -S | agent or crew (workspace scope ships in v0.2) |
--limit | -l | Max results (search only, default 10) |
--format | -F | table or json (search only) |
Access Control
| Operation | Agent | Lead |
|---|---|---|
| Read own agent memory | Yes | Yes |
| Write own agent memory | Yes | Yes |
| Read crew shared memory | Yes | Yes |
| Write crew shared memory | No | Yes |
Limits
| Limit | Value | Description |
|---|---|---|
| Total memory size | 10 MB per agent | Maximum .memory/ directory size |
| Daily log size | 100 KB per file | Maximum single daily/*.md file |
| Crew memory reindex | Every 60 seconds | Automatic periodic reindex by Lead sidecar |
Load-time injection scan
Memory files are authored by prior agent runs, so a file on disk can carry an indirect-injection payload (text ingested from a web fetch, a peer message, or a file read, then persisted). Every memory tier is scanned again at prompt assembly time, immediately before its content is placed into the system prompt — independent of the write-path scan. The scan runs per section. If a section’s body trips the scanner, only that section’s body is replaced with a deterministic notice; clean sibling sections in the same block are untouched:- The live file is never modified. The substitution happens only in the assembled prompt. The file on disk is left exactly as written so you can open it and judge the content yourself — the notice points you at the file rather than discarding anything.
- The label is preserved so you can tell which tier and which file produced
the hit, and the
category/patternmap back to the scanner rule. - It is deterministic. The same file content always yields the same notice (first-hit, fixed rule order), so a blocked section won’t flicker between runs.
Memory write overflow guidance
Thememory.write tool is a pure bounded store — each tier has a fixed byte cap
and the store never silently evicts or rewrites entries for the agent. When a
write would cross a cap, the tool now hands the agent enough to fix it within
the same turn instead of just rejecting the call:
- Hard cap exceeded (the write would push the tier past its cap): the result
is an error, nothing is written, and the result metadata carries
current_entries(the current on-disk body) andusage(e.g.3900 of 4000 bytes, 97%). The message instructs the agent to consolidate the current entries — merge duplicates, drop stale lines, summarize — and retry the write in this turn withmode='replace'carrying the consolidated body. - Soft cap (80%) crossed (the write succeeds but is close to the cap): the
same
current_entries+usageare attached and the warning steers the agent to consolidate and rewrite the consolidated body in this turn, before the next append is rejected.
append and replace are at parity — both surface current_entries + usage
on overflow. This keeps the store dumb (it does not consolidate for the agent)
while giving the agent the material to self-curate without losing the turn.
Best Practices
Writing effective AGENT.md
Writing effective AGENT.md
- Use clear headings (
## Identity,## Learned Facts,## Preferences,## Project Context) - Be specific. “Use driver name
sqlitenotsqlite3” beats “there’s a sqlite driver thing” - Keep it under 5 KB. Concise memory leaves budget for daily logs and crew context
- Prune outdated facts. Stale memory is worse than no memory
Writing effective crew memory (CREW.md)
Writing effective crew memory (CREW.md)
- Lead curates CREW.md with crew-wide conventions, architecture decisions, and policies
- Use topics/ for domains.
topics/deployment.md,topics/security.mdkeep CREW.md focused - Agents don’t write to crew memory — they write personal notes to their own AGENT.md
- Don’t duplicate facts across agent and crew memory. Crew memory is for shared knowledge only
Optimizing search
Optimizing search
- Use consistent terminology. FTS5 is keyword-based, not semantic
- Trigger reindex after bulk writes. Agent memory reindexes on startup; crew memory reindexes every 60s
- Use
scope=bothto search across personal and crew knowledge simultaneously
Common pitfalls
Memory is
memory_enabled=false by default. The most common “my agent isn’t remembering anything” report is just an agent that was never opted in. Check crewship agent get <slug> and flip the flag if needed.- Crew reindex lags up to 60 seconds. The lead sidecar reindexes the crew tier on a 60-second tick. Writes to
CREW.mdortopics/*.mdare searchable on the next tick, not immediately — if you need instant visibility, hitPOST /memory/reindex?scope=crewfrom the lead’s sidecar. - Agent-tier reindex after
/memory/writeis asynchronous. A successfulPOST /memory/writereturns201the moment the bytes are durable on disk; the FTS5 reindex (and thememory.updatedjournal entry) then run on a single-worker background queue rather than blocking the response. The lag is sub-second in practice. The queue is strict FIFO, so consecutive writes to the same file reindex in write order — turn N always lands before turn N+1, never a stale earlier write winning a race. Writes are still searchable essentially immediately; if you need a hard barrier (e.g. write-then-search in a tight loop), the queue is drained on sidecar shutdown so nothing is lost, but for instant in-session visibility you can hitPOST /memory/reindex?scope=agent. - Single-agent crews have no crew tier. The
crewscope returns 503 for an agent that isn’t a member of a crew (or whose crew has only itself). The budget for the agent tier is 100% in that case; no crew context to retrieve. - Agents cannot read other agents’ personal memory — even within the same crew. Container bind mounts enforce this at the filesystem level, not by API check. If a workflow needs cross-agent state, the right surface is
CREW.mdortopics/, written by the lead. - Budget is measured in characters, not tokens. The default 15,000-character budget is ~3,500–4,500 tokens depending on language and content. Don’t reason about budget as if it were tokens — the orchestrator slices on character boundaries.
- The 10 MB per-agent memory cap is a hard limit. Once a directory exceeds it, new writes fail rather than evicting old ones — consolidation isn’t automatic eviction. Curated facts only, not raw logs.
- Don’t delete
AGENT.mdentries casually. Memory is append-mostly. Remove only entries that are explicitly outdated — pruning live knowledge is what makes an agent feel “dumber than yesterday”. - Migration v54/v55 must be applied for hybrid retrieval. Older databases that predate the Crew Journal era (PRs #211/#212) don’t have the
importance_score,reference_count, orjournal_entries_ftscolumns these endpoints read. Migrations apply automatically oncrewship start, so start the upgraded binary once before depending on RRF scoring. - Docker network
Internal: trueblocks sidecar → embedding model. If the agents network isInternal: true, the sidecar can read its on-disk FTS5 index but cannot reach a remote embedding service if one is configured, silently degrading hybrid retrieval to BM25-only. The fix lives ininternal/provider/docker/docker.go:ensureNetwork— the network must beInternal: falsewith a gateway.
Related
- Episodic memory — the hybrid retrieval (RRF) + importance/decay/reinforce uplift from PRs #211/#212 that this guide builds on.
- Consolidate — the nightly worker that promotes recurring
daily/*.mdobservations intoAGENT.md/CREW.mdand archives the original logs. - Crew Journal — the event log behind the FTS5 mirror and the
journal_embeddingstable memory reads from. - Keeper — where secrets belong instead of memory files (memory is not encrypted at rest).
- Orchestration — the runtime that calls
BuildMemoryBlockto inject memory into system prompts at agent-run time. - Skills — for agent-side packaged knowledge that doesn’t change per-user/per-crew; complementary to memory rather than overlapping.