
Inbox

The Inbox is your unified human-in-the-loop feed. Anything an agent — anywhere across your crews — needs you to look at lands here, in one ordered list, with the action you’d take inline on the detail panel. No tab switching, no hunting through /issues for the one paused routine, no separate “approvals queue” page to remember. It lives at /inbox in the running web UI and is the first item in the Plan group of the sidebar. The bell in the top bar shares its unread count with the sidebar badge — both update over WebSocket so a peer triaging an item in another tab clears your bell instantly.

What lands in the Inbox

Four kinds of items get written through to inbox_items (migration v85) the moment their source signal fires:
| Kind | When it’s created | Resolves via |
| --- | --- | --- |
| waitpoint | A routine hits a wait step of kind: approval (Harbormaster gate) | Approve / Deny (calls /pipelines/waitpoints/{token}/approve) |
| escalation | An agent or sub-task calls /escalate on the sidecar, or internal/api/escalation_handler.go records one | Mark resolved (decision is recorded in the escalation row separately) |
| failed_run | A pipeline_run reaches the failed terminal state | Retry (re-fires the routine with the same inputs) or Cancel |
| message | The orchestrator broadcasts a notification (e.g. “ENG-12 ready for review”) addressed at a user or role. Also carries “your agent replied” chat notifications: when an assistant reply lands in a session you aren’t currently viewing, one per-user item appears with a scrubbed preview (repeated replies refresh it in place instead of stacking). See Chat & Sessions. | One-click jump to the linked issue or the Open chat deep link, or Dismiss. Opening the chat (or crewship chat read) clears it automatically. |
Chat-reply items get priority placement: under Smart grouping they form their own Agent replies group directly beneath Decisions needed (never buried in FYI), and each row shows an Open chat quick action on hover that deep-links straight into the session. You can continue the conversation in one click, and opening it clears the item.
Items are written by source-of-truth handlers, not synthesised by the inbox. So:
  • Approving a waitpoint via /inbox cascades back through pipeline.waitpoints.CompleteApproval — the run resumes, the journal records the decision, and the inbox row flips to resolved via inbox.ResolveBySource.
  • Retrying a failed run from /inbox POSTs to /api/v1/workspaces/{ws}/pipelines/{slug}/run with the original inputs (captured at failure time on the row’s payload.inputs). The new run shows up in /activity immediately.
  • Marking an escalation resolved on the inbox only flips the inbox state — the escalation itself lives in its own table and is closed via crewship escalation resolve (or the escalation lifecycle API).
The inbox is a projection, not a queue. The authoritative state lives on the source row (waitpoint token, escalation id, pipeline_run). The inbox row exists to give you one place to see “is anything waiting on me right now?” and a one-click handle to resolve it.
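The projection relationship can be sketched in a few lines. This is a minimal Python model, not the Go implementation: the Waitpoint, InboxRow, and Store shapes and their field names are hypothetical, and only the ordering (write the decision to the source row, then flip the mirroring inbox row) comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Waitpoint:
    """Source-of-truth row (hypothetical shape)."""
    token: str
    approved: Optional[bool] = None  # None while the run is still paused


@dataclass
class InboxRow:
    """Projection row that points back at its source (hypothetical shape)."""
    source_kind: str
    source_id: str
    state: str = "unread"
    resolved_action: Optional[str] = None


@dataclass
class Store:
    waitpoints: dict = field(default_factory=dict)
    inbox: list = field(default_factory=list)

    def resolve_by_source(self, source_kind: str, source_id: str, action: str) -> int:
        """Flip every open inbox row that mirrors the given source row."""
        changed = 0
        for row in self.inbox:
            same_source = (row.source_kind, row.source_id) == (source_kind, source_id)
            if same_source and row.state != "resolved":
                row.state = "resolved"
                row.resolved_action = action
                changed += 1
        return changed

    def complete_approval(self, token: str, approved: bool) -> None:
        """Record the decision on the source row first, then update the projection."""
        self.waitpoints[token].approved = approved
        self.resolve_by_source("waitpoint", token, "approved" if approved else "denied")
```

Because the inbox row is only ever updated as a consequence of the source write, rebuilding or dropping inbox rows in this model never changes whether a run is paused.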

States

The lifecycle is unread → read → resolved, surfaced through three tabs with Gmail-style semantics: opening a row marks it read but keeps it in your inbox — it only leaves when you Archive it.
unread ──(click row)──> read ──(Archive / action)──> resolved
   │                                                     ▲
   └──────────────── "Mark unread" / "Restore" ──────────┘
  • Inbox (default tab) shows everything that isn’t archived — unread and read. Clicking a row marks it read in place; it does not disappear. This is the fix for “I clicked a message, clicked away, and lost it.”
  • Unread is a filter, not the default. Archived is the resolved bucket.
  • Archive is the Gmail move: clear a row from the inbox without recording a decision. It maps to resolved with resolved_action="archived", shows an Undo toast, and is restorable from the Archived tab (or with crewship inbox unread <id>). Archive is offered on non-decision rows (messages, failed-run notices, advisories) and on keeper-synthetic escalations (skill review / memory health) — those have no backing escalations row and thus no source endpoint, so the inbox row is the only handle. A real waitpoint/agent escalation is source-managed and still needs an explicit Approve/Deny/Resolve.
  • Resolving via an action button (Approve, Deny, Retry, Cancel, Dismiss) stores the shape on resolved_action so the audit trail shows what you decided, not just that you closed it.
  • “Mark unread” flips a row back; “Restore” un-archives. The bell count and sidebar badge both react.
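The lifecycle above is small enough to model directly. The following is an illustrative Python sketch, not the product code: the Item class and function names are hypothetical, the accepted action values are taken from the CLI's --action list, and clearing resolved_action on restore is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Decision shapes accepted by `crewship inbox resolve --action`.
DECISION_ACTIONS = {"approved", "denied", "retried", "cancelled", "acknowledged", "dismissed"}


@dataclass
class Item:
    state: str = "unread"  # unread | read | resolved
    resolved_action: Optional[str] = None


def open_row(item: Item) -> None:
    """Clicking a row marks it read in place; it stays in the Inbox tab."""
    if item.state == "unread":
        item.state = "read"


def archive(item: Item) -> None:
    """Archive is a resolve without a decision."""
    item.state = "resolved"
    item.resolved_action = "archived"


def resolve(item: Item, action: str) -> None:
    """Resolve through an action button and keep the decision shape for the audit trail."""
    if action not in DECISION_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    item.state = "resolved"
    item.resolved_action = action


def mark_unread(item: Item) -> None:
    """Covers both 'Mark unread' and 'Restore' (assumption: the action is cleared)."""
    item.state = "unread"
    item.resolved_action = None


def in_inbox_tab(item: Item) -> bool:
    """The default tab shows everything that is not resolved."""
    return item.state != "resolved"
```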

The /inbox page

A two-pane layout:
| Pane | Purpose |
| --- | --- |
| Left list (420 px) | Item rows ordered newest-first. Three tabs at the top: Inbox, Unread, Archived. Each row shows a sender avatar (agent / crew / routine / system, coloured per sender), an unread dot, the title, the kind, sender, and a relative timestamp. A Group control buckets the list by Smart (Decisions needed → Needs review → FYI), Type, Sender, Routine, Issue, or Crew. |
| Right detail | Sender avatar + name + role, a rendered-markdown body (headings, lists, code; not a raw blob), a humanised Context card (key/value, not raw JSON, with credential-looking values masked behind a reveal toggle), kind-specific action buttons, an Archive button on non-decision rows, and, for waitpoints, a WaitpointRunDetail block showing which step is paused. |
The detail panel’s action row is what the source-of-truth call hangs off:
  • Waitpoints show Approve + Deny. Both hit /pipelines/waitpoints/{token}/approve; the boolean approved in the body disambiguates. (A body that omits the field decodes to approved=false, because Go’s json.Unmarshal leaves an absent bool at its zero value, so the inbox always sends the explicit value.)
  • Failed runs show Retry + Cancel. Retry replays the routine’s inputs_json. If the payload is missing a pipeline_slug for some reason, the row falls back to “cancelled” rather than getting wedged.
  • Escalations split two ways. A real agent escalation (carries an escalation_type and a backing escalations row) shows Approve / Reject — the decision is recorded through the escalation lifecycle, and the inbox just mirrors it. A keeper-synthetic escalation (skill review, memory health — kind=escalation but no backing row, so no /escalations/{id}/resolve endpoint) has no inline decision; the detail points you at Archive to clear it. These are the one escalation type the inbox lets you dismiss directly, because there is no source to defer to.
  • Messages with an issue_identifier in payload show Open <identifier> as a direct link into /issues/<id>.
When a row is resolved, the detail panel dims, the action row replaces itself with a “Resolved Xm ago · approved” line (or “Archived Xm ago · Restore” for archived rows), and the row moves to the Archived tab.
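As a rough illustration of the per-kind action row, the mapping can be written as one function. This is a Python sketch under assumptions: the row is modelled as a dict, escalation_type is assumed to live in the payload, unknown kinds are assumed to get no kind-specific actions, and both function names are hypothetical.

```python
import json


def available_actions(row: dict) -> list:
    """Return the action labels the detail panel would offer for one row."""
    kind = row["kind"]
    payload = row.get("payload") or {}

    if kind == "waitpoint":
        return ["Approve", "Deny"]
    if kind == "failed_run":
        return ["Retry", "Cancel", "Archive"]
    if kind == "escalation":
        # Assumption: a real agent escalation is recognised by its escalation_type.
        if payload.get("escalation_type"):
            return ["Approve", "Reject"]
        return ["Archive"]  # keeper-synthetic: no source row to defer to
    if kind == "message":
        actions = []
        if payload.get("issue_identifier"):
            actions.append(f"Open {payload['issue_identifier']}")
        return actions + ["Dismiss", "Archive"]
    return []  # unknown kind: no kind-specific actions (assumption)


def waitpoint_decision_body(approved: bool) -> str:
    """Serialise the decision explicitly so a missing field never reads as a denial."""
    return json.dumps({"approved": bool(approved)})
```

Sending the boolean explicitly for both Approve and Deny keeps the request unambiguous even though the two buttons share one endpoint.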

Readability & safety

The inbox is the one surface that aggregates messages from every agent, routine, and curator — so it’s also where raw machine output and leaked secrets would otherwise pile up. Several guards keep it human-readable and safe:
  • Sender avatars. When the sender is a real agent, the row and detail header render that agent’s actual avatar — the same DiceBear seed/style its agent card uses, so “casey escalated…” reads as Casey’s face. The List/Get responses fill avatar_seed + avatar_style for agent senders via a batched lookup against the agents table (no schema change, works retroactively). Non-agent senders (crew / pipeline / system) fall back to a coloured glyph keyed on sender_type, so an agent message stays visually distinct from a routine failure.
  • Rendered bodies, not blobs. body_md is rendered as markdown (MarkdownContent), and the payload is shown as a humanised key/value Context card instead of a raw JSON.stringify dump.
  • Sanitised advisories. Curator routines (skill review, memory health) used to paste the gatekeeper’s internal decision text — e.g. Keeper LLM unavailable: paymaster: workspace_id required — deny by default — straight into the body. inbox.SanitizeReason now swaps that for a friendly line and keeps the raw text in payload.raw_reason for operators. When the cause is an infrastructure outage (curator LLM down), a pure advisory is suppressed entirely rather than written once per crew per sweep — this killed the recurring “Memory health advisory” flood.
  • Secret redaction. inbox.RedactSecrets masks credential material (connection strings, password=…, bearer tokens, high-entropy blobs) before it reaches body_md, so an agent escalation that quotes a generated redis://:…@… URL never broadcasts the secret to every MANAGER. CREDENTIAL-type escalations additionally lead with an explicit “secret handled in the credential flow” note. The source-of-truth row (escalations / credentials table) still holds the real value behind its own ACL; the UI also masks any remaining secret-looking Context values behind a reveal toggle.
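To make the redaction idea concrete, here is a small Python sketch of pattern-based masking. It is not inbox.RedactSecrets: the regexes, the 24-character minimum, the 4.0-bit entropy threshold, and the [redacted] mask are all illustrative assumptions chosen to cover the four categories named above.

```python
import math
import re
from collections import Counter

MASK = "[redacted]"

# scheme://user:password@host -> keep scheme, user and host, mask the password
_URL_CREDS = re.compile(r"(\b[a-z][a-z0-9+.-]*://[^\s:/@]*:)[^\s@/]+(@)", re.IGNORECASE)
# password=..., token: ..., api_key=... -> mask the value
_KEY_VALUE = re.compile(r"(\b(?:password|passwd|secret|token|api[_-]?key)\s*[=:]\s*)\S+", re.IGNORECASE)
# Bearer <token> -> mask the token
_BEARER = re.compile(r"(\bBearer\s+)[A-Za-z0-9._~+/=-]+", re.IGNORECASE)
# Candidate high-entropy blobs: long runs of token-like characters
_LONG_TOKEN = re.compile(r"\b[A-Za-z0-9+/_-]{24,}\b")

ENTROPY_THRESHOLD_BITS = 4.0  # illustrative cut-off, not a tuned value


def _entropy(text: str) -> float:
    """Shannon entropy in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def redact_secrets(text: str) -> str:
    """Mask credential-looking material before the text is stored or shown."""
    text = _URL_CREDS.sub(lambda m: m.group(1) + MASK + m.group(2), text)
    text = _KEY_VALUE.sub(lambda m: m.group(1) + MASK, text)
    text = _BEARER.sub(lambda m: m.group(1) + MASK, text)

    def mask_if_random(match):
        blob = match.group(0)
        return MASK if _entropy(blob) >= ENTROPY_THRESHOLD_BITS else blob

    return _LONG_TOKEN.sub(mask_if_random, text)
```

Running the masking before the body is written, rather than at render time, means the secret never reaches the stored text that every recipient reads.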

Grouping & bulk triage

Once the list gets long, flat newest-first ordering stops scaling. The list can be grouped by Type, Routine, Issue, or Crew: the group key is derived from kind plus the relevant payload field (pipeline_slug for Routine, issue_identifier for Issue, crew_id for Crew). A row whose field is missing lands in a stable “No routine” / “No issue” / “No crew” bucket rather than disappearing. Grouping is client-side: it re-buckets the rows already loaded, with no extra fetch.
Each group is a collapsible folder with a count in its header. A group-header checkbox selects the whole group (and renders indeterminate when only some rows under it are checked); per-row checkboxes select individuals. When you collapse a group or switch filters, off-screen checked ids are pruned from the selection so the counts in the bulk bar stay honest.
With anything selected, a bulk bar appears: Mark read and Resolve apply across the whole selection in one request (POST /api/v1/inbox/bulk) rather than a PATCH per row.
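The grouping and selection rules can be sketched as three small functions. This is a Python illustration of the logic only (the real UI is client-side web code): rows are modelled as dicts with id, kind, and payload, and the function names are hypothetical.

```python
# Payload field and fallback bucket label for each grouping mode.
GROUP_FIELDS = {
    "routine": ("pipeline_slug", "No routine"),
    "issue": ("issue_identifier", "No issue"),
    "crew": ("crew_id", "No crew"),
}


def group_rows(rows: list, mode: str) -> dict:
    """Re-bucket already-loaded rows; insertion order of groups follows the list order."""
    groups = {}
    for row in rows:
        if mode == "type":
            key = row["kind"]
        else:
            field_name, fallback = GROUP_FIELDS[mode]
            key = (row.get("payload") or {}).get(field_name) or fallback
        groups.setdefault(key, []).append(row)
    return groups


def header_state(group: list, selected_ids: set) -> str:
    """State of the group-header checkbox for the current selection."""
    checked = sum(1 for row in group if row["id"] in selected_ids)
    if checked == 0:
        return "unchecked"
    return "checked" if checked == len(group) else "indeterminate"


def prune_selection(selected_ids: set, visible_ids: list) -> set:
    """Drop checked ids that are no longer on screen so bulk counts stay accurate."""
    return selected_ids & set(visible_ids)
```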
Bulk Resolve never silently closes a decision item. If the selection contains any waitpoint / escalation row or any blocking=true row, a confirmation dialog spells out how many items will be left open and resolves only the dismissable remainder. If the entire selection is decision items, there is no Resolve action at all — open each one individually and act on its source. Mark read is always one click and closes nothing. Even if a client bypasses these guards, the server backstop skips the same rows, so a real decision item can never be resolved by the bulk path. The one exception, applied identically client- and server-side, is a keeper-synthetic escalation (kind=escalation, no backing escalations row): it has no source to defer to, so bulk Resolve clears it instead of skipping it.
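The skip rule that both the client and the server backstop apply might look like the following. This is a Python sketch with assumed row shapes: a real escalation is assumed to be recognisable by an escalation_type in its payload, and the function names are hypothetical.

```python
def is_synthetic_escalation(row: dict) -> bool:
    """kind=escalation with no backing escalations row (assumed: no escalation_type)."""
    payload = row.get("payload") or {}
    return row["kind"] == "escalation" and not payload.get("escalation_type")


def is_decision_item(row: dict) -> bool:
    """Rows that bulk Resolve must leave open."""
    if is_synthetic_escalation(row):
        return False  # the one exception: nothing to defer to
    return row["kind"] in ("waitpoint", "escalation") or bool(row.get("blocking"))


def partition_bulk_resolve(rows: list) -> tuple:
    """Split a selection into (ids to resolve, ids left open)."""
    resolvable, skipped = [], []
    for row in rows:
        (skipped if is_decision_item(row) else resolvable).append(row["id"])
    return resolvable, skipped
```

In this model an empty first list corresponds to the case where no Resolve action is offered at all, and a non-empty second list is the count the confirmation dialog would report.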
A row whose kind is an unknown value renders as a generic notification (hardened fallback) instead of crashing the row — new server-side kinds degrade gracefully in an older UI.

Realtime

The list is refreshed by two WebSocket events on the workspace channel:
| Event | Source | Effect |
| --- | --- | --- |
| inbox.updated | inbox_handler.go after any PATCH | Re-fetches the current filter |
| escalation.created | escalation_handler.go on POST | Re-fetches so a new escalation lights the list instantly |
The useInbox hook (hooks/use-inbox.ts) listens to both and re-issues the same workspace-scoped GET on each. There is no dedicated event for waitpoint creation; the bell badge uses useInboxUnreadCount which adds a 30-second poll alongside the same realtime listeners, and that poll is what surfaces a newly-parked waitpoint within the next tick.
The badge stays in sync across tabs and devices: an approve from your laptop clears the bell on your phone within a few hundred milliseconds because both subscribe to the same inbox.updated workspace event.
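The combination of event-driven refresh and a 30-second poll can be modelled without any networking. The sketch below is Python, not the TypeScript hook: UnreadCounter and its method names are hypothetical, fetch_count stands in for the workspace-scoped GET, and time is passed in explicitly so the behaviour is easy to follow.

```python
REFRESH_EVENTS = {"inbox.updated", "escalation.created"}
POLL_INTERVAL_S = 30.0


class UnreadCounter:
    """Keeps an unread count fresh from realtime events plus a periodic poll."""

    def __init__(self, fetch_count, now: float = 0.0):
        self._fetch = fetch_count
        self._last_poll = now
        self.count = fetch_count()

    def on_event(self, name: str) -> bool:
        """Re-fetch on the two realtime events; ignore everything else."""
        if name not in REFRESH_EVENTS:
            return False
        self.count = self._fetch()
        return True

    def tick(self, now: float) -> bool:
        """Re-fetch when the poll interval has elapsed since the last poll."""
        if now - self._last_poll < POLL_INTERVAL_S:
            return False
        self._last_poll = now
        self.count = self._fetch()
        return True
```

A newly parked waitpoint emits neither event, so in this model it only becomes visible when the next tick crosses the interval.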

CLI

The Inbox has full CLI parity — everything the web surface exposes is scriptable. Same items, same lifecycle, same kind taxonomy. The contract is “anything an operator can do in the UI must be pipe-glueable”, so cron-driven Slack pings or CI gates on unread waitpoints are first-class.
| Command | What it does |
| --- | --- |
| crewship inbox list | Show unread items (default). --state all\|read\|resolved, --kind waitpoint\|escalation\|failed_run\|message, --limit, --format json\|yaml\|quiet. JSON now includes payload, sender_type, and (for agent senders) avatar_seed + avatar_style. |
| crewship inbox get <id> | Show one item with its full markdown body and the structured Context (payload); the read counterpart of the web detail pane. --format json for scripting. |
| crewship inbox read <id> | Mark an item as read. |
| crewship inbox unread <id> | Flip back to unread (also the way to restore an archived item). |
| crewship inbox resolve <id> | Mark resolved. Optional --action approved\|denied\|retried\|cancelled\|acknowledged\|dismissed to record the decision shape. |
| crewship inbox archive <id> | Gmail-style archive: clear a non-decision item from the inbox without a decision (resolve with action=archived). Restore with crewship inbox unread <id>. |
Examples:
# List unread items in the current workspace
crewship inbox list

# Include resolved items, narrow to waitpoints
crewship inbox list --state all --kind waitpoint

# Pipe into jq to count by kind
crewship inbox list --format json | jq '[.[] | .kind] | group_by(.) | map({kind: .[0], count: length})'

# Mark something resolved with the action shape that matches what you did
crewship inbox resolve abc123 --action approved

# Push the bell count into Slack every 5 minutes (a crontab entry must stay on one line)
*/5 * * * * count=$(crewship inbox list --format json | jq length); [ "$count" -gt 0 ] && curl -X POST "$SLACK_HOOK" -d "{\"text\":\"$count inbox items\"}"
crewship inbox resolve only flips the inbox row. It does not call the source endpoint — approving a waitpoint through to the executor requires crewship approvals approve <id>, and resolving an escalation in its own lifecycle requires crewship escalation resolve <id>. The inbox CLI is the read + acknowledge surface; the source CLIs are the act-on-it surface.
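For scripts that outgrow jq, the same count-by-kind can be done in Python. This sketch assumes, as the jq example does, that crewship inbox list --format json prints a JSON array whose objects carry a kind field; the function names are hypothetical, and only flags documented in the table above are used.

```python
import json
import subprocess
from collections import Counter


def count_by_kind(list_json: str) -> dict:
    """Count inbox items per kind from the JSON text printed by the CLI."""
    items = json.loads(list_json)
    return dict(Counter(item["kind"] for item in items))


def fetch_unread_json() -> str:
    """Run the CLI and return its raw JSON output (requires crewship on PATH)."""
    result = subprocess.run(
        ["crewship", "inbox", "list", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def has_unread_waitpoints(list_json: str) -> bool:
    """CI-gate style check: is any approval still waiting on the current user?"""
    return count_by_kind(list_json).get("waitpoint", 0) > 0
```

A CI step could call has_unread_waitpoints(fetch_unread_json()) and exit non-zero when it returns True. Like the CLI itself, this only reads the inbox and never acts on the source.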

Permissions

Inbox items are user-scoped via the target fields on the row, which give three targeting cases:
| Targeting | Who sees it |
| --- | --- |
| target_user_id is set | Only that user |
| target_role is set | Every member with that role in the workspace |
| Both empty | Every member of the workspace (the default for waitpoints and routine messages) |
inboxVisibilityClause in internal/api/inbox_handler.go enforces the predicate on every endpoint that reads or changes a row (List, UnreadCount, PatchState). Admins do not automatically see other users’ targeted items: least privilege wins over convenience, the same way Linear’s own Inbox works.
If you genuinely need a workspace-wide view of who has what pending (say, to audit whether a payroll-grade routine has been sitting on someone’s bell all weekend), use a privileged data path: query the inbox_items table directly (admin-side SQL, bypassing inboxVisibilityClause) or the pipeline_waitpoints table for gate-side state. crewship inbox list --state all is still user-scoped: --state only widens the state filter, it does not lift the ACL, so the CLI alone will not surface items targeted at other users.
The journal records pipeline.run.failed + pipeline.step.validation_failed + escalation.created for related signals, but waitpoint creation itself lives on the pipeline_waitpoints row, not in the journal stream.
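The visibility rule reads naturally as a predicate. This Python sketch is an illustration of the targeting table, not inboxVisibilityClause itself: the can_see name is hypothetical, workspace membership is assumed to be checked elsewhere, and giving target_user_id precedence when both fields are set is an assumption.

```python
def can_see(item: dict, user_id: str, user_roles: set) -> bool:
    """Decide whether a workspace member may see an inbox item."""
    target_user = item.get("target_user_id")
    target_role = item.get("target_role")

    if target_user:
        # User-targeted rows are private to that user; there is no admin bypass.
        return target_user == user_id
    if target_role:
        return target_role in user_roles
    # Both empty: visible to every member of the workspace.
    return True
```

Note that the caller's roles are only consulted for role-targeted rows, which is why an admin role does not widen what a user sees.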

What this replaces

Before the IA refactor, the same signals were scattered across:
  • A modal-style Approvals queue at /approvals (waitpoints only)
  • Per-mission notifications on /missions/<id> (escalations, message-style nudges)
  • A failed-run banner that lived inside the run detail panel
  • Top-bar bell that only showed the count, not the list
The Inbox folds all four into one feed. /approvals still exists for the workspace-admin “show me every pending waitpoint workspace-wide, including ones not targeted at me” use case, but the daily-driver surface is /inbox.

Approving an ephemeral hire

When an operator (or a LEAD agent via the sidecar /spawn endpoint) runs crewship hire against a guided-autonomy crew, the agent row is created with status='PENDING_REVIEW' instead of IDLE. The chatbridge refuses to start the agent in this state — see internal/chatbridge/bridge.go. A waitpoint-kind inbox row lands at the same time so the operator has one place to approve. The row in the inbox UI:
┌─ Inbox · Pending ─────────────────────────────────────┐
│ ⏳ Hire pending review                                 │
│    incident-responder-a3f8 hired into on-call         │
│    "P1 incident #4582 needs sustained eyes-on"        │
│    [Approve hire]    "To deny, fire the agent from   │
│                       its crew page."                 │
└───────────────────────────────────────────────────────┘
The Approve hire button calls POST /api/v1/agents/{agentId}/approve-hire. The handler does four things atomically:
  1. UPDATE agents SET status='IDLE' WHERE id=? AND status='PENDING_REVIEW' — guarded WHERE so a concurrent approve (zero rows affected) returns 409 rather than silently writing nothing; a soft-deleted/missing agent already 404s at the preceding load
  2. inbox.ResolveBySource — flips this inbox row to resolved so it doesn’t reappear
  3. Journal entry agent.hire_approved with the actor’s user_id
  4. WebSocket broadcast agent.hire_approved so the agent canvas + crew roster repaint instantly
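The guarded UPDATE in step 1 is the part that makes concurrent approvals safe, and it can be demonstrated with an in-memory SQLite database. This is a Python sketch, not the Go handler: the table layouts, the source_id column, and the approve_hire function are hypothetical, and the WebSocket broadcast is left as a comment.

```python
import sqlite3

# Hypothetical minimal schema, just enough to exercise the handler steps.
SCHEMA = """
CREATE TABLE agents(id TEXT PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE inbox_items(id INTEGER PRIMARY KEY, source_id TEXT, state TEXT, resolved_action TEXT);
CREATE TABLE journal(event TEXT, agent_id TEXT, user_id TEXT);
"""


def approve_hire(conn: sqlite3.Connection, agent_id: str, user_id: str) -> int:
    """Approve a pending hire and return an HTTP-style status code."""
    found = conn.execute("SELECT 1 FROM agents WHERE id = ?", (agent_id,)).fetchone()
    if found is None:
        return 404  # missing agent is rejected at the preceding load

    with conn:  # one transaction: commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE agents SET status = 'IDLE' WHERE id = ? AND status = 'PENDING_REVIEW'",
            (agent_id,),
        )
        if cur.rowcount == 0:
            return 409  # someone else already approved: nothing was written
        conn.execute(
            "UPDATE inbox_items SET state = 'resolved', resolved_action = 'approved' "
            "WHERE source_id = ? AND state != 'resolved'",
            (agent_id,),
        )
        conn.execute(
            "INSERT INTO journal(event, agent_id, user_id) VALUES ('agent.hire_approved', ?, ?)",
            (agent_id, user_id),
        )
    # A real handler would broadcast agent.hire_approved over WebSocket here.
    return 200
```

Checking the affected-row count, rather than reading the status first and then writing, closes the window in which two approvals could both believe they succeeded.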
There’s intentionally no Deny button today. To deny a pending hire, the operator opens the crew page and fires the agent (DELETE /api/v1/agents/{id}) — same path as deleting any agent. Two reasons:
  • Avoids a second backend code path with its own audit semantics for the same effect
  • Forces the operator to look at the crew context before rejecting (a “deny” button in the inbox is easy to click mid-coffee; firing requires opening the crew and seeing what’s there)
A symmetric “Deny” button is tracked as a UX follow-up — see gh issue list --label needs-design.

Network-failure handling

The Approve button’s fetch() is wrapped in try/catch. An offline / DNS / CORS preflight failure surfaces a toast (Approve failed: <network error>) instead of silently clearing the busy state, so the operator knows immediately that the request didn’t land and can retry without re-typing the action. The same pattern is applied to the retry-routine button (wrap("retried")).
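The wrap pattern is language-independent; here it is as a Python analogue of the browser code, for illustration only. The run_action name and its return shape are hypothetical, and OSError stands in for the network-level failures that fetch() would reject with.

```python
def run_action(label: str, send) -> tuple:
    """Call send() and report the outcome as (ok, toast_message).

    Network-level failures are turned into a visible message instead of being
    swallowed, so the caller can clear the busy state and offer a retry.
    """
    try:
        send()
    except OSError as exc:  # ConnectionError, timeouts, DNS failures, ...
        return False, f"{label} failed: {exc}"
    return True, None
```

Passing the request as a callable lets the same wrapper serve both the Approve and the retry-routine buttons.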

Keeper Phase 2 inbox rows

The four F4 evaluators route every DENY or ESCALATE decision to the inbox (PR-C). Each row shows:
  • The evaluator type (Skill Review / Behavior / Memory Health / Negative Learning) in the header
  • The proposed action (“Recommend archiving skill X”, “Block tool_name=shell_exec”, “Auto-consolidate workspace memory”, “Land lesson ‘check env vars before deploy’”)
  • A risk score badge (0-10 colour-coded)
  • An expanded section with the LLM’s full rationale (reason) and the agent / crew context
Operators see the row, decide, and click Approve or Reject:
  • Approve lands the proposed action and writes a journal entry recording the decision
  • Reject discards the proposal and records the dismissal; the evaluator gets the rejection as a prior_lesson hint on its next run for the same context (dup-suppression — see F4.4 negative-learning)
For the full operator workflow on these rows (when to approve, what to investigate, how to override an evaluator), see the Keeper Reviews panel, which is the dedicated triage surface (the inbox shows individual rows; the panel shows the queue).

Self-learning gated proposals

When an agent has self_learning_enabled=0 (the default), the F4.4 negative-learning evaluator’s ALLOW path and the F6 persona-suggest gate-demoted path BOTH route through the inbox as blocking rows. The marker "self_learning_gate": "off" in the row’s payload_json distinguishes these from policy-routed inbox rows. The row presents the proposed change verbatim:
┌─ Inbox · Pending ─────────────────────────────────────┐
│ 💡 Lesson proposal (gated by self_learning=OFF)        │
│    Anna proposes: "Always check DATABASE_URL is set   │
│                    before deploy.sh"                  │
│    Kind: negative · Source: negative_learning         │
│    Reason: "Recurring deploy-env mistake worth a      │
│             permanent lesson"                         │
│    [Approve lesson]   [Reject]                        │
└───────────────────────────────────────────────────────┘
Approval lands the lesson into the agent’s lessons.md via consolidate.WriteLesson (PR-Z Z.7) — the same writer the auto-apply path uses, with the same schema + idempotency + flock contract. Rejection discards the proposal and records the dismissal so the same lesson doesn’t re-propose on the next failure. Operators who don’t want to gate every proposal can flip self_learning_enabled=1 on the agent via the Agent canvas → Settings → Learning posture panel. See Autonomy + self-learning for the full posture-vs-policy semantics.
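Scripts that post-process inbox rows can tell gated proposals apart by the payload marker. The following Python sketch assumes each row exposes its payload as a JSON string under payload_json, as named above; the helper names are hypothetical.

```python
import json


def is_self_learning_gated(row: dict) -> bool:
    """True when the row was routed to the inbox because self-learning is off."""
    payload = json.loads(row.get("payload_json") or "{}")
    return payload.get("self_learning_gate") == "off"


def gated_rows(rows: list) -> list:
    """Keep only the gated proposals, leaving policy-routed rows out."""
    return [row for row in rows if is_self_learning_gated(row)]
```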

What’s next

  • Routines — the workflow recipes whose wait steps create waitpoints
  • Harbormaster — the approval-gate framing for high-risk actions
  • Activity — the live trace view where inline Approve/Deny on a paused step does the same thing as the Inbox’s Approve button
  • Ephemeral agents — the hire/rehire lifecycle that creates the pending-review inbox rows
  • Keeper Reviews panel — the dedicated triage surface for F4 evaluator decisions
  • Autonomy + self-learning — the per-agent flag that gates F4.4 + F6 proposals through this inbox