Chat & Sessions

Overview

Chatting with an agent is a first-class surface in Crewship. PRs #213, #214, #223, and #225 re-shaped it from a side drawer into a full-page experience with session history, attachments, message reactions, and a right-panel that exposes the agent’s working files, artefacts, and a live terminal. The split-screen layout from PR #314 finished the job: chat on the left, the agent’s workspace surfaces on the right, both live and both deep-linkable.

A “session” in Crewship is intentionally durable. It’s a chats row scoped to one agent, with a stable ID that survives page reloads, browser restarts, and operator hand-off; anyone with read access to the agent can resume the same conversation history. The schema has grown alongside the UI: PR #225 added chat_branches (edit-and-resend trees), message_reactions, chat_attachments, and workspace_files, plus chats.origin (migration v59) for grouping in the Sessions sidebar by where the conversation started (one of UI / CLI / WEBHOOK / CRON / AGENT, written by POST /api/v1/agents/{agentId}/chats). The point of the breadth: a chat is more than text. It’s a typed transcript backed by structured tables so search, audit, and replay all work the same way no matter which surface produced the message.

This guide covers the model, the URL state, and the surfaces a power user (or integrator) cares about. The component reference lives in the frontend codebase (app/(chat)/chat/) and the persistence layer lives in internal/api/agent_chats.go + the Go migrations under internal/database/migrate.go; here we focus on what the platform persists and how the URL state composes with it.

When to use it

Most chat surface usage is self-evident — “talk to the agent” — but the durable-session design unlocks several flows worth reaching for explicitly:
  • Resume a multi-day conversation. A session ID survives reloads and operator hand-off, so /chat/<agent>?session=<chatId> is the canonical “where we left off” URL. Bookmark it, paste it into Linear, share it on Slack — the recipient lands on the same transcript, the same right-panel state, and (with &panel=files) the same artefact tab.
  • Reroll a bad agent response without abandoning context. Hover-edit the user message and the conversation truncates from that point; the new edit sends as a fresh turn. The transcript before the edit is preserved as a chat branch so the alternate timeline isn’t lost — useful when you discover the original direction was right after all.
  • Triage what an agent has been doing. The Sessions sidebar groups by chats.origin (UI / CLI / WEBHOOK / CRON / AGENT) so you can see at a glance which conversations were started by the user vs by automation vs by a peer agent. Filter the sidebar to WEBHOOK-origin sessions if you want to audit what an integration has been driving through your agents.
  • Watch an agent work in real time without opening the journal. The right-panel Files, Artifacts, Terminal, and Diff tabs stream the live state of the agent’s container while the chat is open. Faster than tailing logs, and the URL deep-links to a specific tab via &panel=files / &panel=terminal.
  • Programmatic chat creation from CLI / webhook / cron. POST /api/v1/agents/{agentId}/chats accepts an origin enum on the body and renders a coloured chip in the UI sidebar. Use CLI for crewship run-style invocations, WEBHOOK for trigger-handlers, CRON for scheduled routines, AGENT for peer-to-peer assignments — the chip helps a human operator scrolling the sidebar understand provenance at a glance.
For one-off “ask the agent a question” use, just open /chat/<agent> and start typing — the session row gets created automatically on first send.

Key concepts

| Term | What it means here |
| --- | --- |
| Session | One row in chats, scoped to one agent. Stable ID, persists across reloads and operators. Anyone with read access to the agent can resume the same transcript. Created lazily on first send (or explicitly via POST /api/v1/agents/{agentId}/chats). |
| chats.origin | A NULL-able enum stamped at create time: UI, CLI, WEBHOOK, CRON, or AGENT. Anything else is stored as NULL so a rogue caller cannot inject arbitrary text into a UI chip. Migration v59. |
| Chat branch | A divergence in the conversation tree produced by edit-and-resend. The truncated tail isn’t deleted; it lives in chat_branches (migration v57) so the alternate timeline stays retrievable. The active branch is the one currently rendered. |
| Slash palette | Server-registered command set surfaced by typing / at start-of-line or Ctrl+K anywhere. Commands like /eval, /checkpoint, /skill are parsed as structured commands (not chat messages) and route to dedicated handlers. |
| Model picker | Per-turn override of the agent’s llm_model. Lives next to the send button. The override applies only to the current send; the agent’s default model is unchanged for subsequent turns. |
| Attachment | A file uploaded into a chat: image, document, log, anything ≤25 MB. Stored on the server’s storage provider with a chat_attachments row pointing at it (migration v57). The agent’s container sees the file under /output/<agentSlug>/attachments/<chatId>/<filename>. |
| Reaction | A workspace-shared emoji vote on a single assistant message. Persisted in message_reactions (migration v57) with UNIQUE(chat, message, emoji, user) so POST is idempotent. Aggregated counts ({emoji, count, mine}) returned by List. |
| Right-panel tab | One of Files (F1), Artifacts (F2), Diff (F3), or Terminal (F4). Each tab streams independently; switching tabs doesn’t retrigger inactive streams. URL state &panel=… deep-links to a tab. |
| Shallow router state | Browser-only URL params written via window.history.replaceState (helper: useShallowSearchParam). Used for which panel is open, which message is being edited, and slash-palette state. None of these justify a full Next.js navigation, but all of them should appear in copy-pasted URLs. |
| request_id | A client-generated UUID stamped on every send. The server idempotently dedupes by request_id so a retry after a WS reconnect never duplicates the user turn. Surfaced on outbound messages so clients can correlate sent-vs-acknowledged. |
| Presence chip | The small status indicator in the composer footer: online / thinking / blocked / offline. Driven by the Watch Roster; not a guess from idle timers. |
| Pinned session | A session with a per-user pin flag set. Pinned sessions float to the top of the sidebar regardless of last-message timestamp. Useful for “the conversation I always come back to”. |
| workspace_files | The durable, workspace-scoped blob index (migration v57). The Files right-panel reads from this view; mid-edit files live there until the agent’s next commit. |

Usage

The end-to-end loop from “I have an agent” to “we had a productive conversation about it” is five steps. Each step has a dedicated section below with the full surface area; this is the orientation walk-through.
1. Open the chat

https://crewship.example.com/chat/viktor
Drops you into Viktor’s most recent session. To start fresh, click New session in the sidebar. To jump to a specific past conversation, paste the deep-link URL ?session=<chatId>. To approach from the canvas instead, use /crews?crew=<slug>&agent=viktor — see Routes.
2. Send a message

Type into the composer and hit Enter. The agent’s response streams back via WebSocket; the presence chip in the footer shows thinking while the model is generating. Useful composer tricks (full detail in Composer):
  • / to open the slash palette for structured commands (/eval, /checkpoint, /skill <name>).
  • Drag, paste, or click-upload an image / file — capped at 25 MB, surfaced to the agent under /output/<agentSlug>/attachments/….
  • Pick a different model from the dropdown next to Send for this turn only (handy for switching to Opus mid-conversation for a hard step).
3. Watch the agent work

While the agent is acting, flip the right panel to the relevant tab — see Right panel for the full tab map:
  • F1 / Files — live working tree under /workspace.
  • F2 / Artifacts — generated reports, screenshots, downloads.
  • F3 / Diff — live diff vs. the workspace’s last commit, while the agent is mid-edit.
  • F4 / Terminal — read-only tmux attach (OWNER/ADMIN can flip to read-write).
Each tab is its own URL query param (&panel=files, etc.), so you can share the exact view you’re staring at.
4. React or reroll

Hover an assistant message to surface the reactions strip (👍 👎 ❤️ 🤔 🚀). Reactions are workspace-shared and surface back to the agent on the next turn so it can adjust. Hover a user message instead to surface Edit — editing rewrites that turn and truncates the conversation, but the old branch lives on in chat_branches and can be revisited later via the branch picker. See Edit-and-resend.
5. Resume later

Just copy the URL. The session ID is in ?session=<chatId>, the right-panel state is in &panel=…. Bookmark, paste into Linear, share over Slack — anyone with read access on the agent lands on exactly the same view (their reactions are scoped to them, the session itself is shared). Pin the session in the sidebar if it’s a long-running thread you want to surface above the timestamp-sorted list.
For non-UI surfaces — CLI-initiated chats, webhook-driven chats, scheduled chats — see the API reference below; the origin field on the create payload stamps the right sidebar chip so an operator scrolling the list can tell at a glance which conversations came from automation.

Examples

Pair-debugging a failing test

A test is red in CI and you want the agent’s eyes on it. Open the agent and start a new session:
https://crewship.example.com/chat/viktor
Drag the failing test’s snapshot file into the composer — it lands at /output/viktor/attachments/<chatId>/snapshot.txt inside the container; the agent opens it via the Files tab (F1). As the agent edits the test, flip to Diff (F3) to watch the patch take shape against HEAD; when it wants to run the suite, switch to Terminal (F4) and watch the output stream live. Each tab is its own URL param (&panel=files / &panel=diff / &panel=terminal), so the link you share with a teammate drops them onto the exact view you’re looking at. When the suite goes green, 👍 the assistant’s final message — the reaction surfaces back to the agent on the next turn as positive signal and shows up in Quartermaster replays as coarse human feedback.

Webhook-driven issue triage

A GitHub issue webhook fires; your trigger handler posts to Crewship:
curl -X POST https://crewship.example.com/api/v1/agents/triage-bot/chats \
  -H "Authorization: Bearer crewship_cli_xxxxx" \
  -H "Content-Type: application/json" \
  -d '{"origin": "WEBHOOK"}'
# {"id":"cht_xyz789"}
The create call only stamps the session shell (session_id?, origin?); the opening message is sent as a follow-up turn on the returned chat id. The new session shows up in the agent’s sidebar with a coloured WEBHOOK chip, so an operator scrolling the list can distinguish auto-triaged issues from human-initiated conversations at a glance. The agent processes the message, does its triage, posts a summary — and a human can take over the conversation by opening ?session=cht_xyz789 and replying directly.

Cross-operator escalation

You’re chatting with viktor about a production outage and realise this needs an SRE’s eyes. Pin the session in the sidebar so you can come back to it, copy the URL with the right panel set to Terminal:
/chat/viktor?session=cht_abc123&panel=terminal
Paste into the SRE channel on Slack with one line: “Live terminal on the agent’s repro — viktor’s pinned this for me.” The SRE clicks and lands on the same chat session with the same active right-panel tab (terminal) and the same live exec stream, though their own reactions are scoped to them. They reply in the chat; their messages join the same transcript, visible to both of you and to anyone else who opens the URL. When the incident closes, the session row survives: it stays as a permanent record of what was said, what was tried, and what fixed it. The webhook-triggered sessions from the triage example above and the human-driven escalation here live in the same chats table, distinguishable only by their origin chip in the sidebar.

Long conversations & context compaction

A session is durable on disk, but each agent turn only sees as much of the transcript as fits the model’s context window. The orchestrator (internal/orchestrator/orchestrator_run_conv.go) sizes a conversation budget and fills it newest-first, so the most recent turns are always present in full. When a conversation outgrows that budget, the older turns are compacted rather than dropped. If an auxiliary model is configured (the same summarizer slot Crewship uses for memory consolidation — KEEPER_OLLAMA_URL + KEEPER_MODEL), the overflow is summarized into a short block prepended to the verbatim recent window:
[EARLIER CONVERSATION — SUMMARY of older messages no longer shown in full]
…decisions made, facts established, completed work, open threads…
[END EARLIER CONVERSATION]
[CONVERSATION HISTORY - previous messages in this session]
…recent turns, verbatim…
A fixed slice of the budget (15%) funds the summary so it never crowds out the recent turns. The summary is regenerated per turn and lives only in that turn’s prompt — it is mid-conversation content, never part of the cached system-prompt prefix, so it doesn’t perturb prompt caching. If no auxiliary model is configured, the summarize call fails, or it exceeds a bounded timeout, the orchestrator falls back to plain newest-first truncation — the historical behavior — so a misconfigured or slow aux model can never stall or fail a run. In that mode the oldest turns drop with a …(truncated) marker on the boundary message, exactly as before. Whenever older turns overflow the budget — whether they were summarized or dropped — the orchestrator emits a conversation.compacted entry to the Crew Journal carrying session_id, overflow_messages, summarized, and summary_bytes. Query it with crewship journal --type conversation.compacted to audit what fell out of an agent’s context window over time.
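The budget logic above can be sketched roughly as follows. This is an illustrative Python model, not the orchestrator's Go code: the names `Turn`, `compact`, and `summarize` are hypothetical, sizes are counted in characters rather than tokens, and whole turns are dropped instead of truncating the boundary message. Only the 15% slice, the newest-first fill, and the fall-back-on-failure behavior come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

SUMMARY_PERCENT = 15  # fixed slice of the budget that funds the summary


@dataclass
class Turn:
    role: str
    text: str


def compact(
    turns: List[Turn],
    budget: int,
    summarize: Optional[Callable[[List[Turn]], str]] = None,
) -> Tuple[Optional[str], List[Turn], int]:
    """Return (summary, recent_turns, overflow_count). Sizes are in characters."""
    if sum(len(t.text) for t in turns) <= budget:
        return None, list(turns), 0

    # Reserve the summary slice only when an auxiliary summarizer exists.
    summary_budget = budget * SUMMARY_PERCENT // 100 if summarize else 0
    recent_budget = budget - summary_budget

    # Fill newest-first so the most recent turns are kept verbatim.
    recent: List[Turn] = []
    used = 0
    for turn in reversed(turns):
        if used + len(turn.text) > recent_budget:
            if not recent:
                # Lone oversized message: truncate so the window is never empty.
                cut = turn.text[:recent_budget] + "…(truncated)"
                recent.append(Turn(turn.role, cut))
            break
        recent.append(turn)
        used += len(turn.text)
    recent.reverse()

    overflow = turns[: len(turns) - len(recent)]
    summary = None
    if summarize and overflow:
        try:
            summary = summarize(overflow)[:summary_budget]
        except Exception:
            summary = None  # fall back to plain newest-first truncation
    return summary, recent, len(overflow)
```

In this sketch a failing or missing summarizer yields the same recent window with no summary, which mirrors the guarantee that a slow or misconfigured aux model never fails the run.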

Temporally-anchored summaries

The summary is written as dated past-tense facts, not open instructions. The aux model is told today’s date (UTC) and asked to rewrite completed or imperative actions accordingly — “Sent the report on 2026-06-09” rather than a bare “email the report” that a resumed agent could mistake for a still-pending task and re-run. When a date can be recovered from the transcript itself the model prefers it; otherwise it anchors to “around <today>”. This keeps long-running sessions from silently re-issuing work that already completed before the older turns scrolled out of the window.

Robustness at the overflow boundary

Two guards keep compaction safe on long, tool-heavy sessions:
  • Tool context is never half-shown. When the message that sits on the overflow boundary carries a tool-result summary, it is dropped into the overflow whole (where the summarizer can still capture it) rather than truncated into a half-sentence that references a tool whose result was cut. The one exception is a lone oversized message — it is still truncated so the recent window is never empty.
  • Summaries are session-pure. Compaction operates strictly on one session’s history; if a future change ever mixed turns from different sessions into one overflow slice, the orchestrator detects the mismatch and falls back to truncation instead of risking one session’s content leaking into another’s summary.
The leading bytes of an agent’s system prompt are unaffected by any of this — the summary block is injected after the cached system-prompt prefix, so repeated compaction never perturbs prompt caching.

API reference

The chat surface is small — no dedicated /api-reference/chats page yet; the handler source is internal/api/agent_chats.go. Every endpoint requires Authorization: Bearer <CLI token> or a valid session cookie, and access is gated by the caller’s read/write role on the parent agent.

Sessions (chats)

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /api/v1/agents/{agentId}/chats | Create a session. Body: {session_id?, origin?}. origin is one of UI / CLI / WEBHOOK / CRON / AGENT; anything else is stored as NULL. Returns {"id": "<chatId>"}. |
| GET | /api/v1/agents/{agentId}/chats | List the agent’s sessions, ordered by most recent activity. Each row carries last_activity_at (bumped on every message append) and unread_count (messages the calling user hasn’t read, their own excluded). |
| PUT | /api/v1/agents/{agentId}/chats/{chatId}/read | Mark the session read for the calling user: advances their per-chat read cursor (unread_count drops to 0) and clears the paired “agent replied” inbox notification. Returns {chat_id, last_read_at}. Cross-tenant or mis-nested ids return 404. |

Messages

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/chats/{chatId}/messages | Paginated message history (handler ProxyHandler.ChatMessages). |

Reactions

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/chats/{chatId}/messages/{messageId}/reactions | Aggregated counts: [{emoji, count, mine}]. |
| POST | /api/v1/chats/{chatId}/messages/{messageId}/reactions | Body: {emoji}. Idempotent under UNIQUE(chat, message, emoji, user). |
| DELETE | /api/v1/chats/{chatId}/messages/{messageId}/reactions/{emoji} | Emoji is a path segment. |

Attachments

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /api/v1/agents/{agentId}/chats/{chatId}/attachments | Upload a file (multipart, ≤25 MB). Returns the file’s storage URL + the chat_attachments row ID. Handler: ProxyHandler.AgentChatAttachment in internal/api/proxy_attachments.go. |

Participants

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/chats/{chatId}/participants | List participants: {participants:[{user_id, email, full_name, role, joined_at}]}. |
| POST | /api/v1/chats/{chatId}/participants | Body {user_id, role?} (role is member or owner, default member). role is an informational label only; it does not grant manage-participants rights (see below). Promotes the chat to visibility='group' and seeds the creator as an owner row. Returns 204. |
| DELETE | /api/v1/chats/{chatId}/participants/{userId} | Remove a participant. Returns 204. |
Access and tenancy rules:
  • Scoped via chats.workspace_id — a cross-tenant or unknown chat returns 404.
  • Mutations (POST / DELETE) require the chat creator or a workspace OWNER/ADMIN; any other caller gets 403 “only the chat owner or a workspace admin can manage participants”. A participant whose role is owner is not thereby granted this right: manage-participants permission comes only from being the chat creator or a workspace admin, independent of the per-participant role label.
  • Plain members can GET the participant list but cannot mutate it.
  • The target user must already be a workspace member, else 400 “user is not a member of this workspace”.
  • Removing the chat owner is rejected with 400 “cannot remove the chat owner”.
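The access rules above can be read as a small decision function. The sketch below is a hypothetical Python model, not the Go handler: the function name, its parameters, and the order in which the checks run are assumptions, as is applying the workspace-membership check only when adding a participant. The status codes and messages are the ones listed above.

```python
from typing import Tuple

ADMIN_ROLES = {"OWNER", "ADMIN"}  # workspace roles allowed to manage participants


def check_participant_mutation(
    is_chat_creator: bool,
    workspace_role: str,
    removing: bool,
    target_is_workspace_member: bool = True,
    target_is_chat_owner: bool = False,
) -> Tuple[int, str]:
    """Return (http_status, message) for a POST (add) or DELETE (remove)."""
    # The per-participant role label is deliberately not consulted here.
    if not (is_chat_creator or workspace_role in ADMIN_ROLES):
        return 403, "only the chat owner or a workspace admin can manage participants"
    if removing:
        if target_is_chat_owner:
            return 400, "cannot remove the chat owner"
    elif not target_is_workspace_member:
        return 400, "user is not a member of this workspace"
    return 204, ""
```

Note that the function has no input for the caller's participant role at all, which is the point of the rule: a participant labelled owner who is neither the creator nor a workspace admin still gets 403.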

Mid-turn steering

Sometimes you want to redirect an agent that is already mid-turn — “actually, focus on the auth bug first” — without waiting for the current run to finish or starting a competing second run in the same container.
| Method | Path | Purpose |
| --- | --- | --- |
| POST | /api/v1/chats/{chatId}/steer | Queue a steering message. Body: {message}. Returns {queued, in_flight}. Scoped via chats.workspace_id, so cross-tenant returns 404. |
The CLI parity is crewship chat steer:
# Redirect a chat — applied on the next turn
crewship chat steer c_abc123 --message "focus on the auth bug first"
crewship chat steer c_abc123 -m "use the staging DB, not prod"

# Scriptable
crewship chat steer c_abc123 -m "tighten the scope" --format json
# → {"queued":true,"in_flight":true}
Today’s behavior is queue-for-next-turn, not live injection. When a run is already in flight for the chat, the bridge detects it (an in-flight guard tracks live runs per chat) and queues the steering message rather than spawning a second run that would race the first into the same container. The queued message is persisted to conversation history tagged as a steer, and a steering_queued event is emitted on the chat’s session:<chatID> WebSocket channel; the agent picks it up on the next turn like any other user message.

in_flight in the response tells you whether a run was live when you steered (true → it lands after the current turn finishes; false → it lands at the start of the next turn you kick off). The steering text is run through the same content scan as memory and tool output, so a prompt-injection payload is rejected with 422 and nothing is persisted.

Live injection into a running turn (interrupting the model’s current generation to feed it the new instruction immediately) is a planned follow-up, not in this release.
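A minimal in-memory sketch of the queue-for-next-turn behavior, assuming a per-chat in-flight set and a per-chat queue. The class and method names (`SteerBridge`, `start_run`, `finish_run`, `steer`) are hypothetical, and persistence, the steering_queued event, and the content scan are left out.

```python
from collections import defaultdict
from typing import Dict, List, Set


class SteerBridge:
    """Toy model of the in-flight guard plus the steering queue."""

    def __init__(self) -> None:
        self._in_flight: Set[str] = set()
        self._queued: Dict[str, List[str]] = defaultdict(list)

    def start_run(self, chat_id: str) -> List[str]:
        """Begin a turn and return the steering messages it picks up."""
        self._in_flight.add(chat_id)
        return self._queued.pop(chat_id, [])

    def finish_run(self, chat_id: str) -> None:
        self._in_flight.discard(chat_id)

    def steer(self, chat_id: str, message: str) -> dict:
        """Queue a steering message; never spawns a competing run."""
        in_flight = chat_id in self._in_flight
        self._queued[chat_id].append(message)
        return {"queued": True, "in_flight": in_flight}
```

In both cases the message waits for the next `start_run`; the `in_flight` flag only tells the caller whether a turn has to finish first.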

Realtime (WebSocket)

The chat page subscribes to the per-session WebSocket channel (session:<chatID>) and listens for the events the runtime actually emits. There are no chat.message.* or chat.reaction.* events today — message turns are read back via REST after a successful POST, and reactions are local-state-only on the client.
| Event | Channel | When it fires |
| --- | --- | --- |
| assignment_created | session:<chatID> | The agent kicks off a new assignment from a user turn. |
| assignment_running | session:<chatID> | The assignment transitions into execution. |
| assignment_completed / assignment_failed | session:<chatID> | Terminal state: the agent’s reply (or the failure) is ready to fetch. |
| peer_query_running | session:<chatID> | A peer-query sub-task started inside this chat (multi-agent dispatch). |
| escalation_created / escalation_resolved | session:<chatID> | The agent escalated to a human, or the escalation was decided. |
| port_expose_created | session:<chatID> | The agent opened a container port and the proxy URL is ready. |
| steering_queued | session:<chatID> | A mid-turn steering message was queued (see Mid-turn steering); the agent applies it on the next turn. |
Each event triggers a REST re-fetch of the affected resource — the server-side state is authoritative and the next render is a function of that, not of imperative local mutations. Reconnect is idempotent: sends in-flight at disconnect time carry the same client request_id and the server dedupes.

Routes

Two top-level routes serve chat:
| Route | Purpose |
| --- | --- |
| /crews?crew=<slug>&agent=<slug> | Selection-driven canvas. Click an agent in the canvas to open a chat panel inline. |
| /chat/<agentSlug>?session=<chatId> | Full-page chat: the same agent, the same session model, the whole viewport for messages + composer + right panel. |
The /chat/<agentSlug> URL is deep-linkable — the session query parameter scopes to a specific chat history. Without ?session, the route opens the agent’s most recent chat. With a session ID, it scrolls to that chat’s history and resumes the conversation in place. The state of which tab is open in the right panel, which message is being edited, whether the slash palette is open, and the like is held in shallow router state — useShallowSearchParam writes to window.history.replaceState directly without triggering a Next.js navigation. This keeps the page interactive (no rerender on tab switch) while still giving you a copyable URL for any state.

Sessions

A “session” is a chats row. Sessions persist across page reloads, browser sessions, and operators (any user with read access to the agent can see its history). Migration v59 added the chats.origin column (DB-side; the Prisma TypeScript schema is intentionally not regenerated for this column — Prisma is types-only in this project). The Go handler at POST /api/v1/agents/{agentId}/chats accepts an origin field on the body and whitelists these values:
| origin value | When written |
| --- | --- |
| UI | Created via the chat page in the browser. |
| CLI | Created via crewship run or other programmatic CLI use. |
| WEBHOOK | Created by a webhook trigger (see Webhooks). |
| CRON | Created by a scheduled trigger. |
| AGENT | Created by a peer agent (agent-to-agent assignment). |
Anything else (or empty) is stored as NULL so a rogue caller cannot shove arbitrary text into a UI-rendered chip. The Sessions sidebar renders a colored chip per origin — NULL rows show no chip. Sessions can be renamed, archived, or pinned; pinning floats a session to the top of the sidebar regardless of last-message timestamp. Chat sessions honor the workspace preferred_language — the agent replies in it. Set it with crewship workspace update --language <code> (see crewship workspace).

Unread badges & last activity

The Sessions sidebar orders by last activity — a session jumps to the top whenever a message lands in it, regardless of when it was created (chats.last_activity_at, migration v130). Sessions with replies you haven’t seen show a subtle blue unread badge (dot + count). Read state is per user: a chat_read_cursors row records the last moment you marked the chat read, and unread_count is everything appended after that cursor that you didn’t write yourself. Opening a session marks it read automatically; the badge and the inbox notification clear together. From the CLI, the same data drives crewship chat list (an UNREAD column plus LAST ACTIVITY ordering), and crewship chat read is the mark-read parity command:
# Which sessions have replies I haven't seen?
crewship chat list atlas
# ID        TITLE            STATUS  MSGS  UNREAD  LAST ACTIVITY     ORIGIN
# c_abc123  Deploy help      ACTIVE  6     2       2026-07-02 09:41  UI

# Read the transcript, then clear the badge + bell item
crewship chat c_abc123
crewship chat read c_abc123            # agent auto-resolved from the chat
crewship chat read c_abc123 --agent atlas   # skip the lookup scan
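The unread rule described above (everything appended after your read cursor that you didn’t write yourself) can be sketched in a few lines of Python. The `Message` type and `unread_count` function are hypothetical stand-ins, and treating a missing cursor as "nothing read yet" is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Message:
    author_id: str
    created_at: datetime


def unread_count(
    messages: List[Message], user_id: str, last_read_at: Optional[datetime]
) -> int:
    """Count messages after the user's cursor, excluding their own."""
    return sum(
        1
        for m in messages
        if m.author_id != user_id
        and (last_read_at is None or m.created_at > last_read_at)
    )
```

Marking the chat read corresponds to moving `last_read_at` forward to now, after which the count is 0 until another author appends a message.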

“Your agent replied” notifications

Long-running turns shouldn’t require babysitting the tab. When an assistant reply is persisted for a session and a chat user is not currently viewing it (no live WebSocket subscription on session:<chatId>), Crewship writes a message-kind item into the unified inbox: agent name, session title, a scrubbed ~120-character preview, and an Open chat deep link to /chat/<agentSlug>?session=<chatId>. Behavior worth knowing:
  • Presence-aware. Watching the reply stream in live means no notification — the bell only rings for replies you’d otherwise miss (closed tab, navigated away, network drop mid-run).
  • One item per (user, session). Repeated replies refresh the same unread bell item (timestamp + preview) instead of stacking; a new reply after you dismissed the old item re-notifies.
  • Group chats notify every participant except the person whose message triggered the run.
  • Previews are scrubbed through the same credential scrubber as logs, so a reply that quotes a token never leaks it into the inbox.
  • Opening the session (UI) or crewship chat read clears the item along with the badge.
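The notification rules above amount to an upsert keyed by (user, session). The following Python sketch models that under stated assumptions: the `Inbox` class, its method names, and the `live_viewers` set (users with a live subscription on the session channel) are hypothetical, the preview is cut at exactly 120 characters, and credential scrubbing is omitted.

```python
from typing import Dict, Iterable, List, Set, Tuple

PREVIEW_CHARS = 120


class Inbox:
    def __init__(self) -> None:
        # One item per (user, session); a refresh overwrites in place.
        self.items: Dict[Tuple[str, str], dict] = {}

    def on_reply(
        self,
        session_id: str,
        participants: Iterable[str],
        triggered_by: str,
        live_viewers: Set[str],
        reply_text: str,
        timestamp: int,
        is_group: bool = False,
    ) -> List[str]:
        """Record a reply; return the users who were notified."""
        notified = []
        for user in participants:
            if user in live_viewers:
                continue  # presence-aware: they watched it stream in
            if is_group and user == triggered_by:
                continue  # group chats skip the person who triggered the run
            self.items[(user, session_id)] = {
                "preview": reply_text[:PREVIEW_CHARS],
                "timestamp": timestamp,
            }
            notified.append(user)
        return notified

    def mark_read(self, user: str, session_id: str) -> None:
        """Opening the session or `crewship chat read` clears the item."""
        self.items.pop((user, session_id), None)
```

Because the key is the (user, session) pair, a second reply refreshes the existing item, and a reply after `mark_read` creates a fresh one.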

Faithful reload

Reopening a session now rehydrates the full structured turn — assistant text, reasoning/thinking, and tool call cards (input + result) — not just the final text. Previously thinking and tool activity were dropped on reload and only the last assistant message survived. Persistence is built on a normalized Part model over the adapter-neutral event stream, so the rehydrated transcript is identical across every CLI adapter (the one in use today and any added later); legacy JSONL transcripts remain readable. MVP limit: tool_result bodies are stored as a correlation id only.

How assistant turns render

  • Streamed text reveals at a smooth, constant rate. Network chunks arrive in bursts; the renderer buffers them and reveals characters at a steady pace that speeds up when a backlog builds (e.g. after a reconnect replay), so replies read as a calm flow instead of sentence-sized pops.
  • All reasoning of a turn renders as ONE chain-of-thought block. Models interleave thinking ↔ text ↔ tools freely (fast models produce a dozen short passes per reply); every pass merges into a single collapsible block at the position of the first one. While the model reasons, the header shows a live shimmering “Thinking… Ns” timer that accumulates across passes; when the turn finishes it collapses to “Thought for N seconds” (total). Transient progress lines never split the block — a streamed turn renders identically to the same turn reloaded from history.
  • Your question pins to the top on send. After you send a message, the conversation anchors your message at the top of the viewport and the reply streams in below it; once the reply outgrows the screen, normal follow-the-stream scrolling (with the scroll-to-bottom pill) resumes.
  • Consecutive tool calls collapse into one “Worked · N steps” disclosure with per-tool input/output. A single tool stays a plain card. Interactive tools (AskUserQuestion, TodoWrite, Task) are never grouped.
  • Bursts of internal progress / thinking chatter collapse into a single quiet live status line rather than scrolling the transcript.
  • Subagent output is indented under its delegation. Text, thinking, and tool calls from a delegation (via the Task tool) are tagged with parent_tool_use_id and rendered indented beneath the delegation in the transcript.
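The tool-call grouping rule in the list above can be illustrated with a small pass over a turn’s parts. This is a Python sketch of the rule only, not the frontend renderer: the tuple shapes and the function name `group_parts` are hypothetical, the tool names Read, Grep, and Bash are sample values, and it assumes an interactive tool ends the current run of consecutive calls.

```python
from typing import List, Tuple

# Interactive tools are never folded into a group.
INTERACTIVE = {"AskUserQuestion", "TodoWrite", "Task"}


def group_parts(parts: List[Tuple[str, str]]) -> list:
    """Collapse runs of 2+ consecutive tool calls into one disclosure.

    Each part is (kind, value) where kind is "text" or "tool".
    """
    out: list = []
    run: List[Tuple[str, str]] = []

    def flush() -> None:
        if len(run) >= 2:
            out.append(("group", f"Worked · {len(run)} steps", list(run)))
        else:
            out.extend(run)  # a single tool stays a plain card
        run.clear()

    for part in parts:
        kind, value = part
        if kind == "tool" and value not in INTERACTIVE:
            run.append(part)
        else:
            flush()
            out.append(part)
    flush()
    return out
```

Running the same function over a streamed turn and over the turn reloaded from history gives the same output, which is one way to keep both renderings identical.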

Composer

The bottom-of-page composer is a single textarea with three superpowers:
  • Slash palette — typing / at the start of a line opens a fuzzy-search palette over registered slash commands (e.g. /eval, /checkpoint, /skill <name>). Commands are registered server-side so a multi-line /eval my-suite --baseline last-week is parsed as a structured command rather than a chat message.
  • Model picker — dropdown next to the send button overrides the agent’s default llm_model for this turn only. Useful when an operator wants to spend Opus on a hard reasoning step inside an agent that normally runs Sonnet.
  • Attachments — paste images, drag files, or paste image data from the clipboard. Files are uploaded via POST /api/v1/agents/{agentId}/chats/{chatId}/attachments (handler ProxyHandler.AgentChatAttachment in internal/api/proxy_attachments.go) and stored on the server’s storage provider with a chat_attachments row pointing at them (migration v57). The agent’s container sees the file under /output/<agentSlug>/attachments/<chatId>/<filename>. 25 MB cap per upload.
Enter sends; Shift-Enter inserts a newline. Ctrl+K opens the palette unconditionally.
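For integrators who upload attachments programmatically, it can help to predict where a file will appear inside the agent’s container. The helper below is a hypothetical Python sketch: the function name is invented, and it assumes the 25 MB cap means 25 × 1024 × 1024 bytes, which the text does not specify.

```python
MAX_ATTACHMENT_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" read as MiB


def container_attachment_path(
    agent_slug: str, chat_id: str, filename: str, size_bytes: int
) -> str:
    """Return the in-container path for an upload, enforcing the size cap."""
    if size_bytes > MAX_ATTACHMENT_BYTES:
        raise ValueError(f"{filename} exceeds the 25 MB attachment cap")
    return f"/output/{agent_slug}/attachments/{chat_id}/{filename}"
```

For example, a 1 KB snapshot.txt uploaded to chat cht_abc123 on agent viktor maps to /output/viktor/attachments/cht_abc123/snapshot.txt.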

Edit-and-resend

Hovering over a user message reveals an “Edit” button. Editing the message rewrites that turn and truncates the active conversation: every message after it is removed from the active branch and preserved in chat_branches, where the branch picker can bring it back. The new edit is sent as a fresh user turn. This matches the ChatGPT/Claude UX and lets operators reroll an agent’s response without abandoning the whole session.
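Since the truncated tail is kept in chat_branches rather than deleted (see Key concepts), edit-and-resend behaves like moving a suffix of the transcript aside. The Python sketch below models that in memory; the `Chat` class and its fields are hypothetical, and storing the original user message together with the tail is an assumption about what a branch contains.

```python
from typing import List, Tuple

Turn = Tuple[str, str]  # (role, text)


class Chat:
    def __init__(self) -> None:
        self.active: List[Turn] = []          # the branch currently rendered
        self.branches: List[List[Turn]] = []  # preserved alternate timelines

    def send(self, role: str, text: str) -> None:
        self.active.append((role, text))

    def edit_and_resend(self, index: int, new_text: str) -> None:
        """Replace the user turn at `index`, preserving the old tail."""
        if self.active[index][0] != "user":
            raise ValueError("only user messages can be edited")
        self.branches.append(self.active[index:])  # nothing is deleted
        self.active = self.active[:index] + [("user", new_text)]
```

After an edit the active branch ends at the new user turn, and the earlier direction remains available in `branches` if it turns out to have been right.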

Message reactions

Each assistant message has a reactions strip (👍 👎 ❤️ 🤔 🚀). Reactions are workspace-shared — if two operators are in the same workspace, they each see the other’s reactions. Migration v57 added the message_reactions table.
| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/chats/{chatId}/messages/{messageId}/reactions | Aggregated counts: [{emoji, count, mine}]. |
| POST | /api/v1/chats/{chatId}/messages/{messageId}/reactions | Body: {emoji}. Idempotent under UNIQUE(chat, message, emoji, user). |
| DELETE | /api/v1/chats/{chatId}/messages/{messageId}/reactions/{emoji} | Emoji is a path segment. |
The List response is aggregated (one row per emoji, with a count and a mine boolean), not a per-user list. This lets the frontend render 👍 3 (you) from a single fetch. Reactions are surfaced to the agent on the next turn as part of the conversation context: the model sees that a previous response was 👎’d by an operator and can adjust. They are also surfaced in Quartermaster replays as a coarse human-feedback signal.
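The idempotent POST and the aggregated List shape can both be modelled with a set of (chat, message, emoji, user) tuples, which plays the role of the UNIQUE constraint. This is an illustrative Python sketch with hypothetical names (`ReactionStore`, `add`, `remove`, `list`); sorting the output by emoji is an assumption made only so the result is deterministic.

```python
from typing import Dict, List, Set, Tuple


class ReactionStore:
    def __init__(self) -> None:
        # A set mirrors UNIQUE(chat, message, emoji, user): re-adding is a no-op.
        self._rows: Set[Tuple[str, str, str, str]] = set()

    def add(self, chat: str, message: str, emoji: str, user: str) -> None:
        self._rows.add((chat, message, emoji, user))

    def remove(self, chat: str, message: str, emoji: str, user: str) -> None:
        self._rows.discard((chat, message, emoji, user))

    def list(self, chat: str, message: str, viewer: str) -> List[dict]:
        """Aggregate to one row per emoji: {emoji, count, mine}."""
        agg: Dict[str, dict] = {}
        for c, m, emoji, user in self._rows:
            if (c, m) != (chat, message):
                continue
            row = agg.setdefault(emoji, {"emoji": emoji, "count": 0, "mine": False})
            row["count"] += 1
            if user == viewer:
                row["mine"] = True
        return sorted(agg.values(), key=lambda r: r["emoji"])
```

A client retrying a POST after a dropped connection therefore cannot inflate the count, and the `mine` flag is computed per caller at read time.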

Right panel

The right side of the chat surface is a tab strip. Tabs are sticky: switching agents preserves which tab was open, and switching tabs does not retrigger the SSE/WS streams from the inactive tabs.
| Tab | Hotkey | Purpose |
| --- | --- | --- |
| Files | F1 | Live view of the agent’s working tree under /workspace. Click a path to preview; double-click to open in the in-page editor. |
| Artifacts | F2 | Generated outputs: markdown reports, screenshots, downloads. Pulled from output/ mount. |
| Diff | F3 | (When the agent is mid-edit) live diff against the workspace’s last commit. |
| Terminal | F4 | Live tmux attach to the agent’s session via the sidecar’s interactive exec endpoint. Read-only by default; OWNER/ADMIN can flip to read-write. |
The terminal tab reads from the sidecar’s exec stream and writes back through the same socket; latency is bounded by the WS round-trip. Closing the tab does not kill the agent’s tmux session — re-opening reattaches.

Presence and reconnect

The composer footer shows a small presence chip — “online”, “thinking”, “blocked”, or “offline” — driven by the Watch Roster. When the WS connection drops, a reconnect banner appears at the top of the chat (“Reconnecting…”), the composer disables sends, and queued user messages are flushed once the socket re-establishes. Network blips do not lose messages. Sends that were in-flight at disconnect time are retried with the same client-generated request_id; the server idempotently dedupes by request_id so a retry never duplicates a turn.
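The retry contract can be sketched as a client outbox plus a server-side lookup keyed by request_id. Everything named here (SendRequest, TurnStore, Outbox, the injected newId factory) is hypothetical and only illustrates the idea; the actual client and server code are not shown in this guide. In a browser the ID factory would typically be something like crypto.randomUUID().

```typescript
interface SendRequest {
  requestId: string;
  text: string;
}

interface Turn {
  turnId: number;
  requestId: string;
  text: string;
}

// Hypothetical stand-in for the server: remembers the turn created for each request_id.
class TurnStore {
  private readonly byRequestId = new Map<string, Turn>();
  private nextTurnId = 1;

  accept(req: SendRequest): Turn {
    const existing = this.byRequestId.get(req.requestId);
    if (existing !== undefined) return existing; // a retry maps to the same turn
    const turn: Turn = { turnId: this.nextTurnId++, requestId: req.requestId, text: req.text };
    this.byRequestId.set(req.requestId, turn);
    return turn;
  }

  get size(): number {
    return this.byRequestId.size;
  }
}

// Hypothetical client-side outbox: IDs are minted once per logical send, then reused on retry.
class Outbox {
  private readonly pending: SendRequest[] = [];
  private readonly newId: () => string;

  constructor(newId: () => string) {
    this.newId = newId;
  }

  enqueue(text: string): SendRequest {
    const req: SendRequest = { requestId: this.newId(), text };
    this.pending.push(req);
    return req;
  }

  // Try to deliver queued sends in order; stop at the first failure so ordering is preserved.
  flush(send: (req: SendRequest) => boolean): number {
    let delivered = 0;
    while (this.pending.length > 0) {
      const next = this.pending[0];
      if (next === undefined || !send(next)) break;
      this.pending.shift();
      delivered += 1;
    }
    return delivered;
  }
}
```

If the socket drops after the server stored a message but before the client saw the acknowledgement, the next flush resends the same request_id and the store returns the existing turn instead of creating a second one.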

URL state cheat-sheet

| URL fragment | Meaning |
| --- | --- |
| /chat/viktor | Most recent session for viktor. |
| /chat/viktor?session=cht_a1b2c3 | Specific session; deep-linkable. |
| /chat/viktor?session=cht_a1b2c3&panel=files | Open the Files tab on load. |
| /chat/viktor?session=cht_a1b2c3&panel=terminal | Open the Terminal tab on load. |
| /crews?crew=backend&agent=viktor | Canvas view, viktor selected. |
These query params are stable. Do not link to the Next.js internal route segments; those are subject to refactoring.
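An integrator generating deep links might wrap the chat URL forms above in a small helper. The chatUrl function below is a hypothetical sketch; it covers only the two panel values listed in the cheat-sheet and treats panel without session as unsupported, since that combination is not documented here.

```typescript
// Only the panel values that appear in the cheat-sheet; other tab names are not documented there.
type Panel = "files" | "terminal";

// Build a /chat/<agent> deep link from the documented query parameters.
function chatUrl(agent: string, session?: string, panel?: Panel): string {
  const path = "/chat/" + encodeURIComponent(agent);
  if (session === undefined) {
    if (panel !== undefined) {
      // Assumption: panel is only meaningful alongside an explicit session.
      throw new Error("panel without session is not a documented combination");
    }
    return path; // resolves to the most recent session for the agent
  }
  const query = ["session=" + encodeURIComponent(session)];
  if (panel !== undefined) query.push("panel=" + panel);
  return path + "?" + query.join("&");
}
```

For example, chatUrl("viktor", "cht_a1b2c3", "files") yields /chat/viktor?session=cht_a1b2c3&panel=files, matching the third row of the table.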

Common pitfalls

  • Edit-and-resend truncates the active branch — but the tail isn’t gone. Editing a user message moves every message after it into chat_branches. The UI shows a small “view earlier branches” affordance, but a hurried operator may believe they’ve lost the alternate response. Train the muscle of opening the branch picker before assuming work is unrecoverable.
  • Reactions are workspace-shared, not per-user-private. A 👎 from one operator is visible to every other operator in the same workspace, and is also surfaced to the agent on the next turn. Use reactions as a deliberate signal — not as a private bookmark. If you need a private flag, use the Inbox pin instead.
  • chats.origin silently NULLs unrecognised values. Anything outside the UI / CLI / WEBHOOK / CRON / AGENT allow-list gets stored as NULL to keep the UI chip safe from injection. A webhook handler that posts origin: "github" will succeed (200 OK) but show no chip — the silent failure is by design. Validate the enum on your side before posting.
  • Prisma TypeScript schema is intentionally stale on chats.origin. Prisma is types-only in this project; the v59 migration was applied to SQLite but the prisma/schema.prisma was deliberately not regenerated. TS callers must query origin via the Go-typed API surface, not via Prisma Client.
  • Attachment cap is 25 MB, with different error surfaces by client. Uploads larger than 25 MB return a 413 from ProxyHandler.AgentChatAttachment. The browser composer surfaces this as a toast, while a programmatic uploader (CLI, webhook) sees only the HTTP error — check status codes before assuming the file landed.
  • The Files right-panel shows mid-edit state, not committed state. It reads workspace_files, which tracks the agent’s live working tree. If you want to see what was actually committed by the agent, look at the Diff tab against HEAD, or run git log in the Terminal tab — the Files view is intentionally optimistic.
  • Closing the Terminal tab does NOT kill the agent’s tmux session. Re-opening reattaches to the same shell. This is a feature (long-running commands survive a tab close) but operators sometimes assume “close” means “stop”. To actually stop, kill the process inside the terminal first.
  • request_id must be stable per logical send, not per React render. Generate the UUID once when the user hits Send, not in the render function. A re-render-derived ID lets the server treat a retry as a new turn — defeating the dedupe contract.
  • Slash palette commands must be registered server-side. /foo in the composer where foo isn’t a registered command sends the literal text as a chat message — there’s no client-side “unknown command” error. Commands are served from GET /api/v1/slash-commands (handler internal/api/slash_commands_handler.go) rather than discovered by the UI.
  • Shallow router state evaporates on hard navigation. panel=files survives in-app tab switches and back/forward, but a hard reload that hits a 3xx redirect (e.g. auth expiry → login → bounce) loses everything past ?session=…. Bookmark with intent, not panic.
  • WS reconnect dedupes by request_id, not by content. Two genuinely different sends that collide on a re-used request_id will silently merge into one turn server-side. Use a fresh UUID per logical send.
  • Right-panel tabs don’t unsubscribe when inactive. Switching from Terminal to Files stops rendering the terminal stream, but the WS subscription stays live in the background. This is intentional (re-opening the tab is instant), but it means a workspace with many open chat tabs holds many parallel sidecar exec streams, which is heavy for the sidecar.
  • A turn never ends in silence — two explicit error states cover the edge cases. A run that finishes without producing any output (a swallowed safety refusal, prompt-budget pressure, or the agent CLI exiting cleanly with no stdout) shows “The agent returned no output — try again” as an error bubble, and the run is recorded as FAILED. A reply cut off by a hard server stop (OOM/SIGKILL/power loss) gains “The agent’s reply was interrupted by a server restart — try again” on the next boot. Both turns are persisted, so they appear on reload as well as live; use Regenerate (or resend) to retry.
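For the chats.origin pitfall above, a caller can check the enum before posting rather than discovering a missing chip later. This is a sketch under assumptions: the helper names are hypothetical, and exact-case matching is assumed because the guide only lists the upper-case values.

```typescript
// The allow-list named in this guide.
const CHAT_ORIGINS = ["UI", "CLI", "WEBHOOK", "CRON", "AGENT"] as const;
type ChatOrigin = (typeof CHAT_ORIGINS)[number];

// Assumption: matching is exact and case-sensitive.
function isChatOrigin(value: string): value is ChatOrigin {
  return (CHAT_ORIGINS as readonly string[]).indexOf(value) !== -1;
}

// Mirrors the documented server behaviour: unrecognised values end up stored as NULL.
function storedOrigin(value: string): ChatOrigin | null {
  return isChatOrigin(value) ? value : null;
}

// Client-side guard: fail loudly instead of letting the server silently drop the value.
function requireChatOrigin(value: string): ChatOrigin {
  if (!isChatOrigin(value)) {
    throw new Error('origin "' + value + '" is not one of ' + CHAT_ORIGINS.join(", "));
  }
  return value;
}
```

A webhook handler that calls requireChatOrigin("github") fails before the request is sent, whereas storedOrigin("github") shows what the server would have persisted: null, and therefore no chip.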

Group chats (multiple humans + the agent)

A chat starts as a private 1:1 conversation between you and the agent. Add a second person and it becomes a group chat: several teammates and the agent in one thread.

Turn-taking. In a group chat the agent stays silent unless it is @mentioned by slug, so people can talk among themselves without the bot replying to every line. Mention it (e.g. @riley can you check the logs) and it runs as usual. In a private 1:1 chat every message runs the agent, unchanged.

Managing participants (CLI). Adding the first participant flips the chat to group visibility. You can only add people who already belong to the workspace. Only the chat owner or a workspace OWNER/ADMIN can add or remove participants, and the chat owner cannot be removed.
crewship chat participants add <chat-id> <user-id>                # add (promotes to group)
crewship chat participants add <chat-id> <user-id> --role owner   # add as an owner
crewship chat participants list <chat-id>                         # who's in the chat
crewship chat participants remove <chat-id> <user-id>             # remove
The --role flag accepts member (default) or owner; it maps to the role field of the POST /api/v1/chats/{chatId}/participants body, and the same values appear in GET responses. DELETE takes no role. The label is informational: it does not by itself grant manage-participants rights (see Group chats).

Attribution & live sync. Each human message is broadcast to the other participants in real time and labelled with its author (avatar + name); your own messages render as before. A Group · N badge in the header shows the participant count.

MVP scope / gotchas.
  • The agent runs under the chat’s own credentials, the same for every participant — there’s no per-user data grounding yet. Don’t share a group chat with people who shouldn’t see what the agent can reach.
  • Mention matching is by slug on a word boundary (case-insensitive), so email@riley.com and @rileybot do not mention agent riley.
  • @mention autocomplete is available in the composer (wired in PR #702), so you can pick the agent from a dropdown instead of typing the slug by hand. Only presence / typing indicators remain unshipped; live message sync works today.
  • Adding a participant does not introduce per-message visibility controls — anyone who can see the chat sees the whole transcript.
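The mention rule in the gotchas above (slug match on a word boundary, case-insensitive) can be sketched as follows. The helper names and the set of characters treated as part of a slug (letters, digits, underscore, hyphen) are assumptions for illustration; the server's actual matcher is not shown in this guide.

```typescript
// Assumption: slugs consist of letters, digits, "_" and "-".
function isSlugChar(ch: string): boolean {
  return /^[a-z0-9_-]$/i.test(ch);
}

// True when text contains @slug with no slug character immediately before the "@" or after the slug.
function mentionsAgent(text: string, slug: string): boolean {
  const haystack = text.toLowerCase();
  const needle = "@" + slug.toLowerCase();
  let from = 0;
  while (true) {
    const at = haystack.indexOf(needle, from);
    if (at === -1) return false;
    const before = at === 0 ? "" : haystack.charAt(at - 1);
    const after = haystack.charAt(at + needle.length); // "" at end of string
    const startsClean = before === "" || !isSlugChar(before);
    const endsClean = after === "" || !isSlugChar(after);
    if (startsClean && endsClean) return true;
    from = at + 1; // keep scanning for a later, cleaner occurrence
  }
}

// Turn-taking: a 1:1 chat always runs the agent; a group chat runs it only when mentioned.
function shouldRunAgent(isGroupChat: boolean, text: string, slug: string): boolean {
  return !isGroupChat || mentionsAgent(text, slug);
}
```

With this rule, "@riley can you check the logs" and "ping @Riley, please" both trigger agent riley in a group chat, while "email@riley.com" and "@rileybot" do not.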
Related

  • Activity: the live canvas where the same agent’s mission runs show up; the chat is one operator surface, Activity is the other.
  • Watch Roster: the source of truth for the composer presence chip (online / thinking / blocked / offline).
  • Quartermaster: replays consume message reactions as a coarse human-feedback signal (👍 / 👎 surfaces back to evals).
  • Skills: slash-palette commands like /skill <name> route into the skills registry rather than being treated as chat messages.
  • Hooks: server-side handlers that fire on chat.message.created / chat.reaction.added and can drive automation off chat events.
  • Files and output: the workspace_files blob index that the Files right-panel reads from; same view the agent’s container sees.
  • Keeper: credential injection for tools the agent invokes from inside a chat; keeps secrets out of message bodies.
  • User preferences API: composer settings (default model, slash-palette behaviour).
  • Conversations API: message reactions, attachments, crew messaging.
  • Webhooks: programmatic chat-session creation (origin=WEBHOOK).