Documentation Index
Fetch the complete documentation index at: https://docs.crewship.ai/llms.txt
Use this file to discover all available pages before exploring further.
Watch Roster
The Watch Roster is the live presence board the UI shows before a lead dispatches work. It tracks whether every agent is reachable, busy, blocked on an approval, or offline. Transitions emitagent.status_change into the Crew Journal so the full history replays alongside every other event.
States
agent_status.status CHECK constraint in migration 52. Changing them requires a migration.
Upsert semantics
- Idempotent on same-status. A second
Upsert(busy -> busy)refreshessincebut does NOT emit a journal entry — the journal is a transition log, not a heartbeat. - Emits on transition.
online -> busy -> blocked -> online -> offlineproduces fouragent.status_changeentries, each with summary"agent <id>: <prev> -> <new>". - MissionID is not persisted on the roster row (a single agent can legitimately be between missions). It IS threaded into the journal entry so the per-mission timeline doesn’t drop transitions.
Sweeper
presence.SweepOffline(ctx, db, j, 5*time.Minute) flips rows whose since is older than the threshold (default 5 min) to offline and emits the transition. The server wires this on a 60s ticker so idle agents don’t linger as “online” forever.
Payload on the timeout emit: {"reason": "idle_timeout"}.
Wiring
The orchestrator tracks presence via thepresenceAdapter in internal/server/orchestrator_adapters.go:
agent_status row — so /crows-nest and /api/v1/presence/roster always returned empty. The adapter now calls presence.Upsert which atomically writes the row and emits the matching journal entry.
Presence updates are best-effort: a DB blip logs a warning but does not abort an agent run.
Read endpoints
GET /api/v1/presence/roster[?crew_id=...]— workspace-scoped list of roster rows. Optional crew filter narrows further.
CLI
--crew expects the crew ID today; slug resolution is TBD. See crewship presence.
Journal entries
agent.status_change— one per real transition. Summary:"agent <id>: <prev> -> <new>"(or"agent <id>: <new>"on first-ever transition).
status, prev, and any details dict (e.g. current_task_id, blocked_reason).
Gotchas
sinceis wall-clock. If the server clock jumps, sweeper decisions can look weird. This is not load-bearing enough to warrant monotonic tracking.- Details are typed loosely.
details map[string]anyis serialised as JSON; callers are responsible for schema. Common fields:current_task_id,blocked_reason,reason. - Offline is eventual. An agent that crashes hard is not flagged offline until the sweeper next ticks (up to 60s) AND 5 min have passed since the last heartbeat. Don’t rely on
offlinefor liveness probes — query the container state directly.
Related
- Crew Journal —
agent.status_change. crewship presence, Presence API.