Orchestration
Crewship’s orchestration system manages multi-agent missions through the MissionEngine (internal/orchestrator/mission.go). It handles task scheduling, dependency resolution, failure recovery, and cross-crew coordination.
“Orchestration” here means the engine subsystem in internal/orchestrator/, not a navigable page. After the Plan/Run/Build/System IA refactor, the user-facing surfaces are split: Routines for reusable recipes, Issues for the work-item tracker, Inbox for your actionable feed, and Activity for the live trace canvas. The legacy /orchestration route now soft-redirects to /activity.
Mission Lifecycle
- PLANNING — Mission created, tasks defined (or waiting for Lead to plan)
- IN_PROGRESS — Tasks being scheduled and executed
- REVIEW — All tasks finished (none failed). The mission enters review before final completion, allowing humans to inspect results
- COMPLETED — Mission accepted after review
- FAILED — A task failed and could not recover, or deadlock/timeout detected
- CANCELLED — Manually stopped by user or system
The REVIEW state is inserted between IN_PROGRESS and COMPLETED. When all tasks reach a terminal state (COMPLETED, FAILED, or SKIPPED) and none have failed, the mission transitions to REVIEW rather than directly to COMPLETED. If any task failed, the mission transitions to FAILED instead.
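The transition rule above can be sketched as a small function. This is a minimal illustration, not the engine's code: the names isTerminal and nextMissionStatus are hypothetical, and retries and timeouts are ignored.

```go
package main

// isTerminal reports whether a task status can no longer change.
// The three terminal statuses are the ones named in this section.
func isTerminal(status string) bool {
	return status == "COMPLETED" || status == "FAILED" || status == "SKIPPED"
}

// nextMissionStatus derives the mission status from its task statuses.
// Hypothetical helper: a mission with no tasks is treated as still PLANNING.
func nextMissionStatus(taskStatuses []string) string {
	if len(taskStatuses) == 0 {
		return "PLANNING"
	}
	anyFailed := false
	for _, s := range taskStatuses {
		if !isTerminal(s) {
			// At least one task can still make progress.
			return "IN_PROGRESS"
		}
		if s == "FAILED" {
			anyFailed = true
		}
	}
	if anyFailed {
		return "FAILED"
	}
	// All tasks terminal and none failed: hold for human review.
	return "REVIEW"
}
```

Under these assumptions, a mission whose tasks are COMPLETED and SKIPPED moves to REVIEW, while a single FAILED task among otherwise finished tasks moves it to FAILED.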
The Mission Engine
The MissionEngine is the central orchestrator. Key parameters:
| Parameter | Value | Source |
|---|---|---|
| Polling interval | 3 seconds | time.NewTicker(3 * time.Second) |
| Circuit breaker threshold | 3 consecutive failures | circuitBreakerThreshold = 3 |
| Mission timeout | 2 hours | missionTimeoutDefault = 2 * time.Hour |
| Max result summary | 8,000 chars | maxResultSummaryLen = 8000 |
| Max brief total | 32,000 bytes | maxBriefTotalLen = 32000 |
| Per-dependency output truncation | 4,000 chars | maxDepOutputLen = 4000 |
Mission Loop
The runMissionLoop function runs as a goroutine for each active mission. Every 3 seconds it:
Task States
| State | Description |
|---|---|
| PENDING | Ready to be scheduled |
| RUNNING | Currently being executed by an agent |
| COMPLETED | Finished successfully |
| FAILED | Execution failed |
| BLOCKED | Waiting for dependent tasks to complete |
| AWAITING_APPROVAL | Task completed but held for human review before proceeding |
| SKIPPED | Task was intentionally skipped (counts as terminal, does not cause mission failure) |
SKIPPED tasks are treated as terminal alongside COMPLETED and FAILED when checking mission completion. A skipped task does not block downstream dependencies and does not cause mission failure.
Token Budget Calculation
The orchestrator allocates system prompt space using a token budget system defined in internal/tokenutil:
| Constant | Value | Description |
|---|---|---|
| MaxSystemPromptTokens | 32,000 | Total conservative budget for the system prompt |
| ConversationBudgetPct | 60% | Percentage of remaining budget for conversation history |
| MemoryBudgetPct | 40% | Percentage of remaining budget for agent memory |
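As a worked example of the split, the sketch below divides whatever remains of the 32,000-token budget. It assumes that "remaining" means the budget left after the fixed parts of the system prompt, and it represents the percentages as integers; the function splitBudget and its fixedTokens parameter are hypothetical and may not match internal/tokenutil.

```go
package main

// Values from the table above. Their Go types in internal/tokenutil are
// not stated in this document, so integer percentages are an assumption.
const (
	MaxSystemPromptTokens = 32000
	ConversationBudgetPct = 60
	MemoryBudgetPct       = 40
)

// splitBudget divides the tokens left after fixedTokens (hypothetical:
// the fixed portion of the system prompt) between conversation history
// and agent memory.
func splitBudget(fixedTokens int) (conversation, memory int) {
	remaining := MaxSystemPromptTokens - fixedTokens
	if remaining < 0 {
		remaining = 0 // the fixed portion already exceeds the budget
	}
	conversation = remaining * ConversationBudgetPct / 100
	memory = remaining * MemoryBudgetPct / 100
	return conversation, memory
}
```

With 2,000 tokens of fixed prompt, 30,000 remain: 18,000 for conversation history and 12,000 for memory.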
Mission Brief Construction
When an agent is dispatched for a mission task, the buildMissionBrief function constructs a rich context prompt with five sections:
1. IMPORTANT Preamble
Only included when dependency outputs exist. Instructs the agent not to ask clarifying questions.
2. [MISSION]
Mission title, goal, and a DAG overview listing all tasks with their status markers (+ COMPLETED, > IN_PROGRESS, x FAILED, PENDING/BLOCKED).
3. [INPUT FROM PREVIOUS TASKS]
Outputs from completed dependency tasks, injected before the assignment so agents read context first. When a task produced a structured handoff block, only the handoff summary, artifacts, and confidence are included (more concise). Otherwise the full result summary is included, truncated to 4,000 characters per dependency.
4. [YOUR ASSIGNMENT]
The specific task title, description, and iteration number (if this is a retry).
5. [OUTPUT FORMAT]
Structured handoff instructions requiring the agent to produce a ---HANDOFF--- block with summary, confidence, and artifacts.
The total brief is capped at 32KB (maxBriefTotalLen). If exceeded, the brief is truncated with a note.
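One way to apply such a cap is sketched below. The wording of the note and whether the note counts toward the 32,000 bytes are assumptions, and capBrief is a hypothetical name rather than the engine's function.

```go
package main

import "unicode/utf8"

// maxBriefTotalLen matches the value in the parameters table.
const maxBriefTotalLen = 32000

// truncationNote is a hypothetical note; the real wording is not
// given in this document.
const truncationNote = "\n[brief truncated]"

// capBrief returns brief unchanged when it fits, otherwise cuts it so
// that the result, including the note, is at most maxBriefTotalLen bytes.
func capBrief(brief string) string {
	if len(brief) <= maxBriefTotalLen {
		return brief
	}
	cut := maxBriefTotalLen - len(truncationNote)
	// Back up to a rune boundary so a multi-byte UTF-8 character is
	// never split in half.
	for cut > 0 && !utf8.RuneStart(brief[cut]) {
		cut--
	}
	return brief[:cut] + truncationNote
}
```

Cutting on a rune boundary matters because the limit is in bytes, and agent output can contain non-ASCII text.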
Lead Planning Phase
When a mission starts with 0 tasks, the engine dispatches the Lead agent to create a plan. The Lead uses its crew context to understand available agents and creates tasks via the sidecar /mission/create endpoint.
LeadPlanning Flag
The DispatchRequest includes a LeadPlanning flag that tells the API layer to dispatch the agent as a LEAD with sidecar access. This is essential because Lead agents need access to the mission management API (/mission/create, /mission/{id}) to define tasks, while regular AGENT tasks skip the sidecar for security.
TOCTOU Prevention
A time-of-check-to-time-of-use race is prevented by inserting a sentinel missionState into the active map before loading the mission from the database. The planningDispatched flag on the mission state prevents re-dispatching the Lead if it is still working. This flag is only set to true after dispatchLeadPlanning succeeds.
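The sentinel idea can be sketched as follows. Only the missionState name comes from the text above; the engine type, its fields, and the claim and start methods are hypothetical, and the load callback stands in for the database read.

```go
package main

import "sync"

// missionState is a simplified stand-in for the engine's per-mission state.
type missionState struct {
	loaded bool // false while the entry is only a sentinel
}

// engine is a hypothetical container for the active-mission map.
type engine struct {
	mu     sync.Mutex
	active map[string]*missionState
}

// claim inserts a sentinel for missionID and reports whether this caller
// won the race. The check and the insert happen under one lock, so two
// concurrent callers cannot both pass the check.
func (e *engine) claim(missionID string) bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	if _, exists := e.active[missionID]; exists {
		return false
	}
	e.active[missionID] = &missionState{}
	return true
}

// start claims the mission first and only then runs the slow load.
// If the load fails, the sentinel is removed so a later start can retry.
func (e *engine) start(missionID string, load func(string) error) (bool, error) {
	if !e.claim(missionID) {
		return false, nil // another caller is already handling it
	}
	if err := load(missionID); err != nil {
		e.mu.Lock()
		delete(e.active, missionID)
		e.mu.Unlock()
		return false, err
	}
	e.mu.Lock()
	e.active[missionID].loaded = true
	e.mu.Unlock()
	return true, nil
}
```

Claiming before loading is what closes the window: without the sentinel, two callers could both see the mission as inactive while the first one is still reading from the database.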
Scaling Rules
The Lead agent follows complexity-based scaling rules injected via the system prompt:
| Complexity | Agents | Tool Calls | Duration | Tokens |
|---|---|---|---|---|
| SIMPLE | 1 | 3-10 | ~5 min | ~10K |
| MEDIUM | 1-2 | 10-15 | ~15 min | ~50K |
| COMPLEX | 2-4 | 15+ | ~30 min | ~100K |
Workflow Templates
Four built-in workflow templates are defined in internal/orchestrator/workflow.go:
- Sequential: tasks execute one after another in order.
- Parallel
- Dev-Test Loop
- Pipeline
The Ralph Loop Pattern
The LoopController (internal/orchestrator/loop.go) manages task retry logic:
- When a task fails and has max_iterations > 1, the controller increments the iteration counter and resets the task to PENDING
- For loop-back patterns (dev-test-loop), when a downstream task fails, the upstream task is reset to restart the cycle
- Previous failure context from the progress log is injected so the agent learns from mistakes
The ShouldRetry method checks if a failed task has remaining iterations. If yes, it resets the task:
- Status back to PENDING
- Iteration counter incremented
- All execution fields cleared (assignment_id, result_summary, error_message, started_at, completed_at, duration_ms)
The RetryLoopBack method handles the upstream reset pattern: when a downstream task (e.g., “test”) fails, it checks the dependency chain. If an upstream task (e.g., “develop”) has remaining iterations, that task is reset to PENDING and the failed downstream task is set to BLOCKED, ready to run again once the upstream completes.
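The reset performed on retry can be sketched as below. This is an illustration under assumptions: the task struct is hypothetical, the iteration counter is assumed to start at 1, and the timing fields (started_at, completed_at, duration_ms) are left out for brevity.

```go
package main

// task holds only the fields needed for this sketch; the real schema
// has more columns.
type task struct {
	Status        string
	Iteration     int // assumed to start at 1 for the first attempt
	MaxIterations int
	AssignmentID  string
	ResultSummary string
	ErrorMessage  string
}

// shouldRetry resets a failed task for another attempt when iterations
// remain, and reports whether it did so.
func shouldRetry(t *task) bool {
	if t.Status != "FAILED" || t.Iteration >= t.MaxIterations {
		return false
	}
	t.Status = "PENDING"
	t.Iteration++
	// Clear the execution fields so the next run starts clean.
	t.AssignmentID, t.ResultSummary, t.ErrorMessage = "", "", ""
	return true
}
```

With MaxIterations set to 3, a task gets three attempts in total; with MaxIterations set to 1, a failure is final.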
Task Approval Gate
The checkApprovalGate function determines whether a completed task should be held for human review. The gate evaluates three inputs:
- Explicit flag: if approval_required = 1 on the task, it is always held
- Confidence threshold: the agent’s self-reported confidence from the handoff block
- Escalation config: per-crew configuration with tiered thresholds
Escalation Config
Each crew can define an escalation_config JSON object with three thresholds:
| Threshold | Behavior |
|---|---|
| auto_approve_threshold | Confidence at or above this value: auto-approve (task goes to COMPLETED) |
| notify_threshold | Confidence below this value: send a confidence.low WebSocket notification |
| require_approval_below | Confidence below this value: hold the task in AWAITING_APPROVAL |
- If confidence >= auto_approve_threshold, return COMPLETED
- If approval_required is explicitly set, return AWAITING_APPROVAL
- If no config or no confidence data, return COMPLETED
- If confidence < require_approval_below, return AWAITING_APPROVAL
- If confidence < notify_threshold, send notification but return COMPLETED
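The checks can be sketched as a pure function. This sketch applies them in the order they are listed above, which is an assumption: under this reading a confidence at or above auto_approve_threshold is approved before the explicit flag is consulted, and the real checkApprovalGate may order the first two checks differently. The gate function, the escalationConfig struct, and the returned notify flag are hypothetical.

```go
package main

// escalationConfig mirrors the three thresholds in the table above.
type escalationConfig struct {
	AutoApproveThreshold float64
	NotifyThreshold      float64
	RequireApprovalBelow float64
}

// gate returns the resulting task status and whether a confidence.low
// notification should be sent. confidence and cfg may be nil when the
// handoff carried no confidence or the crew has no escalation config.
func gate(approvalRequired bool, confidence *float64, cfg *escalationConfig) (status string, notify bool) {
	if cfg != nil && confidence != nil && *confidence >= cfg.AutoApproveThreshold {
		return "COMPLETED", false
	}
	if approvalRequired {
		return "AWAITING_APPROVAL", false
	}
	if cfg == nil || confidence == nil {
		return "COMPLETED", false
	}
	if *confidence < cfg.RequireApprovalBelow {
		return "AWAITING_APPROVAL", false
	}
	if *confidence < cfg.NotifyThreshold {
		// Low but acceptable confidence: complete, and tell the dashboard.
		return "COMPLETED", true
	}
	return "COMPLETED", false
}
```

For example, with thresholds of 0.9, 0.7, and 0.5, a confidence of 0.6 completes the task with a notification, while 0.3 holds it in AWAITING_APPROVAL.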
Approving or Rejecting Tasks
The ApproveTask method transitions a task from AWAITING_APPROVAL:
- Approved: task moves to COMPLETED, dependent BLOCKED tasks are unblocked
- Rejected: task moves to FAILED, all downstream dependent tasks are recursively failed with reason “upstream task rejected”
ApproveTask takes a userID for the audit trail. The approval status (APPROVED or REJECTED), approver, timestamp, and evaluation notes are persisted on the task.
When a task is held in AWAITING_APPROVAL, the mission engine sends an approval.required WebSocket message to the workspace so dashboards can display a badge or notification.
Circular Dependency Detection
The ValidateDAG method checks all mission tasks for:
- References to nonexistent task IDs: any depends_on entry that does not match an existing task ID causes validation to fail
- Circular dependencies: detected using Kahn’s algorithm (topological sort)
Kahn’s Algorithm
The implementation builds an adjacency list and computes in-degrees for each task. When a cycle is found, validation fails with "circular dependency detected: N tasks involved in cycle".
DAG validation runs before the mission loop begins scheduling, preventing tasks from being dispatched into an unresolvable dependency graph.
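A minimal sketch of this validation using Kahn's algorithm is shown below. The validateDAG function and its map-based input are hypothetical and are not the signature of ValidateDAG; the cycle error text follows the message quoted above, while the nonexistent-ID error text is invented for illustration.

```go
package main

import "fmt"

// validateDAG takes a map from task ID to the IDs it depends on.
// It rejects references to unknown tasks and dependency cycles.
func validateDAG(deps map[string][]string) error {
	inDegree := make(map[string]int, len(deps))
	dependents := make(map[string][]string, len(deps))
	for id, ds := range deps {
		inDegree[id] = len(ds)
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				// Hypothetical message for the nonexistent-ID case.
				return fmt.Errorf("task %q depends on nonexistent task %q", id, d)
			}
			dependents[d] = append(dependents[d], id)
		}
	}
	// Start with every task that has no dependencies.
	var queue []string
	for id, n := range inDegree {
		if n == 0 {
			queue = append(queue, id)
		}
	}
	processed := 0
	for len(queue) > 0 {
		id := queue[0]
		queue = queue[1:]
		processed++
		for _, next := range dependents[id] {
			inDegree[next]--
			if inDegree[next] == 0 {
				queue = append(queue, next)
			}
		}
	}
	// Tasks never reaching in-degree zero sit on a cycle or behind one.
	if processed != len(deps) {
		return fmt.Errorf("circular dependency detected: %d tasks involved in cycle", len(deps)-processed)
	}
	return nil
}
```

In this sketch the count in the error covers every task that could not be ordered, which includes tasks that merely depend on a cycle as well as the tasks on it.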
Deadlock Detection
The mission engine detects deadlocks when all remaining tasks are BLOCKED with no task currently IN_PROGRESS, PENDING, or AWAITING_APPROVAL. The detection logic:
- If any task is PENDING, IN_PROGRESS, or AWAITING_APPROVAL: not deadlocked (progress is still possible)
- COMPLETED, SKIPPED, and FAILED tasks are terminal: they cannot contribute to progress
- If all non-terminal tasks are BLOCKED: deadlock confirmed
- The mission is marked as FAILED
- A mission_deadlock progress event is emitted
- All AWAITING_APPROVAL tasks are failed with “mission timed out”
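The detection rule itself reduces to a short scan over task statuses. The sketch below is an assumption-laden illustration: isDeadlocked is a hypothetical name, and RUNNING is treated as active alongside IN_PROGRESS because this page uses both names for an executing task.

```go
package main

// isDeadlocked reports whether the only non-terminal tasks left are
// BLOCKED. Terminal statuses (COMPLETED, SKIPPED, FAILED) are ignored.
func isDeadlocked(statuses []string) bool {
	blocked := 0
	for _, s := range statuses {
		switch s {
		case "PENDING", "IN_PROGRESS", "RUNNING", "AWAITING_APPROVAL":
			// Something can still move, so progress is possible.
			return false
		case "BLOCKED":
			blocked++
		}
	}
	// A mission with no BLOCKED tasks is finished, not deadlocked.
	return blocked > 0
}
```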
Circuit Breaker
The circuit breaker tracks consecutive failures per agent. After 3 consecutive failures (circuitBreakerThreshold), the agent is considered unhealthy and tasks are not dispatched to it.
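A per-agent consecutive-failure counter can be sketched as follows. The breaker type and its methods are hypothetical. The sketch assumes that a success resets the count, which is what "consecutive" implies; how the real engine re-enables an unhealthy agent is not described here.

```go
package main

import "sync"

// circuitBreakerThreshold matches the value in the parameters table.
const circuitBreakerThreshold = 3

// breaker counts consecutive failures per agent ID.
type breaker struct {
	mu       sync.Mutex
	failures map[string]int
}

func newBreaker() *breaker {
	return &breaker{failures: make(map[string]int)}
}

// record notes the outcome of one run. A success clears the streak.
func (b *breaker) record(agentID string, success bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if success {
		delete(b.failures, agentID)
		return
	}
	b.failures[agentID]++
}

// healthy reports whether tasks may still be dispatched to the agent.
func (b *breaker) healthy(agentID string) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.failures[agentID] < circuitBreakerThreshold
}
```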
CooldownManager
The CooldownManager (internal/orchestrator/failover.go) handles rate limit detection and credential cooldown. When an agent run fails due to a rate limit, the associated credential is placed in a cooldown period to avoid hammering the provider.
Rate Limit Detection
The IsRateLimitError function checks stderr output against known patterns:
| Pattern | Example |
|---|---|
| rate limit | "Rate limit exceeded" |
| rate_limit | "rate_limit_error" |
| 429 | "HTTP 429" |
| too many requests | "Too many requests" |
| quota exceeded | "Quota exceeded for model" |
| insufficient_quota | "insufficient_quota" |
| billing_hard_limit | "billing_hard_limit_reached" |
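A pattern check like this can be sketched with a case-insensitive substring match. The lowercase name isRateLimitError marks this as an illustration rather than the real IsRateLimitError, whose signature is not shown here; case-insensitive matching is an assumption drawn from the examples in the table.

```go
package main

import "strings"

// rateLimitPatterns lists the patterns from the table, in lowercase.
var rateLimitPatterns = []string{
	"rate limit",
	"rate_limit",
	"429",
	"too many requests",
	"quota exceeded",
	"insufficient_quota",
	"billing_hard_limit",
}

// isRateLimitError reports whether stderr contains any known pattern.
func isRateLimitError(stderr string) bool {
	lower := strings.ToLower(stderr)
	for _, p := range rateLimitPatterns {
		if strings.Contains(lower, p) {
			return true
		}
	}
	return false
}
```

A plain substring match on "429" is broad: it would also match unrelated output that happens to contain those digits, so a stricter check may be preferable in practice.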
Cooldown Behavior
When a rate limit is detected:
- MarkCooldown(credentialID, 5*time.Minute) places the credential in a 5-minute cooldown
- IsInCooldown(credentialID) returns true during this period, causing the orchestrator to skip that credential
- ClearExpired() removes stale entries
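The three methods can be sketched over a map of expiry times. The method names come from the text above; the cooldownManager type, its field, and the constructor are hypothetical, and the real CooldownManager may store its state differently.

```go
package main

import (
	"sync"
	"time"
)

// cooldownManager maps a credential ID to the time its cooldown ends.
type cooldownManager struct {
	mu    sync.Mutex
	until map[string]time.Time
}

func newCooldownManager() *cooldownManager {
	return &cooldownManager{until: make(map[string]time.Time)}
}

// MarkCooldown puts the credential in cooldown for duration d.
func (c *cooldownManager) MarkCooldown(credentialID string, d time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.until[credentialID] = time.Now().Add(d)
}

// IsInCooldown reports whether the credential should be skipped.
func (c *cooldownManager) IsInCooldown(credentialID string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	end, ok := c.until[credentialID]
	return ok && time.Now().Before(end)
}

// ClearExpired drops entries whose cooldown has already ended.
func (c *cooldownManager) ClearExpired() {
	c.mu.Lock()
	defer c.mu.Unlock()
	now := time.Now()
	for id, end := range c.until {
		if !now.Before(end) {
			delete(c.until, id)
		}
	}
}
```

Because IsInCooldown compares against the current time, expired entries are already ignored; ClearExpired only keeps the map from growing.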
Progress Logging
The ProgressWriter (internal/orchestrator/progress.go) appends structured JSONL events to a per-mission progress file at data/crews/{crewSlug}/missions/{traceID}/progress.jsonl.
Event Types
| Event | Fields | When |
|---|---|---|
| mission_started | mission_id | Mission loop begins |
| task_started | task_id, agent, title | Task dispatched to agent |
| task_COMPLETED | task_id, agent, summary | Task finished successfully |
| task_FAILED | task_id, agent, error | Task execution failed |
| task_retry | task_id, agent | LoopController resets a task for retry |
| mission_deadlock | mission_id | All tasks BLOCKED with no progress |
| mission_REVIEW | mission_id | All tasks terminal, mission entering review |
| mission_timeout | mission_id | Mission exceeded 2-hour timeout |
The BuildProgressContext method formats the JSONL into a human-readable text block suitable for injection into an agent’s system prompt.
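Appending one JSON object per line can be sketched as below. The field names follow the table above, but the "event" key, the progressEvent struct, and both function names are assumptions; the real ProgressWriter may use a different schema and may add fields such as a timestamp.

```go
package main

import (
	"encoding/json"
	"os"
)

// progressEvent is a hypothetical shape for one JSONL record.
// Empty fields are omitted so each event carries only what it needs.
type progressEvent struct {
	Event     string `json:"event"`
	MissionID string `json:"mission_id,omitempty"`
	TaskID    string `json:"task_id,omitempty"`
	Agent     string `json:"agent,omitempty"`
	Title     string `json:"title,omitempty"`
	Summary   string `json:"summary,omitempty"`
	Error     string `json:"error,omitempty"`
}

// encodeEvent renders the event as a single newline-terminated JSON line.
func encodeEvent(ev progressEvent) ([]byte, error) {
	line, err := json.Marshal(ev)
	if err != nil {
		return nil, err
	}
	return append(line, '\n'), nil
}

// appendEvent appends the encoded event to the file at path,
// creating the file if it does not exist yet.
func appendEvent(path string, ev progressEvent) error {
	line, err := encodeEvent(ev)
	if err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write(line)
	return err
}
```

Opening the file in append mode for each event keeps the log readable line by line even if the process stops partway through a mission.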
Structured Handoff
Agents produce structured handoff data at the end of tasks. The parseHandoff function extracts this structure from agent output. Both summary and confidence are required for a valid handoff; partial blocks are treated as unparsed.
The confidence value (low, medium, high) feeds into the approval gate. When parsed as a float (via escalation config), it determines whether the task auto-approves or requires human review.
Cross-Crew Missions
Mission tasks can reference agents from connected crews. The system auto-routes assignments to the correct crew container. Crew connections must be established by workspace admins before use.
Crew-to-crew handoff with critique exchange (e.g. backend crew hands a draft to a testing crew for review) is on the v0.2 roadmap.
Sidecar API for Orchestration
Lead agents interact with the orchestration system through the sidecar proxy at localhost:9119:
| Endpoint | Method | Description |
|---|---|---|
| /assign | POST | Assign a task to a crew member |
| /results/{id} | GET | Poll for assignment result |
| /query | POST | Ask a crew member a quick question |
| /standup | GET | Get crew standup summary |
| /escalate | POST | Escalate an issue to humans |
| /mission/create | POST | Create a multi-task mission |
| /mission/{id} | GET | Check mission status |
| /mission/{id}/start | POST | Start a mission |
| /mission/templates | GET | List available workflow templates |
What’s Next
- Keeper — persistent agent memory across sessions
- Scheduling — cron-based automated agent runs