Documentation Index
Fetch the complete documentation index at: https://docs.crewship.ai/llms.txt
Use this file to discover all available pages before exploring further.
crewship routine
Manage workspace routines — AI-authored, repeatable workflow recipes that any crew can invoke.
crewship routine <subcommand> [flags]
Alias: crewship pipeline (back-compat — every routine X invocation also works as pipeline X)
See Routines guide for the conceptual overview.
Subcommands
| Group | Command | Purpose |
|---|
| Core | list | List workspace routines |
| Core | get <slug> | Show routine detail with full DSL |
| Core | save | Save a new routine from a JSON file |
| Core | run <slug> | Invoke a routine against the live execution tier |
| Core | dry-run <slug> | Preview what a routine would do without invoking agents |
| Core | delete <slug> | Soft-delete a routine |
| Diagnostics | doctor <slug> | Preflight checklist — catches blind alleys before run |
| Diagnostics | bench <slug> | Variance characterisation — N runs, pass-rate + cost/latency stats |
| Runs | runs <slug> | List recent runs from journal |
| Runs | records <slug> | List runs from pipeline_runs projection (v83) |
| Runs | logs <run_id> | Full journal trace for a run (post-mortem) |
| Runs | cancel <run_id> | Cancel an in-flight run |
| Runs | watch <slug> | Stream live run + step events |
| Versions | versions <slug> | List version history |
| Versions | rollback <slug> --to N | Roll back to version N |
| Bundles | export <slug> | Export bundle JSON to stdout |
| Bundles | import [bundle.json] | Import bundle from file or stdin |
| Authoring | validate [file.json] | Validate DSL offline (no server call) |
| Triggers | schedules ... | Cron schedule CRUD |
| Triggers | webhooks ... | Webhook CRUD |
| HITL | waitpoints ... | Approval inbox + decisions |
For batch evaluation across the eval-* routine fleet (matrix sweep, head-to-head tier compare, regression baselines), see the eval CLI.
crewship routine list
List workspace routines, sorted by usage by default.
crewship routine list
crewship routine list --order recent
crewship routine list --order name
| Flag | Default | Description |
|---|
--order | popularity | Sort: popularity (invocation_count DESC), recent (updated_at DESC), name (alpha) |
Output columns: SLUG, INVOC, LAST STATUS, AUTHOR CREW, DESCRIPTION
crewship routine get <slug>
Show full routine detail including the DSL definition.
crewship routine get summarize-text
Output: human header (slug, name, author, invocations) + pretty-printed DSL JSON.
crewship routine save
Save a new routine from a JSON DSL file. Test_run gate runs first; if it fails, save aborts with a structured error pointing at the failing step.
crewship routine save \
--name "email-fetch-summarize" \
--description "Fetch new emails since a date and summarize." \
--definition email-fetch.json \
--author-crew crew_marketing \
--author-agent agent_lead
| Flag | Required | Description |
|---|
--name | ✓ | Human-readable name; slug derived from this |
--description | | One-line summary shown in [AVAILABLE ROUTINES] |
--definition | ✓ | Path to JSON DSL file |
--author-crew | ✓ | crew_id that owns this routine |
--author-agent | | agent_id that authored (optional but recommended) |
--sample-inputs | | JSON inputs the test_run uses |
crewship routine run <slug>
Invoke a saved routine against the live execution tier.
crewship routine run summarize-text --inputs '{"text":"..."}'
crewship routine run pr-review-structured --inputs '{"diff":"..."}' --invoking-crew crew_eng
# Force every agent_run step onto a specific tier (eval / debug)
crewship routine run eval-extract-emails --tier-override fast
crewship routine run eval-extract-emails --tier-override smart
| Flag | Description |
|---|
--inputs | JSON inputs for the run |
--invoking-crew | crew_id recorded as the invoker (cross-crew reuse audit) |
--tier-override | force every agent_run step onto trivial/fast/moderate/smart. Step-level model_override still wins. Empty (default) = use authored complexity. Used by the eval suite to diff weak vs strong worker tiers without DSL changes. |
Output: run_id, status, output, duration_ms, cost_usd.
Common errors
- API error (500):
pipeline: concurrency_key rendered to empty value (referenced input missing or empty) — the DSL declared a concurrency_key template that references an input you didn’t supply, and the template has no literal text to fall back on. See Troubleshooting in the routines guide for the four fixes.
crewship routine doctor <slug>
Preflight checklist that walks every checkable precondition before run and emits a ✓/⚠/✗ report. Catches the blind alleys (missing crew provisioning, agent slug typo, missing credential, contradictory validation gate, cost cap too tight) at one round-trip instead of one failure-per-run.
crewship routine doctor eval-extract-emails
crewship routine doctor my-routine -f json # CI-friendly
Checks performed:
routine_exists — routine present in the workspace
author_crew — author crew exists, has a devcontainer config, provisioning status is completed (not in_progress or failed)
agent_slugs — every step’s agent_slug and every outcomes.grader_agent_slug resolves to an agent in the author crew (runtime resolution is crew-scoped)
credential:<TYPE> — every entry in credentials_required has an active workspace credential of that type
egress_allowlist — egress_targets declared when http steps are present; no wildcards or loopback hosts (*, localhost, 127.0.0.0/8)
cost_cap — max_cost_usd is set and is at least 1.5× estimated_cost_usd
validation:<step_id> — gates are structurally satisfiable (min_length ≤ max_length, no string in both must_contain and must_not_contain)
Exit code is non-zero if any check FAILed. WARN does not propagate to the exit code — it’s an operator hint, not a blocker.
crewship routine bench <slug>
Run a routine N times with the same inputs and report pass-rate, cost stats, and latency distribution. Trigger.dev-style stability characterisation — use to decide whether a routine is production-ready at its current tier.
# 10 runs at the routine's authored tier
crewship routine bench eval-extract-emails --runs 10
# 5 runs with explicit tier override and JSON output for CI
crewship routine bench eval-extract-emails --runs 5 --tier-override fast -f json
# Bench with custom inputs + spaced cooldown for rate-limited credentials
crewship routine bench summarize-text --runs 5 --cooldown-ms 1000 \
--inputs '{"text":"specific text to bench"}'
| Flag | Description |
|---|
--runs | Number of runs (default 10) |
--tier-override | Force every agent_run step onto a tier (trivial/fast/moderate/smart). Empty = authored complexity. |
--inputs | JSON inputs forwarded to every run |
--fail-fast | Abort on the first non-pass instead of completing every run |
--cooldown-ms | Delay between runs (useful when stress-testing rate-limited credentials) |
Output reports total / mean / p95 / max cost; p50 / p95 / max duration; fail-reason breakdown bucketed into cost-cap / rubric-fail / gate-fail / auth-fail / timeout / other. Headline verdict label so the operator doesn’t need to interpret the stats:
| Verdict | When |
|---|
PRODUCTION_READY | ≥90% pass rate |
FLAKY | 70–90% pass rate (investigate top fail reason) |
UNRELIABLE | <70% pass rate (escalate tier or rewrite gates) |
BROKEN | 0% pass rate (auth issue, unsatisfiable gate, or cost guardrail trips every time) |
INSUFFICIENT_DATA | 0 runs |
crewship routine records <slug>
List runs for a routine using the pipeline_runs projection (introduced in migration v83). Backed by indexed columns, so this is O(log n) where runs is O(scan-with-LIKE) over journal_entries. Prefer records for filterable / scriptable run history.
crewship routine records eval-extract-emails
crewship routine records eval-extract-emails --status failed --limit 100
crewship routine records eval-extract-emails --json | jq '.[] | select(.cost_usd > 0.05)'
| Flag | Description |
|---|
--limit | Max records to return, 1–500 (default 50) |
--status | Filter: queued/running/completed/failed/cancelled/interrupted/dry_run |
--json | Emit JSON for scripting |
Falls back with a clear message to crewship routine runs <slug> when the server predates v83.
crewship routine logs <run_id>
Fetch the full journal trace for a single run. Post-mortem and CI-diagnostics surface — every step’s prompt, output, validation verdict, and cost in one document.
crewship routine logs run_cmox10oci00016b48aacd
crewship routine logs run_cmox10oci00016b48aacd --slug eval-extract-emails
crewship routine logs run_cmox... --slug eval-... --json | jq '.events[] | select(.type=="pipeline.step.failed")'
Example output (table mode, with --slug):
TIME EVENT SEVERITY DURATION COST SUMMARY
18:42:01.103 run.started info — — Pipeline pr-review starting…
18:42:01.110 step.started info — — Pipeline pr-review step extract started
18:42:03.421 step.completed info 2.31s $0.0021 Pipeline pr-review step extract completed
18:42:03.428 step.started info — — Pipeline pr-review step summarise started
18:42:05.812 step.completed info 2.38s $0.0013 Pipeline pr-review step summarise completed
18:42:05.820 run.completed info — — Pipeline pr-review completed
Completed in 4.7s · cost $0.0034
The DURATION and COST columns surface the per-step duration_ms / cost_usd from the journal payload (em-dash means “not applicable” — started / failed events don’t carry them). Matches the Runs-tab waterfall in the UI so the same numbers reconcile across surfaces.
| Flag | Default | Description |
|---|
--slug | | Routine slug the run belongs to (optional; enables full journal timeline) |
--json | false | JSON output for scripting |
Pair with routine watch for live observation and routine logs for post-hoc forensics.
crewship routine dry-run <slug>
Preview what the routine WOULD execute without invoking agents. Returns a structured would_execute report with each step’s resolved tier, rendered prompt, and estimated cost.
crewship routine dry-run summarize-text --inputs '{"text":"..."}'
Example output:
Dry run: DRY_RUN_OK (estimated 0ms, $0.0034 total)
Step 1 [extract] (agent_run):
would call agent: viktor
resolved tier: claude/claude-haiku-4-5
estimated cost: $0.0021
rendered prompt:
Extract the key entities from the following text...
Step 2 [summarise] (agent_run):
would call agent: tomas
resolved tier: claude/claude-sonnet-4-6
estimated cost: $0.0013
rendered prompt:
Write a 2-sentence summary of...
Zero side effects — useful for “is this routine wired correctly?” without burning tokens.
The cost figure is an order-of-magnitude estimate (flat token-density heuristic, not real pricing). The same report renders inline in the UI: click Dry run in the routine detail panel and the report appears above the tab bar.
crewship routine validate [file.json]
Offline DSL validation with no server call. Reads from the file argument or stdin if absent.
crewship routine validate routine.json
cat routine.json | crewship routine validate
crewship routine validate routine.json --json # machine-readable
Exit code 0 = valid, 1 = invalid. CI-friendly:
- name: Validate routine DSL
run: crewship routine validate routines/*.json
Runs the same parse + validate + step-id-uniqueness checks the server would, except cross-routine cycle detection (which needs the workspace’s full call graph) and slug uniqueness (which needs the DB).
crewship routine watch <slug>
Stream live run + step events for a routine. Polls every 2s, dedupes events, prints in chronological order with ANSI colour by status.
crewship routine watch summarize-text
crewship routine watch summarize-text --json | jq '.entry_type'
crewship routine watch summarize-text --once # exit on first run terminal
crewship routine watch summarize-text --run-id <id>
| Flag | Default | Description |
|---|
--json | false | JSON Lines instead of coloured text |
--once | false | Exit after the first run completes / fails |
--run-id | | Filter to events for one run_id only |
--interval | 2s | Poll interval (Go duration format) |
Ctrl-C / SIGTERM exit cleanly.
crewship routine versions <slug>
crewship routine versions summarize-text
Output columns: VERSION, HEAD, PARENT, HASH, AUTHOR, CREATED, SUMMARY. The * marker tags the current HEAD.
crewship routine rollback <slug>
crewship routine rollback summarize-text --to 3
| Flag | Required | Description |
|---|
--to | ✓ | Target version number to roll back to |
Creates a new version on top of HEAD whose definition equals v3’s. History is preserved; you can roll forward by another rollback.
crewship routine export <slug>
crewship routine export summarize-text > bundle.json
crewship routine export summarize-text --include-history > bundle-with-versions.json
| Flag | Description |
|---|
--include-history | Include all version chain entries (default: HEAD only) |
Output: crewship-pipeline-bundle/v1 JSON to stdout. Suitable for pipe to routine import or commit to git.
crewship routine import [bundle.json]
crewship routine import bundle.json
cat bundle.json | crewship routine import
crewship routine export src-routine | crewship routine import # cross-workspace copy
Pre-validates JSON locally before sending — typos at the keyboard get a local error rather than a 400 from the server.
crewship routine cancel <run_id>
crewship routine cancel run_cmovn40h30001c57bb667
Signals the run goroutine. Run terminates at the next safe point and emits pipeline.run.failed with reason “cancelled”. Already-terminal runs return 409.
crewship routine schedules
Manage cron-driven triggers. Scheduler runs in-process and ticks every 30s.
schedules list
crewship routine schedules list
crewship routine schedules list --slug summarize-text
crewship routine schedules list --json
Output columns: ID, NAME, ROUTINE, CRON, TZ, ENABLED, NEXT
schedules create
crewship routine schedules create \
--slug summarize-text \
--name "daily-summary" \
--cron "0 9 * * *" \
--timezone "Europe/Prague" \
--inputs '{"text":"daily input"}'
| Flag | Required | Description |
|---|
--slug | ✓ | Target routine slug |
--cron | ✓ | 5-field cron expression |
--name | | Schedule name (default: "<slug> schedule") |
--timezone | | IANA timezone (default: UTC; for host-local, pass an IANA zone such as America/Los_Angeles or read it from cat /etc/timezone / timedatectl show -p Timezone --value) |
--inputs | | JSON object passed as inputs on each tick |
--enabled | | Create enabled (default true) |
schedules update <id>
Pass any subset of --cron / --timezone / --name / --enabled / --inputs to update.
schedules enable <id> / schedules disable <id>
Toggle without altering the cron expression.
schedules now <id>
Force-fire ad-hoc (out-of-cycle invocation for testing).
schedules delete <id>
crewship routine schedules delete sch_abc123 --yes
--yes skips the confirmation prompt.
crewship routine webhooks
Manage event-driven triggers. Each webhook is token-addressed (POST to /api/v1/webhooks/{token}), optionally HMAC-signed for delivery integrity, and rate-limited per token.
webhooks list
crewship routine webhooks list
crewship routine webhooks list --slug pr-review-structured
Output columns: ID, NAME, ROUTINE, HMAC, FIRES, LAST STATUS, RATE/MIN, ENABLED
webhooks create
crewship routine webhooks create \
--slug pr-review-structured \
--name "github-pr-reviews" \
--hmac-secret "$(openssl rand -hex 32)" \
--rate-limit 30
Stripe-style secret reveal: the signing secret is shown ONCE in the response. Save it now or rotate via delete+recreate.
| Flag | Required | Description |
|---|
--slug | ✓ | Target routine slug |
--name | | Webhook name (default: "<slug> webhook") |
--hmac-secret | | HMAC signing secret — empty means no signature verification |
--rate-limit | | Max fires per minute (default 60) |
--inputs-template | | JSON template merged with the request body |
--base-url | | Override the public base URL printed |
webhooks url <id>
Print the public URL without re-revealing the signing secret.
webhooks delete <id>
crewship routine webhooks delete wh_abc123 --yes
Existing senders using the deleted token will start getting 404s.
crewship routine waitpoints
Inspect + decide on pending HITL approval waitpoints.
waitpoints list
crewship routine waitpoints list
crewship routine waitpoints list --json
Output columns: TOKEN, RUN ID, STEP, KIND, CREATED, TIMEOUT, PROMPT
waitpoints show <token>
Full prompt + metadata for a single waitpoint.
waitpoints approve <token> / waitpoints reject <token>
crewship routine waitpoints approve wpt_abc123 --comment "LGTM"
crewship routine waitpoints reject wpt_abc123 --comment "needs revision"
The decision comment is forwarded to the parked run as the wait step’s output, so downstream steps can read approval rationale via {{ steps.<wait_step_id>.output }}.
Global flags
All routine subcommands accept these global flags:
| Flag | Env | Description | | | |
|---|
--server | CREWSHIP_SERVER | Server URL (default http://localhost:8080) | | | |
--workspace | CREWSHIP_WORKSPACE | Workspace ID or slug | | | |
--format | | `table | json | yaml | quiet` |
--no-color | | Disable ANSI colour output | | | |
--verbose | | Verbose request/response logging | | | |
See also