crewship routine

Manage workspace routines — AI-authored, repeatable workflow recipes that any crew can invoke.

crewship routine <subcommand> [flags]

Alias: crewship pipeline (back-compat — every routine X invocation also works as pipeline X) See Routines guide for the conceptual overview.

Subcommands

Group	Command	Purpose
Core	`list`	List workspace routines
Core	`get <slug>`	Show routine detail with full DSL
Core	`save`	Save a new routine from a JSON file
Core	`run <slug>`	Invoke a routine against the live execution tier
Core	`dry-run <slug>`	Preview what a routine would do without invoking agents
Core	`delete <slug>`	Soft-delete a routine
Diagnostics	`doctor <slug>`	Preflight checklist — catches blind alleys before `run`
Diagnostics	`bench <slug>`	Variance characterisation — N runs, pass-rate + cost/latency stats
Runs	`runs <slug>`	List recent runs from journal
Runs	`records <slug>`	List runs from `pipeline_runs` projection (v83)
Runs	`logs <run_id>`	Full journal trace for a run (post-mortem)
Runs	`cancel <run_id>`	Cancel an in-flight run
Runs	`watch <slug>`	Stream live run + step events
Versions	`versions <slug>`	List version history
Versions	`rollback <slug> --to N`	Roll back to version N
Bundles	`export <slug>`	Export bundle JSON to stdout
Bundles	`import [bundle.json]`	Import bundle from file or stdin
Authoring	`validate [file.json]`	Validate DSL offline (no server call)
Triggers	`schedules ...`	Cron schedule CRUD
Triggers	`webhooks ...`	Webhook CRUD
HITL	`waitpoints ...`	Approval inbox + decisions

For batch evaluation across the eval-* routine fleet (matrix sweep, head-to-head tier compare, regression baselines), see the eval CLI.

`crewship routine list`

List workspace routines, sorted by usage by default.

crewship routine list
crewship routine list --order recent
crewship routine list --order name

Flag	Default	Description
`--order`	`popularity`	Sort: `popularity` (invocation_count DESC), `recent` (updated_at DESC), `name` (alpha)

Output columns: SLUG, INVOC, LAST STATUS, AUTHOR CREW, DESCRIPTION

`crewship routine get <slug>`

Show full routine detail including the DSL definition.

crewship routine get summarize-text

Output: human header (slug, name, author, invocations) + pretty-printed DSL JSON.

`crewship routine save`

Save a new routine from a JSON DSL file. Test_run gate runs first; if it fails, save aborts with a structured error pointing at the failing step.

crewship routine save \
    --name "email-fetch-summarize" \
    --description "Fetch new emails since a date and summarize." \
    --definition email-fetch.json \
    --author-crew crew_marketing \
    --author-agent agent_lead

Flag	Required	Description
`--name`	✓	Human-readable name; slug derived from this
`--description`		One-line summary shown in `[AVAILABLE ROUTINES]`
`--definition`	✓	Path to JSON DSL file
`--author-crew`	✓	crew_id that owns this routine
`--author-agent`		agent_id that authored (optional but recommended)
`--sample-inputs`		JSON inputs the test_run uses

`crewship routine run <slug>`

Invoke a saved routine against the live execution tier.

crewship routine run summarize-text --inputs '{"text":"..."}'
crewship routine run pr-review-structured --inputs '{"diff":"..."}' --invoking-crew crew_eng

# Force every agent_run step onto a specific tier (eval / debug)
crewship routine run eval-extract-emails --tier-override fast
crewship routine run eval-extract-emails --tier-override smart

Flag	Description
`--inputs`	JSON inputs for the run
`--invoking-crew`	crew_id recorded as the invoker (cross-crew reuse audit)
`--tier-override`	force every `agent_run` step onto `trivial`/`fast`/`moderate`/`smart`. Step-level `model_override` still wins. Empty (default) = use authored complexity. Used by the eval suite to diff weak vs strong worker tiers without DSL changes.

Output: run_id, status, output, duration_ms, cost_usd.

Common errors

API error (500): pipeline: concurrency_key rendered to empty value (referenced input missing or empty) — the DSL declared a concurrency_key template that references an input you didn’t supply, and the template has no literal text to fall back on. See Troubleshooting in the routines guide for the four fixes.

`crewship routine doctor <slug>`

Preflight checklist that walks every checkable precondition before run and emits a ✓/⚠/✗ report. Catches the blind alleys (missing crew provisioning, agent slug typo, missing credential, contradictory validation gate, cost cap too tight) at one round-trip instead of one failure-per-run.

crewship routine doctor eval-extract-emails
crewship routine doctor my-routine -f json   # CI-friendly

Checks performed:

routine_exists — routine present in the workspace
author_crew — author crew exists, has a devcontainer config, provisioning status is completed (not in_progress or failed)
agent_slugs — every step’s agent_slug and every outcomes.grader_agent_slug resolves to an agent in the author crew (runtime resolution is crew-scoped)
credential:<TYPE> — every entry in credentials_required has an active workspace credential of that type
egress_allowlist — egress_targets declared when http steps are present; no wildcards or loopback hosts (*, localhost, 127.0.0.0/8)
cost_cap — max_cost_usd is set and is at least 1.5× estimated_cost_usd
validation:<step_id> — gates are structurally satisfiable (min_length ≤ max_length, no string in both must_contain and must_not_contain)

Exit code is non-zero if any check FAILed. WARN does not propagate to the exit code — it’s an operator hint, not a blocker.

`crewship routine bench <slug>`

Run a routine N times with the same inputs and report pass-rate, cost stats, and latency distribution. Trigger.dev-style stability characterisation — use to decide whether a routine is production-ready at its current tier.

# 10 runs at the routine's authored tier
crewship routine bench eval-extract-emails --runs 10

# 5 runs with explicit tier override and JSON output for CI
crewship routine bench eval-extract-emails --runs 5 --tier-override fast -f json

# Bench with custom inputs + spaced cooldown for rate-limited credentials
crewship routine bench summarize-text --runs 5 --cooldown-ms 1000 \
    --inputs '{"text":"specific text to bench"}'

Flag	Description
`--runs`	Number of runs (default 10)
`--tier-override`	Force every `agent_run` step onto a tier (`trivial`/`fast`/`moderate`/`smart`). Empty = authored complexity.
`--inputs`	JSON inputs forwarded to every run
`--fail-fast`	Abort on the first non-pass instead of completing every run
`--cooldown-ms`	Delay between runs (useful when stress-testing rate-limited credentials)

Output reports total / mean / p95 / max cost; p50 / p95 / max duration; fail-reason breakdown bucketed into cost-cap / rubric-fail / gate-fail / auth-fail / timeout / other. Headline verdict label so the operator doesn’t need to interpret the stats:

Verdict	When
`PRODUCTION_READY`	≥90% pass rate
`FLAKY`	70–90% pass rate (investigate top fail reason)
`UNRELIABLE`	<70% pass rate (escalate tier or rewrite gates)
`BROKEN`	0% pass rate (auth issue, unsatisfiable gate, or cost guardrail trips every time)
`INSUFFICIENT_DATA`	0 runs

`crewship routine records <slug>`

List runs for a routine using the pipeline_runs projection (introduced in migration v83). Backed by indexed columns, so this is O(log n) where runs is O(scan-with-LIKE) over journal_entries. Prefer records for filterable / scriptable run history.

crewship routine records eval-extract-emails
crewship routine records eval-extract-emails --status failed --limit 100
crewship routine records eval-extract-emails --json | jq '.[] | select(.cost_usd > 0.05)'

Flag	Description
`--limit`	Max records to return, 1–500 (default 50)
`--status`	Filter: `queued`/`running`/`completed`/`failed`/`cancelled`/`interrupted`/`dry_run`
`--json`	Emit JSON for scripting

Falls back with a clear message to crewship routine runs <slug> when the server predates v83.

`crewship routine logs <run_id>`

Fetch the full journal trace for a single run. Post-mortem and CI-diagnostics surface — every step’s prompt, output, validation verdict, and cost in one document.

crewship routine logs run_cmox10oci00016b48aacd
crewship routine logs run_cmox10oci00016b48aacd --slug eval-extract-emails
crewship routine logs run_cmox... --slug eval-... --json | jq '.events[] | select(.type=="pipeline.step.failed")'

Example output (table mode, with --slug):

TIME          EVENT             SEVERITY  DURATION  COST     SUMMARY
42:01.103  run.started       info      —         —        Pipeline pr-review starting…
42:01.110  step.started      info      —         —        Pipeline pr-review step extract started
42:03.421  step.completed    info      2.31s     $0.0021  Pipeline pr-review step extract completed
42:03.428  step.started      info      —         —        Pipeline pr-review step summarise started
42:05.812  step.completed    info      2.38s     $0.0013  Pipeline pr-review step summarise completed
42:05.820  run.completed     info      —         —        Pipeline pr-review completed

Completed in 4.7s · cost $0.0034

The DURATION and COST columns surface the per-step duration_ms / cost_usd from the journal payload (em-dash means “not applicable” — started / failed events don’t carry them). Matches the Runs-tab waterfall in the UI so the same numbers reconcile across surfaces.

Flag	Default	Description
`--slug`		Routine slug the run belongs to (optional; enables full journal timeline)
`--json`	`false`	JSON output for scripting

Pair with routine watch for live observation and routine logs for post-hoc forensics.

`crewship routine dry-run <slug>`

Preview what the routine WOULD execute without invoking agents. Returns a structured would_execute report with each step’s resolved tier, rendered prompt, and estimated cost.

crewship routine dry-run summarize-text --inputs '{"text":"..."}'

Example output:

Dry run: DRY_RUN_OK (estimated 0ms, $0.0034 total)

Step 1 [extract] (agent_run):
  would call agent: viktor
  resolved tier: claude/claude-haiku-4-5
  estimated cost: $0.0021
  rendered prompt:
    Extract the key entities from the following text...

Step 2 [summarise] (agent_run):
  would call agent: tomas
  resolved tier: claude/claude-sonnet-4-6
  estimated cost: $0.0013
  rendered prompt:
    Write a 2-sentence summary of...

Zero side effects — useful for “is this routine wired correctly?” without burning tokens. The cost figure is an order-of-magnitude estimate (flat token-density heuristic, not real pricing). The same report renders inline in the UI: click Dry run in the routine detail panel and the report appears above the tab bar.

`crewship routine validate [file.json]`

Offline DSL validation with no server call. Reads from the file argument or stdin if absent.

crewship routine validate routine.json
cat routine.json | crewship routine validate
crewship routine validate routine.json --json   # machine-readable

Exit code 0 = valid, 1 = invalid. CI-friendly:

- name: Validate routine DSL
  run: crewship routine validate routines/*.json

Runs the same parse + validate + step-id-uniqueness checks the server would, except cross-routine cycle detection (which needs the workspace’s full call graph) and slug uniqueness (which needs the DB).

`crewship routine watch <slug>`

Stream live run + step events for a routine. Polls every 2s, dedupes events, prints in chronological order with ANSI colour by status.

crewship routine watch summarize-text
crewship routine watch summarize-text --json | jq '.entry_type'
crewship routine watch summarize-text --once     # exit on first run terminal
crewship routine watch summarize-text --run-id <id>

Flag	Default	Description
`--json`	false	JSON Lines instead of coloured text
`--once`	false	Exit after the first run completes / fails
`--run-id`		Filter to events for one run_id only
`--interval`	2s	Poll interval (Go duration format)

Ctrl-C / SIGTERM exit cleanly.

`crewship routine versions <slug>`

crewship routine versions summarize-text

Output columns: VERSION, HEAD, PARENT, HASH, AUTHOR, CREATED, SUMMARY. The * marker tags the current HEAD.

`crewship routine rollback <slug>`

crewship routine rollback summarize-text --to 3

Flag	Required	Description
`--to`	✓	Target version number to roll back to

Creates a new version on top of HEAD whose definition equals v3’s. History is preserved; you can roll forward by another rollback.

`crewship routine export <slug>`

crewship routine export summarize-text > bundle.json
crewship routine export summarize-text --include-history > bundle-with-versions.json

Flag	Description
`--include-history`	Include all version chain entries (default: HEAD only)

Output: crewship-pipeline-bundle/v1 JSON to stdout. Suitable for pipe to routine import or commit to git.

`crewship routine import [bundle.json]`

crewship routine import bundle.json
cat bundle.json | crewship routine import
crewship routine export src-routine | crewship routine import   # cross-workspace copy

Pre-validates JSON locally before sending — typos at the keyboard get a local error rather than a 400 from the server.

`crewship routine cancel <run_id>`

crewship routine cancel run_cmovn40h30001c57bb667

Signals the run goroutine. Run terminates at the next safe point and emits pipeline.run.failed with reason “cancelled”. Already-terminal runs return 409.

`crewship routine schedules`

Manage cron-driven triggers. Scheduler runs in-process and ticks every 30s.

`schedules list`

crewship routine schedules list
crewship routine schedules list --slug summarize-text
crewship routine schedules list --json

Output columns: ID, NAME, ROUTINE, CRON, TZ, ENABLED, NEXT

`schedules create`

crewship routine schedules create \
    --slug summarize-text \
    --name "daily-summary" \
    --cron "0 9 * * *" \
    --timezone "Europe/Prague" \
    --inputs '{"text":"daily input"}'

Flag	Required	Description
`--slug`	✓	Target routine slug
`--cron`	✓	5-field cron expression
`--name`		Schedule name (default: `"<slug> schedule"`)
`--timezone`		IANA timezone (default: `UTC`; for host-local, pass an IANA zone such as `America/Los_Angeles` or read it from `cat /etc/timezone` / `timedatectl show -p Timezone --value`)
`--inputs`		JSON object passed as inputs on each tick
`--enabled`		Create enabled (default true)

`schedules update <id>`

Pass any subset of --cron / --timezone / --name / --enabled / --inputs to update.

`schedules enable <id>` / `schedules disable <id>`

Toggle without altering the cron expression.

`schedules now <id>`

Force-fire ad-hoc (out-of-cycle invocation for testing).

`schedules delete <id>`

crewship routine schedules delete sch_abc123 --yes

--yes skips the confirmation prompt.

`crewship routine webhooks`

Manage event-driven triggers. Each webhook is token-addressed (POST to /api/v1/webhooks/{token}), optionally HMAC-signed for delivery integrity, and rate-limited per token.

`webhooks list`

crewship routine webhooks list
crewship routine webhooks list --slug pr-review-structured

Output columns: ID, NAME, ROUTINE, HMAC, FIRES, LAST STATUS, RATE/MIN, ENABLED

`webhooks create`

crewship routine webhooks create \
    --slug pr-review-structured \
    --name "github-pr-reviews" \
    --hmac-secret "$(openssl rand -hex 32)" \
    --rate-limit 30

Stripe-style secret reveal: the signing secret is shown ONCE in the response. Save it now or rotate via delete+recreate.

Flag	Required	Description
`--slug`	✓	Target routine slug
`--name`		Webhook name (default: `"<slug> webhook"`)
`--hmac-secret`		HMAC signing secret — empty means no signature verification
`--rate-limit`		Max fires per minute (default 60)
`--inputs-template`		JSON template merged with the request body
`--base-url`		Override the public base URL printed

`webhooks url <id>`

Print the public URL without re-revealing the signing secret.

`webhooks delete <id>`

crewship routine webhooks delete wh_abc123 --yes

Existing senders using the deleted token will start getting 404s.

`crewship routine waitpoints`

Inspect + decide on pending HITL approval waitpoints.

`waitpoints list`

crewship routine waitpoints list
crewship routine waitpoints list --json

Output columns: TOKEN, RUN ID, STEP, KIND, CREATED, TIMEOUT, PROMPT

`waitpoints show <token>`

Full prompt + metadata for a single waitpoint.

`waitpoints approve <token>` / `waitpoints reject <token>`

crewship routine waitpoints approve wpt_abc123 --comment "LGTM"
crewship routine waitpoints reject  wpt_abc123 --comment "needs revision"

The decision comment is forwarded to the parked run as the wait step’s output, so downstream steps can read approval rationale via {{ steps.<wait_step_id>.output }}.

Global flags

All routine subcommands accept these global flags:

Flag	Env	Description
`--server`	`CREWSHIP_SERVER`	Server URL (default `http://localhost:8080`)
`--workspace`	`CREWSHIP_WORKSPACE`	Workspace ID or slug
`--format`		`table	json	yaml	quiet`
`--no-color`		Disable ANSI colour output
`--verbose`		Verbose request/response logging

Getting started

Sessions & identity

Agents & conversations

Crews, projects & workspaces

Missions & routines

Issues & triage

Skills & integrations

Inbox & notifications

Memory & journal

Observability

Cost & paymaster

Approvals & hooks

Configuration & system

Utilities

Documentation Index

​crewship routine

​Subcommands

​crewship routine list

​crewship routine get <slug>

​crewship routine save

​crewship routine run <slug>

​Common errors

​crewship routine doctor <slug>

​crewship routine bench <slug>

​crewship routine records <slug>

​crewship routine logs <run_id>

​crewship routine dry-run <slug>

​crewship routine validate [file.json]

​crewship routine watch <slug>

​crewship routine versions <slug>

​crewship routine rollback <slug>

​crewship routine export <slug>

​crewship routine import [bundle.json]

​crewship routine cancel <run_id>

​crewship routine schedules

​schedules list

​schedules create

​schedules update <id>

​schedules enable <id> / schedules disable <id>

​schedules now <id>

​schedules delete <id>

​crewship routine webhooks

​webhooks list

​webhooks create

​webhooks url <id>

​webhooks delete <id>

​crewship routine waitpoints

​waitpoints list

​waitpoints show <token>

​waitpoints approve <token> / waitpoints reject <token>

​Global flags

​See also