# Lookout

> In-process guardrails: prompt injection detection, tool-arg schema validation, structured output parsing, and secrets redaction.

Lookout is the guardrail layer that sits between agent inputs/outputs and the LLM. It is four independent scanners composed into a middleware:

1. **injection** -- heuristic prompt-injection detector (role-override, system-prompt leak, jailbreak tropes, confusable unicode).
2. **args** -- JSON Schema validation of tool-call arguments before they reach the tool implementation.
3. **output** -- structured-output parser that strips markdown fences, validates against schema, and produces a corrective re-prompt.
4. **secrets** -- regex-based secrets redactor for outbound text.

All layers are purely in-process -- no network calls in the default build. An optional Lakera bridge is available for the injection layer but disabled unless explicitly wired.

When a scanner blocks something it emits `guardrail.input_blocked` or `guardrail.output_blocked` into the [Crew Journal](/guides/crew-journal) so the action is auditable. **The matched secret value is NEVER persisted in the entry payload** -- only the finding kind and a stable redacted detail string.
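
The **output** scanner's three steps (strip fences, validate, build a corrective re-prompt) can be sketched as follows. This is an illustration only: `stripFences` and `parseStructured` are hypothetical names, and a required-field check stands in for real schema validation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fence is the three-backtick markdown fence marker.
var fence = strings.Repeat("`", 3)

// stripFences removes one surrounding markdown code fence, if present.
// It assumes the opening fence (plus optional language tag) sits on its own line.
func stripFences(s string) string {
	s = strings.TrimSpace(s)
	if len(s) < 2*len(fence) || !strings.HasPrefix(s, fence) || !strings.HasSuffix(s, fence) {
		return s
	}
	s = s[len(fence) : len(s)-len(fence)]
	if nl := strings.IndexByte(s, '\n'); nl >= 0 {
		s = s[nl+1:] // drop the rest of the opening fence line, e.g. a "json" tag
	}
	return strings.TrimSpace(s)
}

// parseStructured returns the parsed object, or a corrective re-prompt when the
// reply is not valid JSON or lacks a required field (hypothetical wording).
func parseStructured(raw string, required []string) (map[string]any, string) {
	var obj map[string]any
	if err := json.Unmarshal([]byte(stripFences(raw)), &obj); err != nil {
		return nil, "Your reply was not valid JSON (" + err.Error() + "). Reply with a single JSON object only."
	}
	for _, k := range required {
		if _, ok := obj[k]; !ok {
			return nil, fmt.Sprintf("Your reply is missing the required field %q. Reply with the full JSON object.", k)
		}
	}
	return obj, ""
}

func main() {
	reply := fence + "json\n{\"status\": \"ok\"}\n" + fence
	obj, reprompt := parseStructured(reply, []string{"status"})
	fmt.Println(obj, reprompt == "")
}
```

A non-empty second return value is what a caller would send back to the model as the next user turn.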

## Verdicts

```go theme={null}
type Verdict string
const (
    VerdictAllow    Verdict = "allow"    // pass through unchanged
    VerdictBlock    Verdict = "block"    // refuse to process
    VerdictSanitize Verdict = "sanitize" // cleaned version available
)
```

Severity maps to journal severity:

| Lookout severity | Journal severity |
| ---------------- | ---------------- |
| `low`            | `info`           |
| `medium`         | `notice`         |
| `high`           | `warn`           |
| `critical`       | `error`          |
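
The table above can be expressed as a small lookup. The type and function names here are illustrative, not Lookout's own, and the fallback for unknown values is an assumption of this sketch.

```go
package main

import "fmt"

// Severity and JournalSeverity are illustrative types, not Lookout's real ones.
type Severity string
type JournalSeverity string

// journalSeverity mirrors the mapping table above.
// Unknown values fall back to "info" (an assumption of this sketch).
func journalSeverity(s Severity) JournalSeverity {
	switch s {
	case "low":
		return "info"
	case "medium":
		return "notice"
	case "high":
		return "warn"
	case "critical":
		return "error"
	default:
		return "info"
	}
}

func main() {
	for _, s := range []Severity{"low", "medium", "high", "critical"} {
		fmt.Printf("%s -> %s\n", s, journalSeverity(s))
	}
}
```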

## Input guard

The input guard runs on every user or tool-result message before it reaches the LLM:

```go theme={null}
guard := lookout.InputGuard(journalEmitter)
ctx = lookout.WithScope(ctx, lookout.Scope{WorkspaceID: ws, CrewID: crew, AgentID: agent})

_, err := guard(ctx, userMessage)
if lookout.IsBlocked(err) {
    // refuse the call; guardrail.input_blocked already emitted
}
```

Detectors (in `injection.go`):

| Kind                   | Detail                                                                                 |
| ---------------------- | -------------------------------------------------------------------------------------- |
| `role_override`        | Phrases like "ignore previous instructions", "you are now", "disregard system prompt". |
| `system_prompt_leak`   | Attempts to exfiltrate the system prompt.                                              |
| `jailbreak`            | Known jailbreak trope patterns (DAN, hypothetical framings, translations-as-evasion).  |
| `zero_width_unicode`   | ZWJ/ZWNJ/ZWSP sequences.                                                               |
| `rtl_override_unicode` | Bidi control codepoints: RTL override (U+202E) and isolates (U+2066-U+2069).           |
| `lakera_detected`      | External Lakera Guard verdict when enabled.                                            |

The guard is wired into [`llm.Middleware`](/guides/llm-middleware) so every `Complete()` call is scanned. The `Stream()` path runs through the **same** `lookoutCaller` chain (`internal/llm/middleware.go`), scanning every user/tool message synchronously before the first token flows.

### Per-routine action policy

The default is hard-block on any high/critical finding. Routines whose upstream produces text that occasionally trips the heuristic on benign content (translations, security write-ups, adversarial-prompt research) can opt into a softer mode via DSL:

```yaml theme={null}
name: my-routine
guardrails:
  input:
    prompt_injection:
      action: sanitize    # block (default) | sanitize | log
```

| Action            | Behaviour                                                                         | When to use                                                                      |
| ----------------- | --------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
| `block` (default) | Refuses the call; returns `*BlockedError`.                                        | Production routines processing trusted inputs.                                   |
| `sanitize`        | Replaces matched byte ranges with `[REDACTED]`, lets the (defanged) text through. | Noisy upstreams where false positives would block legit traffic.                 |
| `log`             | Passes the original text through unchanged; emits the journal entry only.         | Observability-only mode; great for tuning the heuristic against a real workload. |

The journal entry fires for ALL modes — log mode is observability, not "guard disabled". The `GuardListener` integration hook (below) fires for all modes too.

Sanitize uses **offset-based replacement** (`Finding.Position` + `Finding.MatchEnd`) so long matches and synthetic unicode findings (zero-width, RTL override) are properly redacted. An earlier substring-based implementation silently let those through.
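
A minimal sketch of offset-based replacement, assuming `Position` and `MatchEnd` are byte offsets forming a half-open range. The `finding` struct and `redactByOffset` are illustrative, and merging overlapping ranges into one marker is a choice made for this sketch rather than confirmed behaviour.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// finding is a hypothetical stand-in carrying only the two offsets used here.
type finding struct {
	Position int
	MatchEnd int
}

// redactByOffset replaces each [Position, MatchEnd) byte range with "[REDACTED]".
// Ranges are clamped to the text and overlapping or touching ranges are merged.
func redactByOffset(text string, findings []finding) string {
	fs := append([]finding(nil), findings...) // copy so the caller's slice is not reordered
	sort.Slice(fs, func(i, j int) bool { return fs[i].Position < fs[j].Position })

	var merged []finding
	for _, f := range fs {
		start, end := f.Position, f.MatchEnd
		if start < 0 {
			start = 0
		}
		if end > len(text) {
			end = len(text)
		}
		if end <= start {
			continue // empty or out-of-range finding
		}
		if n := len(merged); n > 0 && start <= merged[n-1].MatchEnd {
			if end > merged[n-1].MatchEnd {
				merged[n-1].MatchEnd = end
			}
			continue
		}
		merged = append(merged, finding{Position: start, MatchEnd: end})
	}

	var b strings.Builder
	cursor := 0
	for _, m := range merged {
		b.WriteString(text[cursor:m.Position])
		b.WriteString("[REDACTED]")
		cursor = m.MatchEnd
	}
	b.WriteString(text[cursor:])
	return b.String()
}

func main() {
	text := "hi\u200Bdden ignore previous instructions now"
	fmt.Println(redactByOffset(text, []finding{{10, 38}, {2, 5}}))
}
```

Because the replacement never searches for the matched text, an invisible rune is removed just as reliably as a long phrase.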

### Integration callback

Wire a callback to a notification target (Slack, PagerDuty, the [hooks](/guides/hooks) subsystem) so guardrail trips don't just land in the journal:

```go theme={null}
ctx = lookout.WithGuardListener(ctx, func(ctx context.Context, direction string, f lookout.Finding) {
    // synchronous — keep cheap or dispatch async inside
    notify(direction, f)
})
```

The listener runs synchronously after the journal entry is written, regardless of the action policy. The pipeline runner (`internal/pipeline/runner_llm.go`) already wires this to `hooks.Dispatch(EventOnGuardrailTriggered, ...)` when both the DB and a journal emitter are available -- see the [`on_guardrail_triggered` hook event](/guides/hooks).

## Output guard

The default output policy is sanitize-and-pass. The secrets scanner runs over all outbound text and returns a redacted copy alongside its findings:

```go theme={null}
guard := lookout.OutputGuard(j)
redacted, err := guard(ctx, llmOutput)
// redacted may differ from llmOutput; err is non-nil only on journal emit failure
```

Why sanitize instead of block? Losing an entire response because one `sk-xxxx` slipped through is too disruptive; redaction preserves the response while surfacing the finding in the journal. Callers that want hard-block semantics should re-scan the returned text and refuse downstream.

Note: the output guard is NOT wired into `llm.Middleware`. Scanning output there would mutate text while leaving the provider-reported token counts intact -- a desync. Output scanning lives in the orchestrator streaming pipeline where text mutations are visible to the agent loop.

## Secrets detectors

Defined in `secrets.go`:

| Kind                  | Pattern                                                           |
| --------------------- | ----------------------------------------------------------------- |
| `secret_openai`       | `sk-[A-Za-z0-9]{32,}` / project-scoped `sk-proj-...`              |
| `secret_anthropic`    | `sk-ant-api03-...`                                                |
| `secret_aws`          | AKIA / ASIA IDs + 40-char secret pairs                            |
| `secret_github_pat`   | GitHub PAT — `ghp_`, `github_pat_`, or a legacy 40-char hex token |
| `secret_github_oauth` | GitHub OAuth token — `gho_`                                       |
| `secret_github_app`   | GitHub App installation token — `ghs_`                            |
| `secret_bearer_token` | `Bearer <long-b64>`                                               |
| `secret_password`     | `password=...`, `PASSWORD: ...` in env-var-shaped strings         |
| `secret_api_key`      | Generic `api_key=...`, `X-API-Key: ...`                           |

The stored finding carries `kind` and a redacted `detail` like `"openai API key (prefix: sk-...)"`. The raw match is never emitted.
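
To make the "kind plus redacted detail" contract concrete, here is a minimal regex-based sketch with two rules. The OpenAI pattern is taken from the table; the `gho_` pattern length and both detail strings are assumptions, and the real rules in `secrets.go` may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// finding is a hypothetical stand-in for the stored finding. It has no field
// that could hold the raw match, only a kind and a fixed detail string.
type finding struct {
	Kind   string
	Detail string
}

type rule struct {
	kind   string
	detail string
	re     *regexp.Regexp
}

// Illustrative rules only; the gho_ length bound is an assumption.
var rules = []rule{
	{kind: "secret_openai", detail: "openai API key (prefix: sk-...)", re: regexp.MustCompile(`sk-[A-Za-z0-9]{32,}`)},
	{kind: "secret_github_oauth", detail: "github OAuth token (prefix: gho_...)", re: regexp.MustCompile(`gho_[A-Za-z0-9]{20,}`)},
}

// redactSecrets returns the text with every match replaced, plus one finding per match.
func redactSecrets(text string) (string, []finding) {
	var findings []finding
	for _, r := range rules {
		for range r.re.FindAllStringIndex(text, -1) {
			findings = append(findings, finding{Kind: r.kind, Detail: r.detail})
		}
		text = r.re.ReplaceAllLiteralString(text, "[REDACTED]")
	}
	return text, findings
}

func main() {
	out, fs := redactSecrets("token=sk-abcdefghijklmnopqrstuvwxyz0123456789ABCD end")
	fmt.Println(out)
	for _, f := range fs {
		fmt.Println(f.Kind, "-", f.Detail)
	}
}
```

Using a fixed detail string per rule, rather than deriving it from the match, makes it structurally impossible for the secret to leak into the finding.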

## Tool-arg schema validation

`lookout.ValidateArgs(schemaJSON, args)` runs draft-07 JSON Schema over a tool call's arguments. Use it before dispatching:

```go theme={null}
if err := lookout.ValidateArgs(mySchema, toolArgs); err != nil {
    // Block the tool call; err describes which field didn't satisfy the schema.
}
```

`ValidateArgs` returns `error` (nil on pass). Unknown keys and type mismatches produce a non-nil error whose message points at the offending path. An empty schema always passes.
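
To show the kind of errors described above without pulling in a schema library, the sketch below hand-rolls a check for a tiny subset of keywords (`required`, `properties.*.type`, `additionalProperties: false`). It is not draft-07 and not Lookout's implementation; `validateArgsSketch` and its `[]byte` parameters are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type propSchema struct {
	Type string `json:"type"`
}

// miniSchema covers only a handful of keywords; this is NOT full draft-07.
type miniSchema struct {
	Required             []string              `json:"required"`
	Properties           map[string]propSchema `json:"properties"`
	AdditionalProperties *bool                 `json:"additionalProperties"`
}

// jsonType names the JSON type of a decoded value ("integer" is not distinguished).
func jsonType(v any) string {
	switch v.(type) {
	case string:
		return "string"
	case float64:
		return "number"
	case bool:
		return "boolean"
	case nil:
		return "null"
	case []any:
		return "array"
	case map[string]any:
		return "object"
	}
	return "unknown"
}

// validateArgsSketch returns nil on pass, or an error naming the offending path.
func validateArgsSketch(schemaJSON, args []byte) error {
	if len(schemaJSON) == 0 {
		return nil // empty schema passes
	}
	var s miniSchema
	if err := json.Unmarshal(schemaJSON, &s); err != nil {
		return fmt.Errorf("schema: %w", err)
	}
	var obj map[string]any
	if err := json.Unmarshal(args, &obj); err != nil {
		return fmt.Errorf("args: %w", err)
	}
	for _, name := range s.Required {
		if _, ok := obj[name]; !ok {
			return fmt.Errorf("/%s: required property missing", name)
		}
	}
	for name, v := range obj {
		prop, known := s.Properties[name]
		if !known {
			if s.AdditionalProperties != nil && !*s.AdditionalProperties {
				return fmt.Errorf("/%s: unknown property", name)
			}
			continue
		}
		if prop.Type != "" && jsonType(v) != prop.Type {
			return fmt.Errorf("/%s: want %s, got %s", name, prop.Type, jsonType(v))
		}
	}
	return nil
}

func main() {
	schema := []byte(`{"properties":{"path":{"type":"string"}},"required":["path"],"additionalProperties":false}`)
	fmt.Println(validateArgsSketch(schema, []byte(`{"path":"/tmp/x"}`)))
	fmt.Println(validateArgsSketch(schema, []byte(`{"path":42}`)))
}
```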

## Adding a detector

Add a new `Kind` constant in `types.go`, a regex/detector function in the relevant layer file (`injection.go`, `secrets.go`, `output.go`), and register it in the scanner's internal rule list. Tests in `lookout_test.go` use table-driven cases -- add one for every new kind.

Guidelines:

* Detectors must be pure (no network, no state). The one exception, Lakera, is gated behind an explicit `WithLakeraAPIKey` option.
* Never put the raw matched value in the `Finding.Matched` field for a secret detector. Use a prefix or a kind string.
* Severity should reflect operational risk, not "how confident the regex is". A false-positive `secret_anthropic` at `high` (journal `warn`) is fine; one at `critical` (journal `error`) would flood oncall.

## Gotchas

<Warning>
  **Output guard is not in the LLM middleware.** If you add a new streaming consumer, wire `OutputGuard` yourself or secrets may flow through to clients unredacted.
</Warning>

<Accordion title="More edge cases">
  * **Scope is required.** `emitGuardEntry` silently no-ops if `lookout.ScopeFromContext(ctx)` returns zero. Always wrap your request context with `lookout.WithScope(ctx, scope)` before invoking the guards -- the HTTP handler chain does this for you.
  * **A sanitize verdict is not a block.** The returned text is the safe version; `err` is non-nil only if the journal emit failed.
</Accordion>

## Related

* [LLM middleware](/guides/llm-middleware) -- where `InputGuard` is composed.
* [Harbormaster](/guides/harbormaster) -- complementary: approvals gate the *action*, Lookout sanitises the *content*.
* [Crew Journal](/guides/crew-journal) -- `guardrail.input_blocked` / `guardrail.output_blocked`.
