Feedback

Feedback is Crewship’s structured signal layer: thumb-up, thumb-down, “inaccurate”, “unsafe”, “edit”, and “regenerate”. It sits alongside open-vocabulary emoji reactions but exposes a tight six-value enum so the eval pipeline and the online sampler can query a stable target without LIKE-matching codepoints.

Why feedback (and not reactions)

Reactions are social — 👍 on a message means “I liked this,” same UX as Slack. The eval pipeline can’t reliably mine that signal across emoji families (👍 vs 👍🏼 vs ☹️ vs 😀), and it can’t capture the strongest training signal of all: a user editing the assistant’s answer to what they actually wanted. The message_feedback table (migration v96) gives consumers a typed signal column with a CHECK constraint, a free-form reason, and a trace_id link back to the OTel trace of the run that produced the message. Eval datasets join on trace_id to recover the (prompt, original answer, preferred answer, ts) tuple.

Signal vocabulary

Signal	Meaning	Source
`helpful`	Explicit thumb-up.	Click handler on `assistant-turn.tsx`.
`not_helpful`	Explicit thumb-down.	Click handler; most common feedback kind.
`inaccurate`	“The answer was wrong”: thumb-down + reason chip.	UI follow-up after thumb-down.
`unsafe`	“This was harmful / leaked secrets.”	UI follow-up after thumb-down.
`edit`	User replaced the assistant text with their own. Highest-quality training signal — not “this was bad” but “this is what I wanted.” `reason` holds the replacement text.	Inline editor.
`regenerate`	User asked for a different answer without editing. Weak negative signal.	Regenerate button.

Adding a new signal requires both a UI patch and a migration that widens the v96 CHECK clause — the handler validates the enum before any DB work so a renamed value fails fast with a readable 400 instead of a SQLite constraint violation.
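
The fail-fast ordering described above can be sketched as follows. This is a minimal illustration, not the Crewship handler: `validate_signal`, `insert_unvalidated`, and the three-column table are hypothetical, and only the six enum values and the idea of a CHECK clause come from this page.

```python
import sqlite3

# The six values from the signal vocabulary table.
SIGNALS = frozenset(
    {"helpful", "not_helpful", "inaccurate", "unsafe", "edit", "regenerate"}
)

def validate_signal(signal):
    """Hypothetical pre-DB check: returns (status, error message or None)."""
    if signal not in SIGNALS:
        return 400, f"unknown signal {signal!r}; expected one of {sorted(SIGNALS)}"
    return 200, None

# Assumed, simplified table: only the CHECK clause matters for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE message_feedback (
        message_id TEXT NOT NULL,
        user_id    TEXT NOT NULL,
        signal     TEXT NOT NULL CHECK (signal IN
            ('helpful','not_helpful','inaccurate','unsafe','edit','regenerate'))
    )
""")

def insert_unvalidated(signal):
    """What a caller would hit if the enum check were skipped."""
    try:
        conn.execute(
            "INSERT INTO message_feedback VALUES ('msg_abc', 'u1', ?)", (signal,)
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"IntegrityError: {exc}"
```

With the check in front, a renamed value such as `thumbs_up` produces a readable 400 that lists the accepted values; without it, the same request only surfaces as a database constraint error.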

Privacy contract

Feedback is private to the author. Even a workspace owner cannot list another member’s thumb-downs or “edit” reasons. The threat model is the same as for a Slack reaction or a Google Docs comment: candid signal needs a closed loop, here with the eval pipeline, which reads rows server-side and is not API-exposed.

Operation	Visibility
`POST /api/v1/feedback`	Caller writes their own row. UPSERT semantics.
`GET /api/v1/feedback`	Caller sees ONLY their own rows. `WHERE user_id = ?` is the privacy gate; workspace membership is defense-in-depth.
`DELETE /api/v1/feedback`	Caller removes their own row.
Eval pipeline	Reads all rows server-side (joins on `trace_id`). Not API-exposed.
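
A rough sketch of how the GET privacy gate could look in SQL, run here through Python's `sqlite3`. The column layout and the `list_feedback` helper are assumptions made for illustration; only the `WHERE user_id = ?` gate and the membership check as a second layer come from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed minimal columns; the real schema is not shown on this page.
    CREATE TABLE workspace_members (workspace_id TEXT, user_id TEXT);
    CREATE TABLE message_feedback (
        message_id TEXT, user_id TEXT, workspace_id TEXT, signal TEXT, reason TEXT
    );
    INSERT INTO workspace_members VALUES ('ws_a', 'alice'), ('ws_a', 'owner');
    INSERT INTO message_feedback VALUES
        ('msg_1', 'alice', 'ws_a', 'not_helpful', 'Wrong tool'),
        ('msg_1', 'owner', 'ws_a', 'helpful', NULL);
""")

def list_feedback(caller_id, message_id):
    """Return only the caller's own rows for a message."""
    return conn.execute(
        """
        SELECT f.user_id, f.signal, f.reason
        FROM message_feedback AS f
        WHERE f.user_id = ?              -- privacy gate
          AND f.message_id = ?
          AND EXISTS (                   -- defense-in-depth: still a member
              SELECT 1 FROM workspace_members AS m
              WHERE m.workspace_id = f.workspace_id AND m.user_id = ?
          )
        """,
        (caller_id, message_id, caller_id),
    ).fetchall()
```

In this sketch the workspace owner sees only their own `helpful` row for `msg_1`, never Alice's thumb-down or its reason.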

The UI’s optimistic-update store (stores/feedback-store.ts) persists per-user state to localStorage and reconciles via the API on submit. A 4xx/5xx response or transport rejection rolls back the optimistic flip so the local state never lies about a row that doesn’t exist on the server.

Trace correlation

Every POST accepts an optional trace_id. The intended end state is: the orchestrator stamps the active OTel trace id onto each assistant-message WebSocket event, the frontend lifts it onto ChatTurn.metadata.trace_id, and the feedback store passes it into the POST payload — so each feedback row lands indexed for WHERE trace_id = ? queries:

CREATE INDEX idx_feedback_trace ON message_feedback(trace_id) WHERE trace_id IS NOT NULL;

That powers the “show me every signal for this routine run” eval-mining query: collector trace ID → message_feedback.trace_id → all signals filed against that conversation.

End-to-end as of PR #450. The orchestrator → WebSocket → ChatTurn propagation is wired: internal/chatbridge/bridge.go calls telemetry.ResolveTrace(ctx) and stamps trace_id onto the "done" event metadata; hooks/use-chat.ts handleDoneEvent lifts it onto ChatTurn.metadata.trace_id; the feedback POST consumes it. New feedback rows land with trace_id populated whenever an OTel provider is configured. When no telemetry provider is configured (SpanContextFromContext returns invalid), the field is omitted — rows still land, just without the trace anchor — and the partial index WHERE trace_id IS NOT NULL keeps the lookup path cheap either way.
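
The trace lookup can be tried against an in-memory SQLite database. The sketch below assumes a cut-down `message_feedback` table and a made-up trace id (`trace_demo_1`); the index definition is the one quoted above, and `signals_for_trace` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed subset of columns, enough to exercise the partial index.
    CREATE TABLE message_feedback (
        message_id TEXT, user_id TEXT, signal TEXT, trace_id TEXT
    );
    CREATE INDEX idx_feedback_trace ON message_feedback(trace_id)
        WHERE trace_id IS NOT NULL;
""")
conn.executemany(
    "INSERT INTO message_feedback VALUES (?, ?, ?, ?)",
    [
        ("msg_1", "alice", "not_helpful", "trace_demo_1"),
        ("msg_1", "bob", "edit", "trace_demo_1"),
        # No telemetry provider configured: the row lands without an anchor
        # and is simply absent from the partial index.
        ("msg_2", "alice", "helpful", None),
    ],
)

def signals_for_trace(trace_id):
    """Every signal filed against the run identified by trace_id."""
    return conn.execute(
        "SELECT message_id, user_id, signal FROM message_feedback "
        "WHERE trace_id = ? ORDER BY user_id",
        (trace_id,),
    ).fetchall()
```

Rows without a `trace_id` never match an equality lookup, so they cost nothing on this path while still being available to queries keyed on `message_id`.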

UI flow

The chat UI surfaces the signals through <TurnFeedbackActions> in components/features/chat/assistant-turn.tsx. Behaviour:

User clicks thumb → optimistic state flip in the zustand store → background POST /api/v1/feedback with message_id, chat_id, trace_id, signal.
On res.ok → state is the truth. On 4xx/5xx or network reject → roll back; user can retry.
User clicks the same thumb again → DELETE /api/v1/feedback?message_id=...&signal=... first, then clear local state on success. A failed delete keeps the local state pointing at “submitted” so a refresh reconciles back to truth.

The store is intentionally async because syncing thumbs through the streaming useChat would couple two unrelated concerns. The trade-off is a one-frame flicker on flaky networks; the upside is that the chat path stays unaware of feedback wiring.
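
The flip, rollback, and retract rules above can be modelled in a few lines. This is a synchronous, language-neutral sketch rather than the zustand store in stores/feedback-store.ts: the `FeedbackStore` class and its `post`/`delete` callables are hypothetical stand-ins for the real HTTP calls, and keying local state by `(message_id, signal)` is an assumption.

```python
class FeedbackStore:
    """Synchronous model of the optimistic flip; the real store is async."""

    def __init__(self, post, delete):
        # post/delete are hypothetical transports:
        # (message_id, signal) -> HTTP status code, or raise OSError.
        self._post = post
        self._delete = delete
        self.submitted = set()  # {(message_id, signal)} shown as active

    def toggle(self, message_id, signal):
        key = (message_id, signal)
        if key in self.submitted:
            # Retract: DELETE first, clear local state only on success.
            if self._ok(self._delete, key):
                self.submitted.discard(key)
            return
        self.submitted.add(key)          # optimistic flip
        if not self._ok(self._post, key):
            self.submitted.discard(key)  # roll back on 4xx/5xx or reject

    @staticmethod
    def _ok(transport, key):
        try:
            status = transport(*key)
        except OSError:  # transport rejection (ConnectionError is a subclass)
            return False
        return 200 <= status < 300


store = FeedbackStore(post=lambda m, s: 500, delete=lambda m, s: 204)
store.toggle("msg_abc", "helpful")
print(store.submitted)  # set(): the failed POST was rolled back
```

The asymmetry is deliberate in this model: a failed POST clears the local flag, while a failed DELETE leaves it set, matching the behaviour described in the list above.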

API contract

See the Feedback API reference for the full endpoint catalog. Quick sketch:

# Submit
curl -X POST https://<host>/api/v1/feedback \
  -H 'Content-Type: application/json' \
  -d '{
    "message_id": "msg_abc",
    "chat_id":    "chat_xyz",
    "trace_id":   "4f3a...",
    "signal":     "not_helpful",
    "reason":     "Wrong tool — should have used the calendar instead."
  }'
# → 201 {"id":"fb_..."}

# Retract
curl -X DELETE 'https://<host>/api/v1/feedback?message_id=msg_abc&signal=not_helpful'
# → 204 (idempotent: also 204 if the row didn't exist)

Workspace re-anchoring

The chat_id parameter is optional on POST so eval widgets and CLI fallbacks without a chat context can still submit. Without it the server falls back to the caller’s most-recent workspace (ORDER BY workspace_members.created_at DESC LIMIT 1). A later POST against the same (message_id, user_id, signal) tuple that does carry chat_id re-anchors the row to the correct workspace via the UPSERT clause:

ON CONFLICT(message_id, user_id, signal) DO UPDATE SET
    workspace_id = excluded.workspace_id,
    reason       = excluded.reason,
    trace_id     = COALESCE(excluded.trace_id, message_feedback.trace_id),
    chat_id      = COALESCE(excluded.chat_id, message_feedback.chat_id)

Without workspace_id = excluded.workspace_id, a user with multi-workspace membership who first POSTed from a widget (fallback to workspace A) and later POSTed from their real chat (workspace B) would have the row stuck in A; eval queries scoped to workspace B would miss it.
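
To see the re-anchoring happen, the UPSERT clause above can be run against an in-memory SQLite table. The table definition, the workspace ids `ws_a`/`ws_b`, and the trace id `trace_demo_1` are assumptions for the demo; workspace resolution (the fallback to the most-recent workspace) is taken as already done by the caller.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE message_feedback (      -- assumed minimal schema
        message_id   TEXT NOT NULL,
        user_id      TEXT NOT NULL,
        signal       TEXT NOT NULL,
        workspace_id TEXT NOT NULL,
        chat_id      TEXT,
        trace_id     TEXT,
        reason       TEXT,
        UNIQUE (message_id, user_id, signal)
    )
""")

UPSERT = """
    INSERT INTO message_feedback
        (message_id, user_id, signal, workspace_id, chat_id, trace_id, reason)
    VALUES
        (:message_id, :user_id, :signal, :workspace_id, :chat_id, :trace_id, :reason)
    ON CONFLICT(message_id, user_id, signal) DO UPDATE SET
        workspace_id = excluded.workspace_id,
        reason       = excluded.reason,
        trace_id     = COALESCE(excluded.trace_id, message_feedback.trace_id),
        chat_id      = COALESCE(excluded.chat_id, message_feedback.chat_id)
"""

key = {"message_id": "msg_abc", "user_id": "alice", "signal": "not_helpful"}

# 1. Widget POST without chat_id: the server fell back to workspace A.
conn.execute(UPSERT, {**key, "workspace_id": "ws_a", "chat_id": None,
                      "trace_id": "trace_demo_1", "reason": None})

# 2. Later POST from the real chat in workspace B, this time without trace_id.
conn.execute(UPSERT, {**key, "workspace_id": "ws_b", "chat_id": "chat_xyz",
                      "trace_id": None, "reason": "Wrong tool"})

rows = conn.execute(
    "SELECT workspace_id, chat_id, trace_id, reason FROM message_feedback"
).fetchall()
print(rows)  # [('ws_b', 'chat_xyz', 'trace_demo_1', 'Wrong tool')]
```

The single surviving row has moved to workspace B and picked up the `chat_id`, while COALESCE keeps the `trace_id` from the first POST because the second one omitted it.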

Limits

Field	Limit	Why
`reason`	4096 chars	“edit” payloads carry the user’s replacement text; 4KB is generous without becoming a storage hazard.
`message_id`, `chat_id`, `trace_id`	256 chars each	10× longest realistic id (OTel trace_id is 32 hex chars; Crewship CUIDs ~25). Caps a hostile client that POSTs a 10 MB trace_id.
Rows per `(message_id, user_id, signal)` tuple	1	UNIQUE constraint. Re-POSTs UPSERT in place.
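
A length check along the lines of the table could be as small as the sketch below. The `oversized_fields` helper is hypothetical, and counting with `len()` (Unicode code points) is an assumption, since the table does not say whether the limits are measured in characters or bytes.

```python
# Limits from the table above, in characters.
LIMITS = {"reason": 4096, "message_id": 256, "chat_id": 256, "trace_id": 256}

def oversized_fields(payload):
    """Return the names of string fields that exceed their limit, sorted."""
    return sorted(
        name
        for name, limit in LIMITS.items()
        if isinstance(payload.get(name), str) and len(payload[name]) > limit
    )
```

An empty result means the payload is within bounds; a non-empty one gives a handler enough detail to reject the request and name the offending fields before any row is written.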

Limitations

No server-side hydration of the UI state on initial load — the frontend reads localStorage; a fresh browser starts with empty state. The server is the truth, and GET /api/v1/feedback?message_id=... exposes it for any UI that wants to rehydrate.
No “retracted” signal value — DELETE removes the row entirely. The eval pipeline distinguishes “row exists” from “row doesn’t exist,” so a once-thumbed-then-untoggled message reads as “no signal” rather than “explicitly retracted.” Future iterations may add a retracted value if the distinction starts mattering.
message_id ownership is not enforced — messages live in JSONL files (chats.jsonl_path), not a SQL table, so a per-POST file read would slow the path too much. Workspace membership is the trust boundary; cross-tenant probes are still blocked.

Related

Online eval sampler: consumes feedback rows by trace_id for drift detection.
Tracing: the trace_id source.
Chat sessions: sibling reactions store for open emoji.
