Feedback
Feedback is Crewship’s structured signal layer: thumb-up, thumb-down, “edit”, “regenerate”, “abandon”. It sits alongside open-vocabulary emoji reactions but exposes a tight six-value enum so the eval pipeline and the online sampler can query a stable target without LIKE-matching codepoints.
Why feedback (and not reactions)
Reactions are social: 👍 on a message means “I liked this,” same UX as Slack. The eval pipeline can’t reliably mine that signal across emoji families (👍 vs 👍🏼 vs ☹️ vs 😀), and it can’t capture the strongest training signal of all: a user editing the assistant’s answer to what they actually wanted. The message_feedback table (migration v96) gives consumers a typed signal column with a CHECK constraint, a free-form reason, and a trace_id link back to the OTel trace of the run that produced the message. Eval datasets join on trace_id to recover the (prompt, original answer, preferred answer, ts) tuple.
Signal vocabulary
| Signal | Meaning | Source |
|---|---|---|
| helpful | Explicit thumb-up. | Click handler on assistant-turn.tsx. |
| not_helpful | Explicit thumb-down. | Click handler; most common feedback kind. |
| inaccurate | “The answer was wrong” (thumb-down + reason chip). | UI follow-up after thumb-down. |
| unsafe | “This was harmful / leaked secrets.” | UI follow-up after thumb-down. |
| edit | User replaced the assistant text with their own. Highest-quality training signal: not “this was bad” but “this is what I wanted.” reason holds the replacement text. | Inline editor. |
| regenerate | User asked for a different answer without editing. Weak negative signal. | Regenerate button. |
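A client can mirror the signal vocabulary as a closed union so that unknown values are rejected before they reach the API. The following is a minimal sketch built only from the six values in the table above; the names FEEDBACK_SIGNALS, isFeedbackSignal, and replacementText are hypothetical and are not taken from the Crewship codebase.

```typescript
// The six signal values listed in the table, as a closed union.
const FEEDBACK_SIGNALS = [
  "helpful",
  "not_helpful",
  "inaccurate",
  "unsafe",
  "edit",
  "regenerate",
] as const;

type FeedbackSignal = (typeof FEEDBACK_SIGNALS)[number];

// Runtime guard (hypothetical): the client-side counterpart of a
// CHECK constraint on the signal column.
function isFeedbackSignal(value: string): value is FeedbackSignal {
  return (FEEDBACK_SIGNALS as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: per the table, only "edit" rows carry the
// user's replacement text in `reason`.
function replacementText(
  signal: FeedbackSignal,
  reason: string | null,
): string | null {
  return signal === "edit" ? reason : null;
}
```

Because the union is derived from the array, adding a value in one place keeps the type and the runtime guard in sync.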
Privacy contract
Feedback is private to the author. Even a workspace owner cannot list another member’s thumb-downs or “edit” reasons. This is the same threat model as a Slack reaction or a Google Docs comment, where candid signal needs a closed loop with the eval pipeline (server-side, not API-exposed).
| Operation | Visibility |
|---|---|
| POST /api/v1/feedback | Caller writes their own row. UPSERT semantics. |
| GET /api/v1/feedback | Caller sees ONLY their own rows. WHERE user_id = ? is the privacy gate; workspace membership is defense-in-depth. |
| DELETE /api/v1/feedback | Caller removes their own row. |
| Eval pipeline | Reads all rows server-side (joins on trace_id). Not API-exposed. |
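The GET rule in the table can be modelled as a filter in which the user id comparison does the real work and the workspace check only narrows the result further. This is an illustrative in-memory sketch, not the server's actual query; the FeedbackRow shape and the listOwnFeedback function are assumptions made for the example.

```typescript
// Hypothetical row shape; the real table has more columns.
interface FeedbackRow {
  userId: string;
  workspaceId: string;
  messageId: string;
  signal: string;
}

// In-memory stand-in for the GET handler's query. The userId comparison
// is the privacy gate; the workspace check is defense-in-depth and can
// only shrink the result, never widen it.
function listOwnFeedback(
  rows: FeedbackRow[],
  callerId: string,
  callerWorkspaces: string[],
): FeedbackRow[] {
  return rows.filter(
    (row) =>
      row.userId === callerId &&
      callerWorkspaces.indexOf(row.workspaceId) !== -1,
  );
}
```

Note that a workspace owner gets no special branch here: sharing a workspace with another member never makes that member's rows visible.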
The frontend feedback store (stores/feedback-store.ts) persists per-user state to localStorage and reconciles via the API on submit. A 4xx/5xx response or transport rejection rolls back the optimistic flip so the local state never lies about a row that doesn’t exist on the server.
Trace correlation
Every POST accepts an optional trace_id. The intended end state is: the orchestrator stamps the active OTel trace id onto each assistant-message WebSocket event, the frontend lifts it onto ChatTurn.metadata.trace_id, and the feedback store passes it into the POST payload, so each feedback row lands indexed for WHERE trace_id = ? queries:
message_feedback.trace_id → all signals filed against that conversation.
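The last hop of that chain, where the feedback store copies the trace id from the turn into the POST payload, could look roughly like the sketch below. ChatTurn.metadata.trace_id and the four payload field names come from this page; the ChatTurn.id field, the FeedbackPayload type, and buildFeedbackPayload are hypothetical names introduced for illustration.

```typescript
// Assumed minimal shape of a chat turn; only metadata.trace_id is
// documented on this page, `id` is a placeholder for the message id.
interface ChatTurn {
  id: string;
  metadata: { trace_id?: string };
}

interface FeedbackPayload {
  message_id: string;
  signal: string;
  chat_id?: string;
  trace_id?: string;
}

// Builds the POST body. Optional fields are left out entirely rather
// than sent as null or "", so a turn without a trace still produces a
// valid payload.
function buildFeedbackPayload(
  turn: ChatTurn,
  signal: string,
  chatId?: string,
): FeedbackPayload {
  const payload: FeedbackPayload = { message_id: turn.id, signal };
  if (chatId) payload.chat_id = chatId;
  if (turn.metadata.trace_id) payload.trace_id = turn.metadata.trace_id;
  return payload;
}
```

Omitting the key, instead of sending an empty string, keeps rows without a trace out of any index that is restricted to non-null trace ids.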
End-to-end as of PR #450. The orchestrator → WebSocket → ChatTurn propagation is wired: internal/chatbridge/bridge.go calls telemetry.ResolveTrace(ctx) and stamps trace_id onto the "done" event metadata; hooks/use-chat.ts handleDoneEvent lifts it onto ChatTurn.metadata.trace_id; the feedback POST consumes it. New feedback rows land with trace_id populated whenever an OTel provider is configured. When no telemetry provider is configured (SpanContextFromContext returns invalid), the field is omitted: rows still land, just without the trace anchor, and the partial index WHERE trace_id IS NOT NULL keeps the lookup path cheap either way.
UI flow
The chat UI surfaces the signals through <TurnFeedbackActions> in components/features/chat/assistant-turn.tsx. Behaviour:
- User clicks thumb → optimistic state flip in the zustand store → background POST /api/v1/feedback with message_id, chat_id, trace_id, signal.
- On res.ok → state is the truth. On 4xx/5xx or network reject → roll back; user can retry.
- User clicks the same thumb again → DELETE /api/v1/feedback?message_id=...&signal=... first, then clear local state on success. A failed delete keeps the local state pointing at “submitted” so a refresh reconciles back to truth.
Folding feedback state into useChat would couple two unrelated concerns. The trade-off is a one-frame flicker on flaky networks; the upside is that the chat path stays unaware of feedback wiring.
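The optimistic flip and rollback in the UI flow can be expressed as two pure state transitions, independent of zustand or of any network code. This is a sketch under stated assumptions: FeedbackState, PendingFlip, flip, and settle are hypothetical names, and the real store may track more than one signal per message.

```typescript
type Thumb = "helpful" | "not_helpful";

// Hypothetical local state: one thumb (or none) per message id.
type FeedbackState = { readonly [messageId: string]: Thumb | undefined };

// What must be remembered in order to undo a flip.
interface PendingFlip {
  messageId: string;
  previous: Thumb | undefined;
}

// Step 1: apply the click immediately, before the POST is sent.
function flip(
  state: FeedbackState,
  messageId: string,
  signal: Thumb,
): { state: FeedbackState; pending: PendingFlip } {
  const pending: PendingFlip = { messageId, previous: state[messageId] };
  return { state: { ...state, [messageId]: signal }, pending };
}

// Step 2: once the POST settles, keep the flip on success or restore
// the previous value on a 4xx/5xx response or transport rejection.
function settle(
  state: FeedbackState,
  pending: PendingFlip,
  ok: boolean,
): FeedbackState {
  if (ok) return state;
  return { ...state, [pending.messageId]: pending.previous };
}
```

A caller would run flip on click, send the POST in the background, and pass the outcome to settle. A rejected promise is treated the same as a non-OK response, so both failure modes share one rollback path.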
API contract
See the Feedback API reference for the full endpoint catalog.
Workspace re-anchoring
The chat_id parameter is optional on POST so eval widgets and CLI fallbacks without a chat context can still submit. Without it the server falls back to the caller’s most-recent workspace (ORDER BY workspace_members.created_at DESC LIMIT 1).
A later POST against the same (message_id, user_id, signal) tuple that does carry chat_id re-anchors the row to the correct workspace: the UPSERT clause sets workspace_id = excluded.workspace_id. Without that assignment, a user with multi-workspace membership who first POSTed from a widget (fallback to workspace A) and later POSTed from their real chat (workspace B) would have the row stuck in A, and eval queries scoped to workspace B would miss it.
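The effect of the re-anchoring assignment can be reproduced with a small in-memory model of the UPSERT. This is an illustration of the semantics only, not the server's SQL; AnchoredRow, rowKey, and upsertFeedback are hypothetical names, and the real statement may update other columns as well.

```typescript
// Hypothetical subset of a message_feedback row.
interface AnchoredRow {
  messageId: string;
  userId: string;
  signal: string;
  workspaceId: string;
}

// One key per (message_id, user_id, signal) tuple, standing in for
// the UNIQUE constraint.
function rowKey(messageId: string, userId: string, signal: string): string {
  return [messageId, userId, signal].join("\u0000");
}

// Insert when the tuple is new; on conflict, overwrite workspaceId with
// the incoming value (the in-memory analogue of
// workspace_id = excluded.workspace_id).
function upsertFeedback(
  table: Record<string, AnchoredRow>,
  incoming: AnchoredRow,
): void {
  const key = rowKey(incoming.messageId, incoming.userId, incoming.signal);
  const existing = table[key];
  if (existing === undefined) {
    table[key] = { ...incoming };
    return;
  }
  existing.workspaceId = incoming.workspaceId;
}
```

If the conflict branch left workspaceId untouched, the second POST in the widget-then-chat scenario would succeed but the row would stay in workspace A.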
Limits
| Field | Limit | Why |
|---|---|---|
| reason | 4096 chars | “edit” payloads carry the user’s replacement text; 4 KB is generous without becoming a storage hazard. |
| message_id, chat_id, trace_id | 256 chars each | About 8× the longest realistic id (an OTel trace_id is 32 hex chars; Crewship CUIDs are ~25). Caps a hostile client that POSTs a 10 MB trace_id. |
| Rows per (message_id, user_id, signal) tuple | 1 | UNIQUE constraint. Re-POSTs UPSERT in place. |
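A client can check the length limits from the table before submitting, which avoids a round trip for payloads the server would reject anyway. The sketch below is an assumption-laden example: validateLimits and FeedbackInput are hypothetical, and lengths are measured in UTF-16 code units because the table does not say how "chars" are counted.

```typescript
const MAX_REASON_CHARS = 4096;
const MAX_ID_CHARS = 256;

// Hypothetical input shape using the field names from the table.
interface FeedbackInput {
  message_id: string;
  chat_id?: string;
  trace_id?: string;
  reason?: string;
}

// Returns one message per violated limit; an empty array means the
// input fits. String.length counts UTF-16 code units, which is an
// assumption about how the server counts "chars".
function validateLimits(input: FeedbackInput): string[] {
  const errors: string[] = [];
  const ids: Array<[string, string | undefined]> = [
    ["message_id", input.message_id],
    ["chat_id", input.chat_id],
    ["trace_id", input.trace_id],
  ];
  ids.forEach(([name, value]) => {
    if (value !== undefined && value.length > MAX_ID_CHARS) {
      errors.push(name + " exceeds " + MAX_ID_CHARS + " chars");
    }
  });
  if (input.reason !== undefined && input.reason.length > MAX_REASON_CHARS) {
    errors.push("reason exceeds " + MAX_REASON_CHARS + " chars");
  }
  return errors;
}
```

Client-side checks like this are a convenience only; the server-side limits remain the enforcement point.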
Limitations
- No server-side hydration of the UI state on initial load: the frontend reads localStorage, so a fresh browser starts with empty state. The server is the truth, and GET /api/v1/feedback?message_id=... exposes it for any UI that wants to rehydrate.
- No “retracted” signal value: DELETE removes the row entirely. The eval pipeline distinguishes “row exists” from “row doesn’t exist,” so a once-thumbed-then-untoggled message reads as “no signal” rather than “explicitly retracted.” Future iterations may add a retracted value if the distinction starts mattering.
- message_id ownership is not enforced: messages live in JSONL files (chats.jsonl_path), not a SQL table, so a per-POST file read would slow the path too much. Workspace membership is the trust boundary; cross-tenant probes are still blocked.
Related
- Online eval sampler: consumes feedback rows by trace_id for drift detection.
- Tracing: the trace_id source.
- Chat sessions: sibling reactions store for open emoji.