
feat: investigation feedback mechanism + fix DP checklist auto-complete #101

Merged: jacoblee-io merged 3 commits into main from fix/dp-checklist-auto-complete on Mar 13, 2026

@LikiosSedo LikiosSedo commented Mar 13, 2026

Summary

  • Investigation feedback mechanism: Add a low-friction feedback loop that lets SREs rate investigation accuracy (confirmed/corrected/rejected). Feedback modifies retrieval weighting so correct diagnoses get boosted and wrong ones get suppressed in future investigations.
  • DP checklist auto-complete fix: Fix checklist card stuck at 2/4 after investigation completes when agent doesn't explicitly call manage_checklist for remaining phases.

Changes

Investigation Feedback (commit 1 + 2)

  • Add FeedbackStatus type and FEEDBACK_SIGNALS constant (1.5 / 0.5 / 0.1)
  • Add 3 column migrations: feedback_signal, feedback_note, feedback_at
  • Add getInvestigationById() and updateInvestigationFeedback() to indexer
  • Apply feedback weighting in pattern aggregation, keyword search, and validated hypothesis extraction
  • Plumb investigationId through InvestigationResult → tool details
  • Create investigation_feedback tool and register in agent-factory
  • Add Phase 5 feedback guidance to deep-investigation SKILL.md
  • Add WebUI feedback strip (Correct / Partial / Wrong) to InvestigationCard
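
As a rough illustration of how the feedback weighting could work, the sketch below ranks investigations by a base relevance score multiplied by a feedback signal. Only the names `FeedbackStatus` and `FEEDBACK_SIGNALS` and the values 1.5 / 0.5 / 0.1 come from this PR. The status-to-value mapping, the neutral weight of 1.0 for unrated investigations, and the `ScoredInvestigation`, `feedbackWeight`, and `rankByFeedback` names are assumptions made for the example, not the actual indexer code.

```typescript
// Names FeedbackStatus / FEEDBACK_SIGNALS and the three values are from the PR;
// everything else in this block is a hypothetical sketch.
type FeedbackStatus = "confirmed" | "corrected" | "rejected";

// Assumed mapping, following the order in which statuses and values are listed.
const FEEDBACK_SIGNALS: Record<FeedbackStatus, number> = {
  confirmed: 1.5,
  corrected: 0.5,
  rejected: 0.1,
};

// Hypothetical shape of a retrieval candidate.
interface ScoredInvestigation {
  id: string;
  baseScore: number; // e.g. keyword-match relevance before weighting
  feedback?: FeedbackStatus; // undefined until an SRE rates the investigation
}

// Assumption: investigations without feedback keep a neutral weight of 1.0.
function feedbackWeight(status?: FeedbackStatus): number {
  return status === undefined ? 1.0 : FEEDBACK_SIGNALS[status];
}

// Sort a copy of the candidates by feedback-weighted score, highest first.
function rankByFeedback(items: ScoredInvestigation[]): ScoredInvestigation[] {
  return [...items].sort(
    (a, b) =>
      b.baseScore * feedbackWeight(b.feedback) -
      a.baseScore * feedbackWeight(a.feedback),
  );
}
```

With equal base scores, a confirmed investigation would sort ahead of an unrated one, and a rejected one would sort last, which is the behavior the last test-plan item checks for.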

Review fixes (commit 2)

  • extractValidatedHypotheses: separate rawCount (display) from weightedScore (sorting) to avoid inflated/deflated occurrence counts
  • InvestigationCard: extract handleFeedbackSubmit() to deduplicate Enter-key and Submit button logic

DP Checklist (commit 3)

  • Relax prompt_done safety-net guard: only block auto-complete when hypotheses phase isn't done (waiting for user confirmation), not when deep_search isn't done

Test plan

  • tsc --noEmit passes
  • Run deep investigation → agent presents findings → say "looks right" → verify investigation_feedback tool called → check SQLite feedback_signal updated
  • Complete investigation in WebUI → click feedback button on InvestigationCard → verify metadata persisted + agent processes feedback
  • Verify DP checklist auto-clears after investigation completes (no stuck 2/4)
  • Create 3 investigations, reject one → verify rejected ranks lower in searchInvestigations and getInvestigationPatterns

Add a low-friction feedback loop that lets SREs rate investigation accuracy.
Feedback modifies retrieval weighting so correct diagnoses get boosted and
wrong ones get suppressed in future investigations.

- Add FeedbackStatus type and FEEDBACK_SIGNALS constant (1.5/0.5/0.1)
- Add 3 column migrations: feedback_signal, feedback_note, feedback_at
- Add getInvestigationById() and updateInvestigationFeedback() to indexer
- Apply feedback weighting in pattern aggregation, keyword search, and
  validated hypothesis extraction
- Plumb investigationId through InvestigationResult → tool details
- Create investigation_feedback tool and register in agent-factory
- Add Phase 5 feedback guidance to deep-investigation SKILL.md
- Add WebUI feedback strip (Correct/Partial/Wrong) to InvestigationCard

@jacoblee-io jacoblee-io left a comment

Two distinct features in one PR — the checklist fix is clean, but the feedback mechanism has a few issues.

PR description only covers half the changes

The title/body describe the checklist auto-complete fix (~10 lines in usePilot.ts), but the bulk of the PR is the investigation feedback mechanism (~300 lines across 10 files: new tool, schema migration, indexer methods, engine weighting, UI component, skill doc). These are independent features and would be easier to review as separate PRs. At minimum, the PR description should cover the feedback mechanism.


Checklist auto-complete fix ✅

The guard condition change from deep_search.status === 'done' to hypotheses.status !== 'done' && hypotheses.status !== 'skipped' is correct. The old condition created a deadlock because deep_search was itself the stuck item.


Investigation feedback mechanism — issues

1. Bug: extractValidatedHypotheses count semantics

count changed from integer (number of occurrences) to float (weighted by feedbackSignal). The display label still says "validated N times":

`- "${displayText}" (validated ${Math.round(count)} time${...})`

Example: a hypothesis validated in 3 confirmed investigations:

  • count = 1.5 + 1.5 + 1.5 = 4.5 → "validated 5 times" (was actually 3)

A rejected investigation's hypothesis:

  • count = 0.1 → "validated 0 times"

Suggestion: keep count as a raw integer for display, apply feedbackSignal weighting separately in the sort key:

existing.count++;
existing.weightedScore = (existing.weightedScore ?? 0) + feedbackSignal;
// Sort by weightedScore, display raw count
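
Spelled out a little further, the suggestion might look like the sketch below. The `rawCount` and `weightedScore` field names match the later review fix; the `Occurrence` input shape and the `aggregateHypotheses` and `formatHypothesis` helpers are hypothetical and only illustrate the split between the display count and the sort key.

```typescript
interface HypothesisStat {
  text: string;
  rawCount: number; // integer number of occurrences, used for display
  weightedScore: number; // sum of feedback signals, used for sorting only
}

// Hypothetical input: one entry per validated occurrence of a hypothesis.
interface Occurrence {
  text: string;
  feedbackSignal: number; // e.g. 1.5, 0.5, or 0.1
}

function aggregateHypotheses(occurrences: Occurrence[]): HypothesisStat[] {
  const byText = new Map<string, HypothesisStat>();
  for (const { text, feedbackSignal } of occurrences) {
    const existing =
      byText.get(text) ?? { text, rawCount: 0, weightedScore: 0 };
    existing.rawCount += 1; // display count stays an integer
    existing.weightedScore += feedbackSignal; // weighting affects ordering only
    byText.set(text, existing);
  }
  // Highest feedback-weighted score first.
  return Array.from(byText.values()).sort(
    (a, b) => b.weightedScore - a.weightedScore,
  );
}

function formatHypothesis(stat: HypothesisStat): string {
  const plural = stat.rawCount === 1 ? "" : "s";
  return `- "${stat.text}" (validated ${stat.rawCount} time${plural})`;
}
```

Under this split, a hypothesis seen in three confirmed investigations is labelled "validated 3 times" while still sorting with a score of 4.5, and a single rejected occurrence reads "validated 1 time" rather than rounding down to 0.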

2. Nit: feedback via chat message is indirect

The UI sends feedback as a user chat message ([investigation feedback: confirmed] investigationId=...) — the agent must interpret it and call investigation_feedback. If the agent misinterprets or responds conversationally, the DB update doesn't happen (only the UI metadata is saved via updateMessageMeta).

This follows the existing HypothesesCard pattern so it's architecturally consistent, but worth noting that the updateMessageMeta call creates a false sense of persistence — the real weight adjustment depends on the LLM cooperating.

3. Nit: duplicate submit logic in InvestigationCard

The Enter-key handler and Submit button onClick contain identical logic (~6 lines each). Consider extracting to a handleSubmit() function.


Everything else looks good — schema migration is safe (ALTER TABLE ADD COLUMN with existence checks), getInvestigationById/updateInvestigationFeedback use parameterized queries, FEEDBACK_SIGNALS constants are well-chosen, and the TypeBox tool auto-adapts for claude-sdk brain via adaptToolsForSdk.
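
For readers unfamiliar with the "ALTER TABLE ADD COLUMN with existence checks" pattern, here is a minimal sketch. It only builds the SQL statements and leaves execution to whatever SQLite client is in use. The three column names come from the PR; the column types, the `investigations` table name used in the usage note, and the `pendingFeedbackMigrations` helper are assumptions.

```typescript
// Column names are from the PR; the SQLite types are assumed.
// No NOT NULL constraint, so the new columns are nullable on existing rows.
const FEEDBACK_COLUMNS: ReadonlyArray<{ name: string; type: string }> = [
  { name: "feedback_signal", type: "REAL" },
  { name: "feedback_note", type: "TEXT" },
  { name: "feedback_at", type: "TEXT" },
];

// existingColumns would typically be read via `PRAGMA table_info(<table>)`.
// Returns only the ALTER statements for columns that are still missing, so
// running the migration a second time yields an empty list.
function pendingFeedbackMigrations(
  table: string,
  existingColumns: string[],
): string[] {
  const present = new Set(existingColumns);
  return FEEDBACK_COLUMNS.filter((col) => !present.has(col.name)).map(
    (col) => `ALTER TABLE ${table} ADD COLUMN ${col.name} ${col.type}`,
  );
}
```

Calling `pendingFeedbackMigrations("investigations", ["id", "feedback_note"])` would produce statements for `feedback_signal` and `feedback_at` only, which is what makes the migration safe to re-run.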


@chent1996 chent1996 left a comment

Two clean changes:

  1. Investigation feedback: Schema migration safe (3 nullable columns with existence check). Feedback signals (1.5/0.5/0.1) applied consistently across all 3 retrieval paths (pattern aggregation, keyword search, validated hypothesis extraction). Tool validates investigation existence before update. investigationId plumbing from engine → tool result → details → frontend is complete.

  2. DP checklist auto-complete: Guard condition change from deep_search to hypotheses phase is logically correct — only block auto-complete when waiting for user hypothesis confirmation.

Minor observation: frontend feedback is sent via sendMessage() (chat message to agent), with optimistic UI update via updateMessageMeta. If the agent doesn't act on the message, the investigation DB won't be updated but UI shows success. Acceptable trade-off for conversational UX. Non-blocking.

LGTM.

@LikiosSedo LikiosSedo force-pushed the fix/dp-checklist-auto-complete branch from 8cdaa89 to 0ca1914 on March 13, 2026 08:27
- extractValidatedHypotheses: separate rawCount (display) from
  weightedScore (sorting) to avoid inflated/deflated occurrence counts
- InvestigationCard: extract handleFeedbackSubmit() to deduplicate
  Enter-key and Submit button logic
The safety-net in prompt_done required deep_search checklist item to be
marked 'done' before auto-completing remaining items. But if the agent
presented findings without explicitly calling manage_checklist to mark
deep_search as done, the checklist card stayed stuck at 2/4 indefinitely.

Relax the guard condition: only block auto-complete if the hypotheses
phase isn't done yet (agent waiting for user confirmation). Once hypotheses
are done, any remaining in_progress/pending items get auto-completed
when the agent finishes responding.
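
The relaxed guard described above can be sketched roughly as follows. The status values and the `hypotheses` / `deep_search` item ids come from this PR; the `ChecklistItem` shape, the `autoCompleteOnPromptDone` name, and the choice not to block when no hypotheses item exists are assumptions, and the real logic in usePilot.ts may differ.

```typescript
type ChecklistStatus = "pending" | "in_progress" | "done" | "skipped";

// Hypothetical shape of a checklist entry.
interface ChecklistItem {
  id: string; // e.g. "hypotheses", "deep_search"
  status: ChecklistStatus;
}

// Returns the checklist as it should look once the agent finishes responding.
function autoCompleteOnPromptDone(items: ChecklistItem[]): ChecklistItem[] {
  const hypotheses = items.find((item) => item.id === "hypotheses");

  // Block only while the hypotheses phase is still open, i.e. the agent is
  // waiting for the user to confirm. deep_search is no longer consulted.
  const waitingForUser =
    hypotheses !== undefined &&
    hypotheses.status !== "done" &&
    hypotheses.status !== "skipped";
  if (waitingForUser) return items;

  // Otherwise mark every remaining pending / in_progress item as done.
  return items.map((item): ChecklistItem =>
    item.status === "pending" || item.status === "in_progress"
      ? { ...item, status: "done" }
      : item,
  );
}
```

In the stuck 2/4 case (hypotheses done, deep_search still in_progress), this version completes the remaining items, whereas a guard that required deep_search to be done would never fire.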
@LikiosSedo LikiosSedo force-pushed the fix/dp-checklist-auto-complete branch from 0ca1914 to e5e2a66 on March 13, 2026 08:31
@LikiosSedo LikiosSedo changed the title from "fix: auto-complete DP checklist when agent ends response" to "feat: investigation feedback mechanism + fix DP checklist auto-complete" on Mar 13, 2026

@jacoblee-io jacoblee-io left a comment

Previous issues resolved:

  1. Count semantics ✅ — Split into rawCount (integer, display) and weightedScore (float, sorting). "validated N times" now reflects actual occurrences. Sorting uses feedback-weighted score as intended.
  2. Duplicate submit logic ✅ — Extracted to handleFeedbackSubmit().

LGTM.

@jacoblee-io jacoblee-io merged commit 6f41f2d into main Mar 13, 2026
3 checks passed
@jacoblee-io jacoblee-io deleted the fix/dp-checklist-auto-complete branch March 13, 2026 08:39
3 participants