Labels: enhancement (New feature or request), needs-human (Requires human judgment or design input before implementation)
Description
Problem
The reflections loop has a structural gap: agents can log a decision or prediction at bet start, but the validation and resolution counterparts happen during PR review — a phase owned by sensei (the orchestrator), not an agent.
Currently there is no enforced or even suggested path for sensei to write validation or resolution reflection records when reviewing a PR. This means:
- `reflections.jsonl` files exist but contain zero `validation` or `resolution` entries
- `BeltCalculator.readRunMetrics()` reads `predictionOutcomePairs` from `type: 'validation'` records, so the count is always 0
- The calibration loop is broken: predictions are made, outcomes are observed, but nothing closes the loop
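To make the broken loop concrete, here is a hypothetical sketch of the pairing logic that a metric like `predictionOutcomePairs` depends on. The `Reflection` record shape and field names below are assumptions for illustration, not the project's actual schema:

```typescript
// Assumed record shape for entries in reflections.jsonl (hypothetical).
type Reflection =
  | { type: 'prediction'; id: string; runId: string; claim: string }
  | { type: 'validation'; runId: string; predictionId: string; accuracy: number }
  | { type: 'resolution'; runId: string; frictionId: string };

// Count validation records that point back at a logged prediction.
// If no validation records are ever written, this is always 0,
// no matter how many predictions agents log.
function countPredictionOutcomePairs(records: Reflection[]): number {
  const predictionIds = new Set<string>();
  for (const r of records) {
    if (r.type === 'prediction') predictionIds.add(r.id);
  }
  let pairs = 0;
  for (const r of records) {
    if (r.type === 'validation' && predictionIds.has(r.predictionId)) pairs++;
  }
  return pairs;
}
```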
Example scenario
- Agent bets that "refactoring X will reduce test time by 20%" and logs a `prediction` observation
- Agent completes the work; a PR is opened
- Sensei reviews the PR; CI shows test time reduced by 18%
- Gap: there is no mechanism or reminder for sensei to log a `validation` reflection linking the outcome back to the prediction
- The belt calculator sees 0 prediction-outcome pairs; calibration accuracy stays at 0
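The scenario above could produce a record pair along these lines (field names and values are illustrative assumptions, not the project's actual schema):

```jsonl
{"type":"prediction","id":"p1","runId":"run-42","claim":"refactoring X will reduce test time by 20%"}
{"type":"validation","runId":"run-42","predictionId":"p1","accuracy":0.9,"observed":"test time reduced by 18%"}
```

Today only the first line is ever written; the second is the missing half of the loop.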
Desired behavior
- When sensei runs `kata kiai complete <run-id> --success` (or `--failure`), the CLI should prompt (or accept flags) for:
  - Was there an active prediction for this run? Did it validate?
  - Were any frictions resolved? (resolution record)
- Alternatively, `kata kansatsu record validation --run-id <id> --prediction-id <id> --accuracy 0.9` should be surfaced in the `kata kiai complete` flow
- Agent context should remind agents to log predictions at start AND remind sensei to close the loop at completion
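One possible shape for the prompted flow (illustrative only; the prompt wording and the run id are made up, and this behavior is what the issue proposes, not what exists today):

```text
$ kata kiai complete run-42 --success
? This run had an active prediction: "refactoring X will reduce test time by 20%"
? Did it validate? (y/n) y
? Observed accuracy (0-1): 0.9
validation reflection written to reflections.jsonl
```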
Notes
- This is the same gap that causes `predictionOutcomePairs` to always be 0 in belt snapshots
- Related: "bug: reflections.jsonl and decision-outcomes.jsonl never written — belt metrics broken" #365 (reflections write path) and "design: agent execution model — document current reality and define target architecture" #371 (agent execution model)
- The fix is partly CLI UX (prompt at `kiai complete`) and partly documentation/agent-context (remind agents what to log)
Acceptance criteria
- `kata kiai complete` flow includes a validation prompt or a `--validate-prediction` flag
- Agent context section explicitly states which reflections sensei owns vs which the agent owns
- At least one `validation` record is written per completed run that had a prediction