Open
Labels
area/llm-usability (LLM generation reliability and token efficiency), priority/high (High priority), type/task (Implementation task)
Description
Goal
Raise LLM generation reliability from the current 0% verify and semantic success rates to usable levels.
Scope
- Introduce a two-stage generation flow: initial generation + automatic verify-guided repair loop.
- Feed structured verifier errors back into subsequent attempts.
- Cap attempts and track turns-to-pass.
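The flow described above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `VerifierError`, `RepairResult`, `build_prompt`, `generate_with_repair`, and the `generate`/`verify` callables are all hypothetical names, and the real verifier's error format may differ.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class VerifierError:
    # Hypothetical structured error shape; the real verifier output may differ.
    code: str
    message: str
    line: Optional[int] = None


@dataclass
class RepairResult:
    passed: bool
    turns: int    # attempts used; equals turns-to-pass when passed is True
    output: str   # last candidate produced


def build_prompt(task: str, previous: Optional[str],
                 errors: List[VerifierError]) -> str:
    """Return the bare task on the first turn, then task + attempt + errors."""
    if previous is None:
        return task
    lines = [task, "", "Previous attempt:", previous, "", "Verifier errors:"]
    for e in errors:
        loc = f" (line {e.line})" if e.line is not None else ""
        lines.append(f"- [{e.code}]{loc} {e.message}")
    lines += ["", "Fix these errors and return the full corrected output."]
    return "\n".join(lines)


def generate_with_repair(
    task: str,
    generate: Callable[[str], str],                # assumed: prompt -> candidate
    verify: Callable[[str], List[VerifierError]],  # assumed: empty list = pass
    max_attempts: int = 3,
) -> RepairResult:
    """Stage 1 generates; later turns repair using the verifier's errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    previous: Optional[str] = None
    errors: List[VerifierError] = []
    for turn in range(1, max_attempts + 1):
        candidate = generate(build_prompt(task, previous, errors))
        errors = verify(candidate)
        if not errors:
            return RepairResult(passed=True, turns=turn, output=candidate)
        previous = candidate
    # Attempt cap reached without a passing candidate.
    return RepairResult(passed=False, turns=max_attempts, output=previous or "")
```

Keeping `generate` and `verify` as injected callables lets the loop be tested with stubs and wired to either the bench script or an adapter layer without changing the loop itself.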
Target
- Verify success >= 70% and semantic success >= 50% on the current 6-task corpus for the baseline local model.
Acceptance
- New repair loop integrated into tests/llm_usability_bench.sh (or adapter layer).
- Report includes pass-at-k metrics.
- Updated docs/wiki with before/after comparison.
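One possible reading of the pass-at-k requirement, given that the loop tracks turns-to-pass, is "the fraction of tasks that pass within k sequential attempts". The sketch below assumes that reading (it is not the sampling-based pass@k estimator used in some code-generation benchmarks); `pass_at_k` and `pass_at_k_report` are hypothetical helper names, and the sample values in the usage note are made up purely for illustration.

```python
from typing import Dict, List, Optional


def pass_at_k(turns_to_pass: List[Optional[int]], k: int) -> float:
    """Fraction of tasks that passed within k attempts.

    Each entry is the turn on which a task first passed, or None if it
    never passed before the attempt cap.
    """
    if not turns_to_pass:
        return 0.0
    passed = sum(1 for t in turns_to_pass if t is not None and t <= k)
    return passed / len(turns_to_pass)


def pass_at_k_report(turns_to_pass: List[Optional[int]],
                     max_attempts: int) -> Dict[str, float]:
    """Build a pass@1 .. pass@max_attempts table for the report."""
    return {
        f"pass@{k}": pass_at_k(turns_to_pass, k)
        for k in range(1, max_attempts + 1)
    }
```

For example, with illustrative (not measured) turns-to-pass values of `[1, 2, None, 3, 2, None]` for six tasks, `pass_at_k_report(values, 3)` gives one entry per k, with `pass@2` equal to 0.5 because three of the six tasks passed within two attempts.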