Description
Summary
Implement a feedback processing pipeline that uses analyst corrections to improve the performance of the triage, correlation, and analysis agents. Build automated retraining triggers, an A/B testing framework for model updates, and feedback-driven playbook refinement.
Key Deliverables
Feedback Processor (agents/learning/feedback_processor.py): Asynchronously process analyst feedback to generate training examples. Convert false-positive/true-positive (FP/TP) corrections into labeled datasets, extract high-confidence analyst decisions for ReAct prompt refinement, identify systematic mispredictions that require rule updates, and aggregate feedback metrics (inter-analyst agreement, confidence calibration errors).
Model Retraining Pipeline: Automatically trigger retraining when any of the following holds: 100+ new feedback samples have accumulated, the weekly scheduled retraining is due, the FP rate exceeds the threshold (>15%), or a critical misclassification is detected. Shadow-test new models against the current production model (2-week observation period, 95% confidence interval for performance improvement).
Playbook Learning (agents/response/playbook_learner.py): Track the effectiveness of response actions using analyst feedback. When analysts modify recommended actions, capture the new action sequences, the conditions that triggered the modifications, and the success/failure outcomes. Auto-suggest playbook updates, subject to manager approval.
Performance Dashboard: Add a /api/v1/learning/metrics endpoint that reports model performance trends (precision/recall by week), feedback volume and quality scores, top analyst contributors, FP reduction rate, automated vs. manual decision distribution, and A/B test results for model candidates.
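The retraining triggers listed under Model Retraining Pipeline could be evaluated along these lines. This is only a minimal sketch: `RetrainState`, `retrain_reasons`, and the reason strings are hypothetical names, not part of the existing codebase, and the sketch assumes the FP rate is available as a fraction and that "weekly" means seven days since the last retraining run.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RetrainState:
    """Hypothetical snapshot of the inputs the triggers depend on."""
    new_feedback_samples: int         # feedback samples since the last retraining
    last_retrained_at: datetime       # when the last retraining run finished
    fp_rate: float                    # current false-positive rate, as a fraction in [0, 1]
    critical_misclassification: bool  # set when a critical miss has been flagged


def retrain_reasons(
    state: RetrainState,
    now: datetime,
    min_samples: int = 100,
    max_age: timedelta = timedelta(weeks=1),
    fp_threshold: float = 0.15,
) -> list[str]:
    """Return the names of all triggers that currently fire (empty list = no retraining)."""
    reasons = []
    if state.new_feedback_samples >= min_samples:
        reasons.append("feedback_volume")
    if now - state.last_retrained_at >= max_age:
        reasons.append("weekly_schedule")
    if state.fp_rate > fp_threshold:
        reasons.append("fp_rate")
    if state.critical_misclassification:
        reasons.append("critical_misclassification")
    return reasons


# Example: 120 new samples, retrained two days ago, FP rate at 10%.
state = RetrainState(120, datetime(2024, 1, 1), 0.10, False)
print(retrain_reasons(state, now=datetime(2024, 1, 3)))  # ['feedback_volume']
```

Returning the list of reasons rather than a single boolean keeps the cause of each retraining run available for logging and for the metrics endpoint.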
Acceptance Criteria
Feedback is processed within 5 minutes of analyst submission
Retraining pipeline completes a full cycle (data prep, train, test, deploy) in under 4 hours
A/B testing safely compares model performance without impacting production decisions
Playbook suggestions are generated only for actions with a modification rate of 20% or higher across at least 10 occurrences
Learning metrics dashboard updates daily with trend analysis
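The playbook suggestion criterion above can be sketched as a simple gate over analyst feedback. The names `ActionFeedback` and `suggest_playbook_updates` are hypothetical, and the sketch assumes that "occurrences" counts how many times an action was recommended (not how many times it was modified); if the intended meaning is the latter, the occurrence check would compare the modification count instead.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class ActionFeedback(NamedTuple):
    """Hypothetical record of one recommended action and the analyst's response."""
    action: str     # identifier of the recommended action
    modified: bool  # True if the analyst changed the recommendation


def suggest_playbook_updates(
    feedback: Iterable[ActionFeedback],
    min_rate: float = 0.20,
    min_occurrences: int = 10,
) -> dict[str, float]:
    """Map each action that qualifies for a suggestion to its modification rate."""
    totals: Counter = Counter()
    modified: Counter = Counter()
    for item in feedback:
        totals[item.action] += 1
        if item.modified:
            modified[item.action] += 1

    suggestions = {}
    for action, total in totals.items():
        rate = modified[action] / total
        # Assumption: both the sample-size and the rate condition must hold.
        if total >= min_occurrences and rate >= min_rate:
            suggestions[action] = rate
    return suggestions


# Example: "isolate_host" was recommended 10 times and modified 3 times (30%).
sample = [ActionFeedback("isolate_host", i < 3) for i in range(10)]
print(suggest_playbook_updates(sample))  # {'isolate_host': 0.3}
```

Any suggestion produced this way would still go through the manager approval step described under Playbook Learning before a playbook changes.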