Sprint 9 Backlog — AI Trust, Information Security & Compliance Automation
Sprint: 9 (Apr 1 – Apr 14, 2026)
Target: v0.9.0
Status: Complete (Implementation 2026-03-28)
Related: Sprint 8 Retrospective | Execution Plan | Security Audit Findings (below)
Branch Policy: Production work branches from development, not main
Sprint Goals
AI Trust Framework — Validate LLM outputs structurally, detect potential hallucinations, and calibrate confidence scores so analysts can trust what they see
Information Security Hardening — Close the critical gaps found in the Sprint 8 security audit: CORS, rate limiting, prompt injection, input validation, security headers
Compliance Automation — Move from manual compliance evidence to CI-generated artifacts that prove regulatory compliance on every build
Recursive Completeness Check — Final sprint step verifying all tasks implemented, tested, documented, and gap-free
Security Audit Findings (Sprint 8 Audit Context)
The Sprint 8 completeness audit identified these security gaps in cddbs-prod:
| Finding | Severity | Current State |
|---|---|---|
| No rate limiting on any endpoint | HIGH | Expensive operations (Gemini calls) exposed without throttling |
| CORS wildcard origins + credentials | HIGH | allow_origins="*" with allow_credentials=True (invalid per spec) |
| Prompt injection via f-string interpolation | HIGH | User-provided topic/outlet inserted directly into Gemini prompts |
| Webhook URL accepts any string (SSRF) | HIGH | No URL format validation, no internal IP blocking |
| No authentication on any endpoint | CRITICAL | All endpoints publicly accessible |
| Missing security headers | MEDIUM | No CSP, HSTS, X-Frame-Options, X-Content-Type-Options |
| API keys accepted in request bodies | MEDIUM-HIGH | Keys can be logged/persisted in Report.data JSON |
| Error details exposed in health endpoint | MEDIUM | Database error strings returned to client |
Sprint 9 addresses all HIGH/CRITICAL items except authentication (deferred to Sprint 10 — requires JWT, role model, session management, UI changes).
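The webhook SSRF finding above (no URL format validation, no internal IP blocking) could be closed with a check along these lines. This is a minimal stdlib sketch, not the project's implementation: the name validate_webhook_url, the https-only rule, and the injectable resolve parameter are assumptions made for illustration.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def _resolve(host: str) -> list[str]:
    """Default resolver: return every IP address the hostname maps to."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def _is_internal(ip_text: str) -> bool:
    """True for private, loopback, link-local and other non-public ranges."""
    ip = ipaddress.ip_address(ip_text)
    return (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_reserved or ip.is_multicast or ip.is_unspecified)


def validate_webhook_url(url: str, resolve=_resolve) -> list[str]:
    """Return a list of problems; an empty list means the URL passed."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return ["malformed URL"]
    if parts.scheme != "https":  # assumption: only https webhooks are allowed
        return ["scheme must be https"]
    host = parts.hostname
    if not host:
        return ["missing host"]
    try:
        addresses = [str(ipaddress.ip_address(host))]  # host is a literal IP
    except ValueError:
        try:
            addresses = resolve(host)  # host is a name: check what it resolves to
        except OSError:
            return ["host does not resolve"]
    if any(_is_internal(address) for address in addresses):
        return ["host resolves to an internal address"]
    return []
```

A check at registration time does not cover DNS rebinding; re-validating the resolved address at delivery time would be needed for that.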
Validates every Gemini JSON response against expected schema before DB commit; catches missing fields, wrong types, out-of-range values (e.g., divergence_score outside 0-100); returns ValidationResult with errors list
9.1.2
Hallucination heuristic: source cross-reference
L
For Topic Mode: compares outlet claims in Gemini response against article titles/snippets from SerpAPI; flags claims with no source match as "ungrounded"; stores grounding_score (0.0-1.0) per outlet result
9.1.3
Confidence calibration metrics
M
Track historical accuracy: compare Gemini's divergence_score predictions against human feedback (existing feedback system); compute calibration curve data; expose via GET /metrics/calibration endpoint
9.1.4
Output reproducibility check
S
For identical inputs (same topic, same articles), run Gemini twice with temperature=0; store reproducibility_score in TopicRun (0.0-1.0 Jaccard similarity of technique lists); log discrepancies
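The reproducibility_score in 9.1.4 is a Jaccard similarity over the technique lists of two runs. A minimal sketch, assuming techniques are plain strings compared case-insensitively; the function names and the choice to score two empty lists as 1.0 are illustrative assumptions.

```python
def _normalize(techniques: list[str]) -> set[str]:
    """Lowercase and trim so trivial formatting differences do not count."""
    return {t.strip().lower() for t in techniques if t.strip()}


def reproducibility_score(first_run: list[str], second_run: list[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two technique lists (0.0-1.0)."""
    a, b = _normalize(first_run), _normalize(second_run)
    if not a and not b:
        return 1.0  # assumption: two empty lists count as fully reproducible
    return len(a & b) / len(a | b)


def discrepancies(first_run: list[str], second_run: list[str]) -> list[str]:
    """Techniques that appear in only one of the two runs, for logging."""
    return sorted(_normalize(first_run) ^ _normalize(second_run))
```

For example, runs returning ["Loaded language", "Whataboutism"] and ["whataboutism", "Cherry picking"] share one technique out of three distinct ones, giving a score of 1/3.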
In TopicRunDetail, key_claims with no source match rendered with warning icon and "⚠ Ungrounded — no matching source found" annotation
9.2.3
Wire trust indicators into TopicRunDetail
S
TrustIndicator appears on each outlet card; grounding_score in outlet result API response
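The grounding_score surfaced here comes from title/snippet matching, not semantic comparison. One way such a heuristic could work is sketched below; the token-overlap rule, the 0.5 threshold, and the function names are assumptions, not the project's actual matching logic.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased word tokens longer than three characters (drops most filler words)."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) > 3}


def grounding_score(claims: list[str], sources: list[str],
                    threshold: float = 0.5) -> tuple[float, list[str]]:
    """Return (share of grounded claims, list of ungrounded claims).

    A claim counts as grounded when at least `threshold` of its tokens
    appear in a single source title or snippet.
    """
    if not claims:
        return 1.0, []  # assumption: nothing to ground means nothing ungrounded
    source_tokens = [_tokens(source) for source in sources]
    ungrounded = []
    for claim in claims:
        claim_tokens = _tokens(claim)
        best = 0.0
        if claim_tokens:
            best = max((len(claim_tokens & s) / len(claim_tokens)
                        for s in source_tokens), default=0.0)
        if best < threshold:
            ungrounded.append(claim)
    return 1 - len(ungrounded) / len(claims), ungrounded
```

Claims returned in the second element are the ones the UI would mark as ungrounded.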
P0 — Information Security Hardening
9.3 CORS Hardening
| # | Task | Effort | Acceptance Criteria |
|---|---|---|---|
| 9.3.1 | Fix CORS configuration | S | ALLOWED_ORIGINS defaults to specific domains (Render URL, Cloudflare URL, localhost:5173); remove wildcard; allow_methods=["GET", "POST", "PUT", "DELETE", "OPTIONS"]; allow_headers restricted to Content-Type, Authorization |
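The CORS criteria above could be captured in a small settings helper. This is a sketch under assumptions: ALLOWED_ORIGINS is taken to be a comma-separated environment variable, the helper names are hypothetical, and only the localhost default is shown because the Render and Cloudflare URLs are not given here.

```python
import os
from typing import Mapping, Optional

# Assumption: real deployments would add the Render and Cloudflare URLs here.
DEFAULT_ORIGINS = ["http://localhost:5173"]


def load_allowed_origins(raw: Optional[str]) -> list[str]:
    """Parse a comma-separated origin list and refuse wildcards."""
    if not raw or not raw.strip():
        return list(DEFAULT_ORIGINS)
    origins = [o.strip().rstrip("/") for o in raw.split(",") if o.strip()]
    for origin in origins:
        if "*" in origin:
            raise ValueError("wildcard origins are not allowed with credentials")
        if not origin.startswith(("http://", "https://")):
            raise ValueError(f"origin must include a scheme: {origin}")
    return origins


def cors_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Keyword arguments matching the acceptance criteria for task 9.3.1."""
    return {
        "allow_origins": load_allowed_origins(env.get("ALLOWED_ORIGINS")),
        "allow_credentials": True,
        "allow_methods": ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
        "allow_headers": ["Content-Type", "Authorization"],
    }
```

In a FastAPI app the result could be passed as app.add_middleware(CORSMiddleware, **cors_settings()), since those four keys match CORSMiddleware's parameters.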
9.4 Rate Limiting
| # | Task | Effort | Acceptance Criteria |
|---|---|---|---|
| 9.4.1 | Add slowapi rate limiting middleware | M | Install slowapi; configure global rate limit (60 requests/minute per IP); stricter limits on expensive endpoints: POST /analysis-runs (5/min), POST /topic-runs (3/min), POST /social-media/analyze (5/min); returns 429 with Retry-After header |
| 9.4.2 | Rate limit response handler | S | Custom 429 response with JSON body {"detail": "Rate limit exceeded", "retry_after": N}; logged for monitoring |
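To make the intended behavior concrete, here is a stdlib stand-in for the per-IP limit and the 429 payload. It is not slowapi's API; the class, its sliding-window strategy, and the injectable clock are assumptions used only to illustrate the semantics of "N requests per minute, then 429 with Retry-After".

```python
import math
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # key (e.g. client IP) -> request timestamps

    def check(self, key: str) -> int:
        """Record a request; return 0 if allowed, else seconds until retry."""
        now = self.clock()
        hits = self._hits[key]
        while hits and now - hits[0] >= self.window:
            hits.popleft()  # forget requests that left the window
        if len(hits) < self.limit:
            hits.append(now)
            return 0
        return max(1, math.ceil(self.window - (now - hits[0])))


def rate_limit_response(retry_after: int) -> tuple[int, dict, dict]:
    """Status, headers and JSON body for a rejected request (task 9.4.2)."""
    body = {"detail": "Rate limit exceeded", "retry_after": retry_after}
    return 429, {"Retry-After": str(retry_after)}, body
```

With limit=5 this mirrors the POST /analysis-runs budget; the sixth request inside a minute gets the 429 payload with the remaining wait.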
9.5 Prompt Injection Prevention
| # | Task | Effort | Acceptance Criteria |
|---|---|---|---|
| 9.5.1 | utils/input_sanitizer.py | M | sanitize_prompt_input(text: str) -> str: strips control characters, normalizes whitespace, escapes prompt-delimiter patterns (triple quotes, markdown separators, "IGNORE PREVIOUS INSTRUCTIONS"); truncates to max length; logs sanitization actions |
| 9.5.2 | Wire sanitizer into all prompt templates | S | Every f-string interpolation in topic_prompt_templates.py and prompt_templates.py passes through sanitize_prompt_input() before insertion; integration test verifies injection attempt is neutralized |
| 9.5.3 | External data sanitization | S | SerpAPI article titles/snippets and GDELT data sanitized before prompt insertion; strips HTML entities, limits field length to 500 chars |
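A rough sketch of what sanitize_prompt_input() might look like, using only the re and html modules. The patterns, the max_length default of 500, and the choice to replace delimiter patterns (rather than escape them) are simplifying assumptions, not the shipped rules.

```python
import html
import logging
import re

logger = logging.getLogger("input_sanitizer")

# Control characters except tab, newline and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
# Triple quotes/backticks anywhere, or markdown separators/headings at line start.
_DELIMITERS = re.compile(r"\"{3}|'{3}|`{3}|^\s*(?:-{3,}|={3,}|#{1,6}\s)", re.MULTILINE)
# One well-known override phrase; a real list would be longer.
_OVERRIDE = re.compile(r"ignore\s+(?:all\s+)?previous\s+instructions", re.IGNORECASE)


def sanitize_prompt_input(text: str, max_length: int = 500) -> str:
    """Return a cleaned copy of `text`; never rejects, only rewrites."""
    cleaned = html.unescape(text)                   # decode HTML entities first
    cleaned = _CONTROL_CHARS.sub("", cleaned)       # strip control characters
    cleaned = _DELIMITERS.sub(" ", cleaned)         # neutralize delimiter patterns
    cleaned = _OVERRIDE.sub("[removed]", cleaned)   # neutralize override phrase
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # normalize whitespace
    cleaned = cleaned[:max_length]                  # truncate to max length
    if cleaned != text:
        logger.info("sanitized prompt input (%d -> %d chars)", len(text), len(cleaned))
    return cleaned
```

Delimiter patterns are handled before whitespace is collapsed, because the line-start separators can only be recognized while newlines are still present.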
Health endpoint returns generic {"status": "unhealthy"} on DB error (no exception details); analysis run errors stored as generic categories ("api_error", "pipeline_error", "validation_error") not raw exception strings; API key values never appear in error messages
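The error-sanitization criteria above amount to mapping exceptions onto a fixed set of categories and keeping details server-side. A minimal sketch follows; the exception classes, the mapping rules, and the {"status": "healthy"} success payload are hypothetical and only illustrate the idea.

```python
import logging

logger = logging.getLogger("error_sanitizer")


class UpstreamApiError(Exception):
    """Hypothetical error raised when Gemini or SerpAPI calls fail."""


class OutputValidationError(Exception):
    """Hypothetical error raised when a model response fails validation."""


def categorize_error(exc: Exception) -> str:
    """Map any exception to one of the generic stored categories."""
    if isinstance(exc, UpstreamApiError):
        return "api_error"
    if isinstance(exc, (OutputValidationError, ValueError)):
        return "validation_error"
    return "pipeline_error"  # catch-all, so raw messages are never stored


def health_payload(check_db) -> dict:
    """Run a DB check and return a client-safe health response."""
    try:
        check_db()
    except Exception:
        logger.exception("health check failed")  # full details stay in server logs
        return {"status": "unhealthy"}
    return {"status": "healthy"}  # assumption: success payload shape
```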
9.9 API Key Hygiene
| # | Task | Effort | Acceptance Criteria |
|---|---|---|---|
| 9.9.1 | Remove API keys from request bodies | M | Remove google_api_key and serpapi_key from POST request schemas; use environment variables exclusively; remove keys from Report.data JSON storage; migration script to clean existing stored keys |
| 9.9.2 | Add API key presence validation at startup | S | App refuses to start if required API keys (GOOGLE_API_KEY, SERPAPI_KEY) not set; clear error message pointing to DEVELOPER.md |
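Both tasks above are small enough to sketch. The helper names are hypothetical, and scrub_stored_keys assumes the keys sit at the top level of the Report.data JSON, which the backlog does not confirm.

```python
import os
from typing import Mapping

REQUIRED_KEYS = ("GOOGLE_API_KEY", "SERPAPI_KEY")
STORED_KEY_FIELDS = ("google_api_key", "serpapi_key")


def missing_api_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Names of required variables that are unset or blank."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


def require_api_keys(env: Mapping[str, str] = os.environ) -> None:
    """Abort startup with a clear message when a required key is missing."""
    missing = missing_api_keys(env)
    if missing:
        raise SystemExit(
            f"Missing required environment variables: {', '.join(missing)}. "
            "See DEVELOPER.md for setup instructions."
        )


def scrub_stored_keys(report_data: dict) -> dict:
    """Copy of a Report.data dict without API key fields (migration helper)."""
    return {k: v for k, v in report_data.items() if k not in STORED_KEY_FIELDS}
```

Calling require_api_keys() before the app object is created would make a misconfigured deployment fail fast. The message names the missing variables but never prints their values.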
P1 — Compliance Automation
9.10 Automated Compliance Evidence
| # | Task | Effort | Acceptance Criteria |
|---|---|---|---|
| 9.10.1 | compliance-report.yml CI workflow | L | Runs on every push to main/development; generates JSON report with: test count, lint status, SBOM present, vulnerability scan results, security headers verified, docs drift status, secrets scan status; uploads as CI artifact |
| 9.10.2 | Compliance evidence endpoint | M | GET /compliance/evidence returns machine-readable JSON: app version, deployment date, SBOM generation timestamp, last vulnerability scan, test count, AI disclosure status, data retention policy; authenticated (basic API key) |
| 9.10.3 | Data retention policy enforcement | S | Automated cleanup: analysis runs older than configurable retention period (default 90 days) are flagged; GET /compliance/retention shows retention status; actual deletion requires manual trigger (safety) |
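The retention rule in 9.10.3 flags old runs without deleting them. A minimal sketch of that computation, assuming runs are available as a mapping of run ID to creation time; the function name and the response fields are illustrative, not the actual GET /compliance/retention schema.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_DAYS = 90


def retention_status(runs: dict, now: datetime,
                     retention_days: int = DEFAULT_RETENTION_DAYS) -> dict:
    """Report which runs are past the retention period; nothing is deleted."""
    cutoff = now - timedelta(days=retention_days)
    flagged = sorted(run_id for run_id, created_at in runs.items()
                     if created_at < cutoff)
    return {
        "retention_days": retention_days,
        "cutoff": cutoff.isoformat(),
        "total_runs": len(runs),
        "flagged_for_deletion": flagged,  # deletion itself stays a manual step
    }
```

Passing now explicitly (for example datetime.now(timezone.utc)) keeps the function deterministic and easy to test.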
9.11 Regulatory Documentation Update
| # | Task | Effort | Acceptance Criteria |
|---|---|---|---|
| 9.11.1 | Update EU AI Act compliance doc | M | Document all Sprint 9 AI trust measures in compliance-practices/eu_ai_act.md; map grounding score to Art. 50 transparency; document output validation as quality management (Art. 9) |
| 9.11.2 | Update CRA compliance doc | S | Document security hardening measures in compliance-practices/cyber_resilience_act_cra.md; rate limiting, CORS, input validation, security headers mapped to CRA articles |
| 9.11.3 | Information security practices document | M | New document compliance-practices/information_security.md covering: OWASP Top 10 for LLM Applications mapping, prompt injection prevention, SSRF prevention, rate limiting rationale, security headers explanation |
Assess backend migration status (Render alternatives)
Acceptance Criteria (Sprint-Level)
AI Trust
Every Gemini response validated structurally before DB commit
Ungrounded claims flagged with warning in UI
Grounding score visible per outlet in TopicRunDetail
Output validation errors logged and retrievable
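The structural validation behind these criteria could look roughly like the sketch below. ValidationResult and divergence_score come from the backlog; the EXPECTED_FIELDS schema (including the outlet field) and the error wording are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    errors: list[str] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return not self.errors


# Hypothetical schema: field name -> accepted Python types.
EXPECTED_FIELDS = {
    "outlet": (str,),
    "divergence_score": (int, float),
    "key_claims": (list,),
}


def _has_type(value, types) -> bool:
    """isinstance check that does not let booleans pass as numbers."""
    return isinstance(value, types) and not isinstance(value, bool)


def validate_response(payload) -> ValidationResult:
    """Check required fields, types and ranges of a parsed Gemini response."""
    result = ValidationResult()
    if not isinstance(payload, dict):
        result.errors.append("response is not a JSON object")
        return result
    for name, types in EXPECTED_FIELDS.items():
        if name not in payload:
            result.errors.append(f"missing field: {name}")
        elif not _has_type(payload[name], types):
            result.errors.append(f"wrong type: {name}")
    score = payload.get("divergence_score")
    if _has_type(score, (int, float)) and not 0 <= score <= 100:
        result.errors.append("out of range: divergence_score")
    return result
```

The caller decides what to do with a non-empty errors list; logging it per run would satisfy the "logged and retrievable" criterion.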
Information Security
CORS rejects requests from unauthorized origins
Rate limiting active on all endpoints (429 returned on excess)
Prompt injection attempts neutralized (test with known injection patterns)
Webhook URLs validated and private IPs blocked
Security headers present on all responses
API keys never appear in request bodies or error messages
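For the security headers criterion, a framework-independent sketch of the header set and a check that a response carries all of them. The CSP value matches the API-only policy named in this backlog; the HSTS, X-Frame-Options and X-Content-Type-Options values are common defaults assumed here, and the function names are hypothetical.

```python
SECURITY_HEADERS = {
    "Content-Security-Policy": "default-src 'self'",
    # Assumed values below: typical hardening defaults, not confirmed settings.
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
}


def apply_security_headers(headers: dict) -> dict:
    """Return a copy of `headers` with any missing security header added."""
    merged = dict(headers)
    present = {name.lower() for name in merged}
    for name, value in SECURITY_HEADERS.items():
        if name.lower() not in present:  # header names are case-insensitive
            merged[name] = value
    return merged


def missing_security_headers(headers: dict) -> list[str]:
    """Names of required security headers absent from a response."""
    present = {name.lower() for name in headers}
    return [name for name in SECURITY_HEADERS if name.lower() not in present]
```

The second function is the kind of check a CI step could run against a live response to produce the "security headers verified" field of the compliance report.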
Compliance
Compliance evidence artifact generated on every CI run
Machine-readable compliance endpoint accessible
Information security compliance document created
All Sprint 9 measures mapped to regulations
Quality
≥30 new tests (≥244 total passing)
All CI workflows green
No documentation drift
Risk Assessment
| Risk | Mitigation |
|---|---|
| Rate limiting too aggressive for legitimate use | Start with generous limits (60/min global, 5/min analysis); tune based on real usage patterns; document override via env var |
| Input sanitizer breaks legitimate topics | Whitelist-based approach: allow alphanumeric + common punctuation; sanitizer returns cleaned text, never rejects; log all sanitizations for review |
| Output validator rejects valid Gemini responses | Lenient validation: required fields + type checks only; optional fields allowed to be null; validation errors logged but don't block pipeline |
| Grounding score gives false confidence | Clearly label as "heuristic, not definitive"; source cross-reference is title/snippet matching, not semantic; document limitations |
| Security headers break frontend | API-only CSP (default-src 'self'); frontend served separately via Cloudflare Workers with its own CSP |
| Compliance endpoint exposes sensitive info | Endpoint returns operational metadata only (no PII, no analysis content, no API keys); protected by a basic API key |
Tech Stack (New Dependencies)
| Package | Purpose | Tier |
|---|---|---|
| slowapi | Rate limiting middleware for FastAPI | Runtime |
No other new runtime dependencies. AI trust framework uses existing google-genai SDK + custom validation logic. Input sanitizer is pure Python (re module). Security headers middleware is custom FastAPI middleware.
Architecture Decisions
Why AI Trust Before Auth?
Authentication (Sprint 10) controls who can use the system. AI trust (Sprint 9) controls what the system tells people. For a disinformation detection system, output integrity is more critical than access control — a trusted analyst using unreliable AI output is worse than an unauthorized user seeing reliable output.
Why Not Use a Dedicated AI Safety Library?
Libraries like Guardrails AI and NeMo Guardrails add complexity and dependencies. CDDBS's trust needs are specific: validate JSON structure, cross-reference claims against source material, track confidence calibration. Custom implementation is more maintainable and auditable for compliance purposes.
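As an indication of how small the custom pieces can be, here is a sketch of the calibration data behind confidence tracking. It assumes human feedback can be reduced to a boolean per run (analyst confirmed the divergence or not); that encoding, the five equal-width bins, and the output field names are assumptions.

```python
def calibration_curve(samples: list[tuple[float, bool]], bins: int = 5) -> list[dict]:
    """Bucket (divergence_score, confirmed) pairs and compare prediction with outcome.

    Scores are on the 0-100 scale; each non-empty bin reports the mean
    predicted score (rescaled to 0-1) and the observed confirmation rate.
    """
    width = 100 / bins
    buckets = [[] for _ in range(bins)]
    for score, confirmed in samples:
        index = min(int(score // width), bins - 1)  # a score of 100 goes in the last bin
        buckets[index].append((score, confirmed))
    curve = []
    for index, bucket in enumerate(buckets):
        if not bucket:
            continue
        mean_predicted = sum(score for score, _ in bucket) / len(bucket) / 100
        observed = sum(1 for _, confirmed in bucket if confirmed) / len(bucket)
        curve.append({
            "bin": index,
            "count": len(bucket),
            "mean_predicted": round(mean_predicted, 3),
            "observed_rate": round(observed, 3),
        })
    return curve
```

A well-calibrated model has mean_predicted close to observed_rate in every bin; large gaps indicate over- or under-confidence in that score range.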
Prompt Injection: Sanitize vs. Separate
Two approaches to prompt injection:
1. Sanitize inputs: strip dangerous patterns before insertion (chosen)
2. Separate user data: use system/user message separation
We use approach 1 because Gemini's genai.generate_content() doesn't support the OpenAI-style system/user message separation in the same way. The system_instruction parameter is set once; article data and topics go into the content. Sanitization is the pragmatic choice.
OWASP Top 10 for LLM Applications — Sprint 9 Coverage
| OWASP LLM Risk | Sprint 9 Task | Coverage |
|---|---|---|
| LLM01: Prompt Injection | 9.5.1-9.5.3 | Input sanitization + external data sanitization |
| LLM02: Insecure Output Handling | 9.1.1 | Structural validation before DB commit |
| LLM03: Training Data Poisoning | N/A | We don't fine-tune; using Gemini as-is |
| LLM04: Model Denial of Service | 9.4.1 | Rate limiting on endpoints that trigger Gemini calls |
| LLM05: Supply Chain Vulnerabilities | Sprint 8 | SBOM + pip-audit + SHA-pinned Actions (done) |
| LLM06: Sensitive Information Disclosure | 9.8.1, 9.9.1 | Error sanitization + API key removal from requests |
| LLM07: Insecure Plugin Design | N/A | No plugins/tools |
| LLM08: Excessive Agency | N/A | LLM has no ability to execute actions |
| LLM09: Overreliance | 9.1.2, 9.2.1-9.2.2 | Grounding score + ungrounded claim highlighting |
| LLM10: Model Theft | N/A | Using cloud API, not hosting model |
Definition of Done
All P0 and P1 tasks completed and tested
Recursive completeness check (9.26) executed and all items checked
CI green on all workflows (ci.yml, branch-policy.yml, secret-scan.yml, sbom.yml, compliance-report.yml)
DEVELOPER.md and CHANGELOG.md updated
Sprint 9 retrospective written
Compliance log updated
No regression in Sprint 1-8 functionality
Security audit findings (HIGH/CRITICAL) resolved, except authentication (deferred to Sprint 10)
Production patch exported to patches/sprint9_production_changes.patch