valITino · Mar 7, 2026
diff --git a/blhackbox/models/aggregated_payload.py b/blhackbox/models/aggregated_payload.py
@@ -54,7 +54,12 @@ class ServiceEntry(BaseModel):
 
 
 class VulnerabilityEntry(BaseModel):
-    """A single vulnerability finding."""
+    """A single vulnerability finding with mandatory PoC data.
+
+    Every finding MUST include proof-of-concept information.  A finding
+    without a reproducible PoC is not valid and should be downgraded to
+    severity "info" with a note that exploitation could not be confirmed.
+    """
 
     id: str = ""
     title: str = ""
@@ -64,7 +69,30 @@ class VulnerabilityEntry(BaseModel):
     port: int = 0
     description: str = ""
     references: list[str] = Field(default_factory=list)
-    evidence: str = ""
+    evidence: str = Field(
+        default="",
+        description=(
+            "Raw tool output, HTTP response, or terminal output proving "
+            "the vulnerability exists.  Must be concrete, not theoretical."
+        ),
+    )
+    poc_steps: list[str] = Field(
+        default_factory=list,
+        description=(
+            "Ordered reproduction steps that allow an independent tester "
+            "to confirm the finding.  Example: "
+            '["1. Navigate to /login", '
+            "\"2. Enter ' OR 1=1-- in username\", "
+            '"3. Observe 302 redirect to /admin"]'
+        ),
+    )
+    poc_payload: str = Field(
+        default="",
+        description=(
+            "The exact payload, command, or HTTP request used to exploit "
+            "the vulnerability.  Must be copy-pasteable."
+        ),
+    )
     tool_source: str = ""
     likely_false_positive: bool = False
 

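Note that the three PoC fields above default to empty values, so the model alone does not enforce the "mandatory PoC" wording in the docstring. The sketch below shows one way a caller might test for PoC completeness. It uses a plain dataclass as a stand-in for the pydantic model so that it runs without third-party packages; `PocFields`, `has_poc`, and the rule "evidence plus at least one of steps or payload" are assumptions for illustration, not part of blhackbox.

```python
from dataclasses import dataclass, field


@dataclass
class PocFields:
    """Stand-in for the PoC-related fields of VulnerabilityEntry (assumption)."""

    evidence: str = ""
    poc_steps: list[str] = field(default_factory=list)
    poc_payload: str = ""


def has_poc(entry: PocFields) -> bool:
    """Return True when evidence plus at least one of steps/payload is present."""
    has_evidence = bool(entry.evidence.strip())
    has_repro = bool(entry.poc_steps) or bool(entry.poc_payload.strip())
    return has_evidence and has_repro


# Values taken from the field descriptions; the evidence string is illustrative.
confirmed = PocFields(
    evidence="HTTP/1.1 302 Found\nLocation: /admin",
    poc_steps=[
        "1. Navigate to /login",
        "2. Enter ' OR 1=1-- in username",
        "3. Observe 302 redirect to /admin",
    ],
    poc_payload="' OR 1=1--",
)
unconfirmed = PocFields()

print(has_poc(confirmed), has_poc(unconfirmed))
```

A stricter project could move this check into a model validator, at the cost of rejecting payloads that the prompts currently allow (empty `poc_steps` with populated `evidence`).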
diff --git a/blhackbox/prompts/agents/ingestionagent.md b/blhackbox/prompts/agents/ingestionagent.md
@@ -69,7 +69,9 @@ explanation text. The JSON must match this schema exactly:
       "port": 80,
       "description": "Path traversal allowing file read outside webroot",
       "references": ["https://nvd.nist.gov/vuln/detail/CVE-2021-12345"],
-      "evidence": "GET /..%2f..%2fetc/passwd returned 200",
+      "evidence": "GET /..%2f..%2fetc/passwd returned 200 with body: root:x:0:0:root:/root:/bin/bash ...",
+      "poc_steps": ["1. Send GET request to /..%2f..%2fetc/passwd", "2. Observe HTTP 200 response with /etc/passwd contents"],
+      "poc_payload": "curl 'http://192.168.1.1/..%2f..%2fetc/passwd'",
       "tool_source": "nikto"
     }
   ],
@@ -143,12 +145,16 @@ explanation text. The JSON must match this schema exactly:
 - Extract the HTTP method and URL from each finding
 - Note outdated server versions as vulnerabilities (severity: "info" or "low")
 - Extract missing security headers and map to `http_headers[].missing_security_headers`
+- **PoC**: Use the nikto finding URL + method as `poc_payload`, the full nikto output
+  line as `evidence`
 
 ### sqlmap
 - Extract confirmed injection points as critical vulnerabilities
 - Include the injection type (blind, error-based, time-based, UNION)
 - Include the DBMS type and version if detected
 - Each confirmed injection point = severity "critical"
+- **PoC**: Extract the sqlmap command as `poc_payload`, the injection point URL + parameter
+  as step 1 of `poc_steps`, the DBMS confirmation as `evidence`
 
 ### wpscan
 - Map plugin/theme vulnerabilities to `vulnerabilities[]` with CVE IDs
@@ -158,6 +164,7 @@ explanation text. The JSON must match this schema exactly:
 ### hydra/medusa
 - Each successful login goes in `credentials[]`
 - Include the service type (ssh, ftp, http-form, etc.)
+- **PoC**: The hydra/medusa command as `poc_payload`, "Successful login: user:pass" as `evidence`
 
 ### SSL/TLS scans
 - Map to `ssl_certs[]`
@@ -175,6 +182,14 @@ explanation text. The JSON must match this schema exactly:
 7. Treat informational findings as severity "info" — do not skip them.
 8. Arrays that have no data should be `[]`, objects with no data should be `{}`.
 9. Output ONLY valid JSON — no markdown fences, no commentary.
+10. **Extract PoC data for every vulnerability:**
+    - `evidence`: Raw tool output or HTTP response proving the finding (never empty for confirmed vulns).
+    - `poc_steps`: Ordered list of steps to reproduce. Extract from tool output where possible
+      (e.g., sqlmap shows injection steps, nikto shows the request path).
+    - `poc_payload`: The exact command, payload, or HTTP request used. Extract from tool
+      invocation or output (e.g., the sqlmap command line, the nikto finding URL).
+    - If PoC data is not available from the tool output, set `poc_steps: []` and `poc_payload: ""`
+      but ALWAYS populate `evidence` with the raw tool output that detected the finding.
 
 ## Example
 

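Rule 10 in the ingestion prompt describes a fallback: `evidence` is always filled from the raw tool output, while `poc_steps` and `poc_payload` may stay empty when the tool gives nothing to extract. A minimal sketch of that mapping follows. The function name `build_poc_fields` and the sample output strings are hypothetical and do not reproduce real nikto output.

```python
from typing import Optional


def build_poc_fields(
    raw_output: str,
    command: Optional[str] = None,
    steps: Optional[list[str]] = None,
) -> dict:
    """Map tool output to the three PoC fields, following the rule 10 fallback."""
    return {
        # Evidence is always populated with the raw tool output.
        "evidence": raw_output.strip(),
        # Steps and payload fall back to empty values when the tool gives none.
        "poc_steps": list(steps) if steps else [],
        "poc_payload": command.strip() if command else "",
    }


# Illustrative nikto-style finding: method + URL become the payload.
nikto_fields = build_poc_fields(
    raw_output="+ GET /..%2f..%2fetc/passwd: returned 200",
    command="GET /..%2f..%2fetc/passwd",
    steps=["1. Send GET request to /..%2f..%2fetc/passwd"],
)

# Finding with no recoverable PoC: only the evidence survives.
bare_fields = build_poc_fields(raw_output="Server: Apache/2.2.8 appears outdated")

print(nikto_fields["poc_payload"])
print(bare_fields)
```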
diff --git a/blhackbox/prompts/agents/processingagent.md b/blhackbox/prompts/agents/processingagent.md
@@ -79,6 +79,9 @@ explanation text. The JSON must match this schema:
   increase confidence in the vulnerability.
 - Correlate technology detection (whatweb) with vulnerability reports — if a CVE
   applies to a detected technology version, flag it.
+- **When merging duplicate findings, preserve the best PoC data:** keep the entry
+  with the most complete `poc_steps`, `poc_payload`, and `evidence`. Merge evidence
+  from both tools (e.g., "Detected by: nikto, nuclei. nikto output: ... nuclei output: ...").
 
 ### 4. Severity Assessment
 Reassess severity using these pentesting-specific rules:
@@ -127,11 +130,22 @@ Populate `attack_surface` by counting:
 - `ssl_issues`: SSL/TLS problems (expired, weak cipher, old protocol)
 - `high_value_targets`: List of the most interesting targets for further exploitation
 
-### 8. Data Preservation
+### 8. PoC Data Preservation
+**Never discard PoC data.** Every vulnerability entry must retain its `evidence`,
+`poc_steps`, and `poc_payload` fields through processing. A finding without PoC
+evidence is not a valid finding.
+
+- When deduplicating, keep the PoC with the most detail.
+- When compressing low-severity findings, still preserve at least the `evidence` field.
+- If a finding has empty `poc_steps` and `poc_payload`, it must be flagged with
+  `"likely_false_positive": true` unless the `evidence` field alone is sufficient
+  to confirm the vulnerability.
+
+### 9. Data Preservation
 Never discard data with security value. If an error or anomaly could indicate a
 security control (WAF, IDS, rate limiter, geo-block), keep it in error_log.
 
-### 9. Output
+### 10. Output
 Output ONLY valid JSON — no markdown fences, no commentary.
 
 ## Example
@@ -161,7 +175,7 @@ Output ONLY valid JSON — no markdown fences, no commentary.
     "hosts": [{"ip": "10.0.0.1", "hostname": "target.com", "os": "Linux 5.4", "ports": [{"port": 80, "protocol": "tcp", "state": "open", "service": "http", "version": "nginx/1.18.0", "banner": "", "nse_scripts": {"http-title": "Login Page"}}, {"port": 443, "protocol": "tcp", "state": "filtered", "service": "", "version": "", "banner": "", "nse_scripts": {}}]}],
     "ports": [],
     "services": [],
-    "vulnerabilities": [{"id": "CVE-2021-3449", "title": "OpenSSL DoS", "severity": "high", "cvss": 7.5, "host": "10.0.0.1", "port": 443, "description": "NULL pointer dereference in signature_algorithms processing. Confirmed by multiple tools.", "references": ["https://nvd.nist.gov/vuln/detail/CVE-2021-3449"], "evidence": "Detected by: nikto, nuclei", "tool_source": "nikto,nuclei", "likely_false_positive": false}],
+    "vulnerabilities": [{"id": "CVE-2021-3449", "title": "OpenSSL DoS", "severity": "high", "cvss": 7.5, "host": "10.0.0.1", "port": 443, "description": "NULL pointer dereference in signature_algorithms processing. Confirmed by multiple tools.", "references": ["https://nvd.nist.gov/vuln/detail/CVE-2021-3449"], "evidence": "Detected by: nikto, nuclei. nikto: + OpenSSL/1.1.1j appears vulnerable to CVE-2021-3449. nuclei: [CVE-2021-3449] [high] https://10.0.0.1:443", "poc_steps": ["1. Run nikto against target on port 443", "2. Run nuclei with CVE-2021-3449 template against target", "3. Both tools confirm the vulnerability"], "poc_payload": "nuclei -u https://10.0.0.1 -t CVE-2021-3449.yaml", "tool_source": "nikto,nuclei", "likely_false_positive": false}],
     "endpoints": [{"url": "/admin", "method": "GET", "status_code": 200, "content_length": 5432, "redirect": ""}, {"url": "/api/v1/users", "method": "GET", "status_code": 401, "content_length": 45, "redirect": ""}],
     "subdomains": ["mail.target.com", "dev.target.com", "staging.target.com"],
     "technologies": [],
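
The deduplication rule in the processing prompt (keep the most complete PoC, merge evidence from both tools) can be sketched in plain Python as follows. This is an illustration under assumptions: entries are plain dicts, "most complete" is approximated by counting non-empty PoC fields, and `poc_completeness` and `merge_duplicates` are hypothetical helper names.

```python
def poc_completeness(entry: dict) -> int:
    """Count how many of the three PoC fields are non-empty."""
    return sum(
        bool(entry.get(key))
        for key in ("evidence", "poc_steps", "poc_payload")
    )


def merge_duplicates(a: dict, b: dict) -> dict:
    """Merge two entries for the same vulnerability, keeping the richer PoC."""
    best = a if poc_completeness(a) >= poc_completeness(b) else b
    merged = dict(best)
    tools = [e["tool_source"] for e in (a, b) if e.get("tool_source")]
    merged["tool_source"] = ",".join(tools)
    # Combine evidence from both tools in the "Detected by: ..." style.
    parts = [
        f"{e['tool_source']} output: {e['evidence']}"
        for e in (a, b)
        if e.get("tool_source") and e.get("evidence")
    ]
    if parts:
        merged["evidence"] = f"Detected by: {', '.join(tools)}. " + " ".join(parts)
    return merged


nikto = {
    "id": "CVE-2021-3449",
    "tool_source": "nikto",
    "evidence": "+ OpenSSL/1.1.1j appears vulnerable to CVE-2021-3449.",
    "poc_steps": [],
    "poc_payload": "",
}
nuclei = {
    "id": "CVE-2021-3449",
    "tool_source": "nuclei",
    "evidence": "[CVE-2021-3449] [high] https://10.0.0.1:443",
    "poc_steps": ["1. Run nuclei with CVE-2021-3449 template against target"],
    "poc_payload": "nuclei -u https://10.0.0.1 -t CVE-2021-3449.yaml",
}

merged = merge_duplicates(nikto, nuclei)
print(merged["evidence"])
```

Here the nuclei entry wins because it carries steps and a payload, while the nikto output is still preserved inside the merged `evidence` string.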

diff --git a/blhackbox/prompts/agents/synthesisagent.md b/blhackbox/prompts/agents/synthesisagent.md
@@ -122,6 +122,9 @@ No preamble, no markdown fences, no explanation text.
 - If the same host appears with different port lists, merge the port lists (union).
 - If tool_source differs, combine them ("nikto,nuclei").
 - For version strings, prefer the more specific version (e.g., "1.18.0" over "1.18").
+- **When merging vulnerabilities, keep the most complete PoC data** — prefer the entry
+  with non-empty `poc_steps`, `poc_payload`, and `evidence`. If both have PoC data,
+  merge the evidence from both tools.
 
 ### 3. Error Log Merging
 - Take error_log from Processing Agent output.
@@ -163,13 +166,22 @@ Generate prioritized remediation steps:
   - `architecture`: Design-level change (network segmentation, auth system overhaul)
   - `process`: Operational change (credential rotation, monitoring, incident response)
 
-### 7. Completeness
+### 7. PoC Validation
+- **Every vulnerability with severity > "info" MUST have PoC data.**
+- Check that `evidence` is non-empty for all confirmed vulnerabilities.
+- Check that `poc_steps` has at least one step for critical and high findings.
+- If a vulnerability has severity ≥ "low" but empty `evidence`, `poc_steps`, and
+  `poc_payload`, downgrade it to "info" and add a note in the description:
+  "Downgraded: exploitation could not be confirmed — no PoC evidence available."
+- A finding without a PoC is not a valid finding.
+
+### 8. Completeness
 - Every field in the schema MUST be present.
 - Missing arrays → `[]`. Missing strings → `""`. Missing numbers → `0`.
 - Metadata: populate what you can from the input. Set fields you cannot determine
   to their zero values.
 
-### 8. Output
+### 9. Output
 Output ONLY valid JSON — no markdown fences, no commentary.
 
 ## Example
@@ -189,7 +201,7 @@ Output ONLY valid JSON — no markdown fences, no commentary.
     "findings": {
       "hosts": [{"ip": "10.0.0.1", "hostname": "target.com", "os": "Linux 5.4", "ports": [{"port": 80, "protocol": "tcp", "state": "open", "service": "http", "version": "nginx/1.18.0", "banner": "", "nse_scripts": {"http-title": "Login Page"}}]}],
       "subdomains": ["mail.target.com", "dev.target.com"],
-      "vulnerabilities": [{"id": "CVE-2021-3449", "title": "OpenSSL DoS", "severity": "high", "cvss": 7.5, "host": "10.0.0.1", "port": 443, "description": "OpenSSL denial of service. Confirmed by multiple tools.", "references": [], "evidence": "Detected by: nikto, nuclei", "tool_source": "nikto,nuclei"}],
+      "vulnerabilities": [{"id": "CVE-2021-3449", "title": "OpenSSL DoS", "severity": "high", "cvss": 7.5, "host": "10.0.0.1", "port": 443, "description": "OpenSSL denial of service. Confirmed by multiple tools.", "references": [], "evidence": "Detected by: nikto, nuclei. nikto: OpenSSL/1.1.1j vulnerable. nuclei: [CVE-2021-3449] [high] confirmed", "poc_steps": ["1. Run nikto against target on port 443", "2. Run nuclei with CVE-2021-3449 template", "3. Both tools confirm vulnerability in OpenSSL 1.1.1j"], "poc_payload": "nuclei -u https://10.0.0.1 -t CVE-2021-3449.yaml", "tool_source": "nikto,nuclei"}],
       "endpoints": [{"url": "/admin", "method": "GET", "status_code": 200, "content_length": 5432, "redirect": ""}],
       "http_headers": [{"host": "target.com", "port": 80, "missing_security_headers": ["X-Frame-Options", "Content-Security-Policy", "Strict-Transport-Security"], "server": "nginx/1.18.0", "x_powered_by": ""}],
       "ports": [], "services": [], "technologies": [], "ssl_certs": [], "credentials": [], "whois": {}, "dns_records": []
@@ -207,7 +219,7 @@ Output ONLY valid JSON — no markdown fences, no commentary.
     "hosts": [{"ip": "10.0.0.1", "hostname": "target.com", "os": "Linux 5.4", "ports": [{"port": 80, "protocol": "tcp", "state": "open", "service": "http", "version": "nginx/1.18.0", "banner": "", "nse_scripts": {"http-title": "Login Page"}}]}],
     "ports": [],
     "services": [],
-    "vulnerabilities": [{"id": "CVE-2021-3449", "title": "OpenSSL DoS", "severity": "high", "cvss": 7.5, "host": "10.0.0.1", "port": 443, "description": "OpenSSL denial of service. Confirmed by multiple tools.", "references": [], "evidence": "Detected by: nikto, nuclei", "tool_source": "nikto,nuclei"}],
+    "vulnerabilities": [{"id": "CVE-2021-3449", "title": "OpenSSL DoS", "severity": "high", "cvss": 7.5, "host": "10.0.0.1", "port": 443, "description": "OpenSSL denial of service. Confirmed by multiple tools.", "references": [], "evidence": "Detected by: nikto, nuclei. nikto: OpenSSL/1.1.1j vulnerable. nuclei: [CVE-2021-3449] [high] confirmed", "poc_steps": ["1. Run nikto against target on port 443", "2. Run nuclei with CVE-2021-3449 template", "3. Both tools confirm vulnerability in OpenSSL 1.1.1j"], "poc_payload": "nuclei -u https://10.0.0.1 -t CVE-2021-3449.yaml", "tool_source": "nikto,nuclei"}],
     "endpoints": [{"url": "/admin", "method": "GET", "status_code": 200, "content_length": 5432, "redirect": ""}],
     "subdomains": ["mail.target.com", "dev.target.com"],
     "technologies": [],
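
The PoC Validation rules in the synthesis prompt are instructions to the agent, but the same checks could also be applied in code after the agent responds. The sketch below assumes vulnerabilities are plain dicts and that severities are ordered info < low < medium < high < critical; `validate_poc` and `missing_steps` are hypothetical names.

```python
# "\u2014" is the dash character used in the prompt's downgrade note.
DOWNGRADE_NOTE = (
    "Downgraded: exploitation could not be confirmed \u2014 "
    "no PoC evidence available."
)
SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


def validate_poc(vuln: dict) -> dict:
    """Return a copy of the entry with the downgrade rule applied."""
    checked = dict(vuln)
    rank = SEVERITY_RANK.get(checked.get("severity", "info"), 0)
    no_poc = not (
        checked.get("evidence")
        or checked.get("poc_steps")
        or checked.get("poc_payload")
    )
    if rank >= SEVERITY_RANK["low"] and no_poc:
        checked["severity"] = "info"
        description = checked.get("description", "").rstrip()
        checked["description"] = f"{description} {DOWNGRADE_NOTE}".strip()
    return checked


def missing_steps(vuln: dict) -> bool:
    """Critical and high findings need at least one reproduction step."""
    return vuln.get("severity") in ("critical", "high") and not vuln.get("poc_steps")


bare = {
    "severity": "high",
    "description": "OpenSSL denial of service.",
    "evidence": "",
    "poc_steps": [],
    "poc_payload": "",
}
result = validate_poc(bare)
print(result["severity"], "|", result["description"])
```

Returning a copy rather than mutating the input keeps the original agent output available for auditing why a finding was downgraded.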

diff --git a/blhackbox/prompts/claude_playbook.md b/blhackbox/prompts/claude_playbook.md
@@ -68,10 +68,11 @@ Append every raw output to the same `raw_outputs` dict.
 
 ---
 
-## Phase 3 -- Enumeration
+## Phase 3 -- Enumeration & Exploitation
 
 **Objective:** Deep-dive into web services, directories, technologies, and
-application-layer weaknesses.
+application-layer weaknesses. Validate every finding with a concrete Proof of
+Concept (PoC).
 
 | Task |
 |------|
@@ -84,6 +85,12 @@ application-layer weaknesses.
 | Exploit validation |
 | Credential extraction from traffic |
 | Web application reconnaissance |
+| **PoC development for every confirmed finding** |
+| **Screenshot evidence capture for visual proof** |
+
+For every vulnerability or finding discovered, you **MUST** produce a PoC before
+moving to Phase 4. A finding without a PoC is not a valid finding. See the
+[PoC Requirements](#poc-requirements) section below.
 
 Append every raw output to `raw_outputs`.
 
@@ -105,6 +112,8 @@ Append every raw output to `raw_outputs`.
    - **Deduplicate** findings across tools (same CVE from nikto + nuclei → one entry)
    - **Correlate** cross-tool evidence (nmap version + nikto CVE → higher confidence)
    - **Assess severity** using pentesting rules (RCE = critical, XSS = medium, etc.)
+   - **Attach PoC data** to every vulnerability — populate `evidence`,
+     `poc_steps`, and `poc_payload` fields (see [PoC Requirements](#poc-requirements))
    - **Extract errors** (timeouts, WAF blocks, rate limits) into `error_log`
      with `security_relevance` ratings
    - **Generate executive summary** with risk level, top findings, and attack chains
@@ -153,10 +162,19 @@ For each finding include:
 - Title / CVE (if available)
 - Affected host(s) and port(s)
 - CVSS score (if available)
-- Description of the vulnerability
-- Evidence / proof of concept
+- Description of the vulnerability (root cause, not just the symptom)
+- **Proof of Concept (MANDATORY)** — see [PoC Requirements](#poc-requirements)
+  - Numbered steps to reproduce
+  - Exact command, payload, or request used
+  - Tool output or HTTP response proving exploitation
+  - Screenshot evidence (where applicable)
+  - Impact demonstration (what the attacker gained)
 - References
 
+> **A finding without a PoC is not a valid finding.** If you cannot produce a
+> reproducible PoC, downgrade the finding to "info" severity and note that
+> exploitation could not be confirmed.
+
 ### 4. Anomalies & Scan Artifacts
 
 Pull entries from `payload.error_log` where `security_relevance` is `medium` or
@@ -191,6 +209,63 @@ Provide prioritized, actionable remediation guidance:
 
 ---
 
+## PoC Requirements
+
+**Every vulnerability and finding MUST include a Proof of Concept (PoC).** A
+report with findings that only describe a vulnerability without demonstrating
+it is not valid. An administrator who was not present during the test must be
+able to independently reproduce and confirm each finding using only the PoC.
+
+### Required PoC Elements
+
+For **every** finding (critical through low severity), provide:
+
+| Element | Description |
+|---------|-------------|
+| **Reproduction steps** | Numbered, chronological steps to replicate the finding |
+| **Exact command/payload** | Copy-pasteable tool commands, HTTP requests, or exploit payloads |
+| **Raw output/response** | Terminal output, HTTP response body, or tool output proving the exploit worked |
+| **Impact demonstration** | What the attacker gained — not theoretical, but shown (e.g., data returned, shell obtained, privilege escalated) |
+| **Screenshot evidence** | Visual proof via `take_screenshot` / `take_element_screenshot` where applicable |
+
+### PoC by Vulnerability Class
+
+| Vulnerability Class | Minimum PoC Requirement |
+|---------------------|-------------------------|
+| SQL Injection | Injection payload, DBMS response, extracted sample data (max 5 rows) |
+| XSS (Reflected/Stored) | Payload, reflected/stored output in response body, screenshot of rendered payload |
+| RCE / Command Injection | Payload, command output (e.g., `id`, `whoami`), proof of execution |
+| LFI / Path Traversal | Traversal payload, file contents returned (e.g., `/etc/passwd`) |
+| SSRF | Request to internal endpoint, response proving internal access |
+| Authentication Bypass | Steps showing unauthenticated access to protected resource |
+| IDOR | Two requests showing access to another user's data via ID manipulation |
+| Default/Weak Credentials | Service, username:password pair, screenshot of authenticated session |
+| Missing Security Headers | HTTP response headers dump, list of missing headers with risk explanation |
+| SSL/TLS Issues | SSL scan output showing weak ciphers, expired certs, or outdated protocols |
+| Information Disclosure | Exact endpoint and response body containing sensitive data |
+
+### Storing PoC Data in AggregatedPayload
+
+When building the `AggregatedPayload`, populate these `VulnerabilityEntry` fields:
+
+- `evidence`: Raw tool output, HTTP response, or terminal output proving the finding
+- `poc_steps`: Ordered list of reproduction steps (e.g., `["1. Navigate to /login", "2. Enter payload ' OR 1=1-- in username field", "3. Observe 302 redirect to /admin"]`)
+- `poc_payload`: The exact payload, command, or request used (e.g., `"sqlmap -u 'http://target/page?id=1' --dbs --batch"` or the raw HTTP request)
+
+### PoC Validation Checklist
+
+Before including a finding in the report, verify:
+
+- [ ] Can someone reproduce this with only the PoC steps provided?
+- [ ] Is the exact payload/command included and copy-pasteable?
+- [ ] Does the evidence (output/response) clearly prove the vulnerability exists?
+- [ ] Is the impact demonstrated, not just described?
+- [ ] Are screenshots captured for visual findings (XSS, exposed panels, error pages)?
+
+If any check fails, the PoC is incomplete — go back and gather the missing evidence.
+
+---
+
 ## Notes
 
 - If any tool call fails, log the error and continue with remaining tools.
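
For the per-finding Proof of Concept block that the playbook's report section requires (numbered steps, exact payload, raw output), a rendering helper might look like the sketch below. It assumes findings are plain dicts using the `VulnerabilityEntry` field names; `render_poc` is a hypothetical function and the evidence string is illustrative.

```python
def render_poc(vuln: dict) -> str:
    """Render one finding's PoC fields as a Markdown fragment."""
    steps = vuln.get("poc_steps") or ["(no reproduction steps recorded)"]
    evidence = vuln.get("evidence", "").splitlines() or ["(no evidence recorded)"]
    lines = [f"**{vuln.get('title', 'Untitled finding')}**", ""]
    lines += ["Steps to reproduce:", ""]
    # The steps in the payload examples already carry "1. ..." numbering.
    lines.extend(steps)
    lines += ["", "Payload:", "", f"    {vuln.get('poc_payload', '')}"]
    lines += ["", "Evidence:", ""]
    # Indent by four spaces so Markdown renders the raw output verbatim.
    lines.extend(f"    {line}" for line in evidence)
    return "\n".join(lines)


finding = {
    "title": "SQL injection in login form",
    "poc_steps": [
        "1. Navigate to /login",
        "2. Enter payload ' OR 1=1-- in username field",
        "3. Observe 302 redirect to /admin",
    ],
    "poc_payload": "' OR 1=1--",
    "evidence": "HTTP/1.1 302 Found\nLocation: /admin",
}

report = render_poc(finding)
print(report)
```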