Agent Intelligence: RAG-Enhanced Pentesting Knowledge Base #53

## Description
Add a local, vector-indexed pentesting knowledge base (KB) that the agent queries before falling back to web search. The KB holds curated content from ExploitDB, NVD, the OWASP Testing Guide, and tool documentation, which should be faster and more reliable than Tavily for known techniques.

## Why this matters
The agent currently relies on Tavily web search for all external knowledge. This has four problems:

1. **Latency**: web search takes 2-5 seconds per query. During a 50-iteration engagement, the agent may search 15-20 times. A local vector DB returns results in <100ms.
2. **Reliability**: web search returns blog posts, Stack Overflow answers, and marketing pages mixed with actual exploit documentation. The agent wastes reasoning tokens parsing irrelevant results. A curated KB returns only verified, actionable content.
3. **Knowledge gaps on recent or niche CVEs**: the LLM's training data has a cutoff. For CVEs published after the cutoff, the agent must web search — but search results for fresh CVEs are often incomplete or contradictory. A regularly updated local KB with NVD data fills this gap.
4. **Tool documentation accuracy**: the agent frequently uses wrong Metasploit module syntax, incorrect sqlmap flags, or invalid Hydra protocol strings. Embedding exact tool documentation (module options, syntax, examples) in a retrievable KB lets the agent look up the correct form instead of guessing it.
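
To put the latency point in numbers, here is a rough back-of-the-envelope calculation using only the figures quoted in item 1. The helper name `search_time_seconds` is made up for illustration, and 0.1 s is used as the upper bound for a local lookup.

```python
def search_time_seconds(queries: int, per_query_s: float) -> float:
    """Total time the agent spends waiting on knowledge lookups."""
    return queries * per_query_s


# Figures from item 1: 15-20 searches per engagement, 2-5 s per web search,
# and under 0.1 s per local vector DB query.
web_low = search_time_seconds(15, 2.0)     # best case for web search
web_high = search_time_seconds(20, 5.0)    # worst case for web search
local_high = search_time_seconds(20, 0.1)  # worst case for the local KB

print(f"web: {web_low:.0f}-{web_high:.0f} s, local: under {local_high:.0f} s")
```

So web search costs roughly 30 to 100 seconds of waiting per engagement, against at most about 2 seconds for local lookups.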

## Architecture
```
Agent needs knowledge → Query local KB first (fast, curated)
                          ↓ sufficient?
                        YES → use it
                        NO → fall back to Tavily web search
                          ↓
                        Merge and deduplicate results
```
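
The flow above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `retrieve` function, the result shape (`text`, `score`, `source`), and the `min_score` / `min_hits` sufficiency rule are all assumptions, and the two search callables stand in for the local KB and the existing Tavily integration.

```python
from typing import Callable, List, TypedDict


class Result(TypedDict):
    # Hypothetical result shape; the real KB may return something different.
    text: str
    score: float
    source: str


SearchFn = Callable[[str], List[Result]]


def retrieve(
    query: str,
    local_search: SearchFn,
    web_search: SearchFn,
    min_score: float = 0.75,  # assumed threshold for a "sufficient" hit
    min_hits: int = 1,        # assumed number of strong hits required
) -> List[Result]:
    """Query the local KB first; fall back to web search if it is not enough."""
    local = local_search(query)
    strong = [r for r in local if r["score"] >= min_score]
    if len(strong) >= min_hits:
        return strong  # sufficient: skip the web round trip entirely

    # Not sufficient: fall back to web search, then merge and deduplicate.
    web = web_search(query)
    merged: List[Result] = []
    seen = set()
    for r in local + web:
        key = " ".join(r["text"].lower().split())  # normalize case and spacing
        if key not in seen:
            seen.add(key)
            merged.append(r)
    return merged
```

Deduplication here is exact-match on normalized text, which is the simplest option; near-duplicate detection by embedding similarity would be a natural refinement.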

## Proposed knowledge sources
| Source | Content | Update frequency |
|--------|---------|-----------------|
| NVD/CVE database | CVE descriptions, CVSS scores, affected products | Daily (via NVD API) |
| ExploitDB | Exploit code, proof-of-concepts, vulnerability details | Weekly |
| OWASP Testing Guide | Web app testing methodology and techniques | Static (versioned) |
| Metasploit module docs | Module options, targets, payloads, examples | On image build |
| Nuclei template metadata | Template IDs, severity, tags, CVE mappings | On image build |
| Tool manuals | sqlmap, Hydra, nmap flags and usage patterns | Static |
| GTFOBins/LOLBAS | Privilege escalation one-liners per binary | Monthly |
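
One way to make the update frequencies in this table machine-readable is a small source registry that the update pipeline can check. The sketch below is only an illustration: `KnowledgeSource`, `SOURCES`, and `due_for_refresh` are hypothetical names, "Monthly" is approximated as 30 days, and sources marked "Static" or "On image build" are treated as having no scheduled refresh.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class KnowledgeSource:
    name: str
    # None means no scheduled refresh (static, or rebuilt with the image).
    refresh_every: Optional[timedelta]


SOURCES = [
    KnowledgeSource("NVD/CVE database", timedelta(days=1)),
    KnowledgeSource("ExploitDB", timedelta(weeks=1)),
    KnowledgeSource("OWASP Testing Guide", None),
    KnowledgeSource("Metasploit module docs", None),
    KnowledgeSource("Nuclei template metadata", None),
    KnowledgeSource("Tool manuals", None),
    KnowledgeSource("GTFOBins/LOLBAS", timedelta(days=30)),  # "Monthly", approximated
]


def due_for_refresh(
    source: KnowledgeSource,
    last_ingested: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True if the source should be (re-)ingested now."""
    if last_ingested is None:
        return True  # never ingested, so even static sources need one pass
    if source.refresh_every is None:
        return False  # static or image-build sources are not rescheduled
    return now - last_ingested >= source.refresh_every
```

A scheduled job could loop over `SOURCES`, call `due_for_refresh` with the stored ingestion timestamp, and re-ingest only the sources that are stale.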

## What already exists
- Tavily web search integration (`tools.py:402-481`)
- Neo4j for graph data (could double as document store)
- MITRE CWE/CAPEC enrichment system with caching
- NVD API key support in `.env.example`

## What needs to be built
- [ ] Vector store setup (FAISS, ChromaDB, or Qdrant — FAISS is simplest, no extra container)
- [ ] Document ingestion pipeline for each knowledge source
- [ ] Embedding model selection (sentence-transformers or OpenAI embeddings)
- [ ] `PentestKnowledgeBase` class with query + adaptive_retrieve methods
- [ ] Integration with existing Tavily: local-first, web-fallback pattern
- [ ] Update pipeline (scheduled re-ingestion for NVD, ExploitDB)
- [ ] Docker volume for persistent vector index
- [ ] Tool documentation extraction script (parse Metasploit module info, sqlmap --help, etc.)
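
To make the `PentestKnowledgeBase` item more concrete, here is a toy sketch of what the class interface might look like. Everything beyond the class name and the two method names is an assumption. In particular, the bag-of-words `_embed` function is only a stand-in for a real embedding model, the in-memory list stands in for FAISS, ChromaDB, or Qdrant, and `adaptive_retrieve` is interpreted here as "return only hits above a score threshold, up to a cap", which may not be what the final design means by it.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

Hit = Tuple[float, str, str]  # (score, source, text)


def _embed(text: str) -> Counter:
    # Stand-in for a real embedding model: lowercase term counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class PentestKnowledgeBase:
    def __init__(self) -> None:
        self._docs: List[Tuple[str, str, Counter]] = []

    def add(self, source: str, text: str) -> None:
        """Index one document chunk together with its source label."""
        self._docs.append((source, text, _embed(text)))

    def query(self, text: str, k: int = 3) -> List[Hit]:
        """Return the k most similar chunks, best first."""
        q = _embed(text)
        scored = [(_cosine(q, vec), source, doc) for source, doc, vec in self._docs]
        scored.sort(key=lambda hit: hit[0], reverse=True)
        return scored[:k]

    def adaptive_retrieve(
        self, text: str, min_score: float = 0.3, max_k: int = 5
    ) -> List[Hit]:
        """Return up to max_k hits, keeping only those above min_score."""
        return [hit for hit in self.query(text, k=max_k) if hit[0] >= min_score]


kb = PentestKnowledgeBase()
kb.add("Tool manuals", "sqlmap -u URL --dbs enumerates databases on the target")
kb.add("Tool manuals", "hydra -l USER -P WORDLIST ssh://HOST brute-forces SSH logins")
hits = kb.adaptive_retrieve("sqlmap databases")
```

An empty list from `adaptive_retrieve` is the natural signal for the web-fallback path: if nothing in the KB clears the threshold, the agent falls back to Tavily.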
