LLMMap is an automated prompt injection testing framework inspired by sqlmap. It operates in a stateless fire-and-judge loop: each prompt is injected independently into one or more discovered injection points, the target response is diffed against a clean baseline, and an LLM judge scores whether the attacker-specified goal was achieved. Confirmed findings are emitted only after passing a reliability gate based on repeated independent probes.
llmmap/
__init__.py
cli.py # argparse-based argument parser and entry point
config.py # RuntimeConfig dataclass (all scan settings)
core/
__init__.py
scanner.py # Top-level scan pipeline (run_scan)
orchestrator.py # Main probe loop and finding confirmation
http_client.py # HTTP execution engine (urllib.request)
request_loader.py # Burp Suite XML and URL-based request loading
injection_points.py # Injection point discovery (Q/B/H/C/P)
request_mutator.py # Prompt injection into request copies
goal_judge.py # LLM-based goal achievement scoring
prompt_generator.py # LLM prompt generation from style templates
reliability.py # Retry logic and Wilson confidence interval
models.py # Evidence, Finding, ScanReport data types
ui.py # Console output (banner, markers, findings block)
run.py # Run workspace creation and metadata
audit.py # Audit trail logging
sensitive.py # Sensitive data detection in responses
pattern_detection.py # Response pattern matching heuristics
dataflow.py # Internal data-flow tracing
pivot.py # Pivot logic between injection points
conversation.py # Multi-turn conversation state management
oob.py # Out-of-band callback handling (reserved)
tap.py # Tree of Attacks with Pruning (reserved)
tap_roles.py # TAP attacker/judge role definitions (reserved)
tap_scoring.py # TAP branch scoring heuristics (reserved)
detectors/
__init__.py
hub.py # Detector consensus: heuristic + pattern + semantic
base.py # Abstract detector base class
judge.py # LLM judge detector implementation
semantic.py # Semantic similarity detector
llm/
__init__.py
client.py # Unified LLM client interface
providers.py # Ollama / OpenAI / Anthropic / Google adapters
payloads/
__init__.py
loader.py # YAML prompt pack loader
selector.py # Family-based depth selection by intensity
render.py # Template placeholder rendering
obfuscations.py # Base64 / homoglyph / leet / language-switch
schema.py # Prompt entry schema and validation
packs/ # Bundled YAML prompt packs (4 files)
modules/
__init__.py
mutation.py # Prompt mutation strategies
reporting/
__init__.py
writer.py # Scan report output (JSON, console)
utils/
__init__.py
logging.py # Thread-safe logging with ANSI colour formatting
-
Parse CLI and build config. The CLI in
cli.pyparses arguments and constructs aRuntimeConfigdataclass that carries every scan setting through the pipeline. -
Create run workspace. A unique
runs/<run_id>/directory is created and ametadata.jsonfile is written with scan parameters, timestamps, and target information. -
Load request. The target request is loaded from either a Burp Suite XML export (
-rflag) or a plain URL. The*marker in the request body, headers, or query string identifies injection locations. -
Fire baseline request. A clean (uninjected) request is sent to the target. The response is stored as the diff reference for all subsequent probes.
-
Check LLM backend connectivity. A lightweight probe confirms that the configured LLM provider is reachable before starting the scan.
-
Load and select prompts. All 227 techniques are loaded from four YAML packs. The selector applies family-based depth selection controlled by the
--intensitylevel, choosing N techniques per family. -
Pre-generate prompts. The LLM Generator renders every selected technique into a concrete prompt string using
ThreadPoolExecutor. Eachstyle_templateis populated with the user-supplied--goaland placeholder tokens ({{RUN_ID}},{{GOAL_PROMPT}},{{CANARY_TOKEN}},{{CANARY_URL}}). -
Probe loop. For each prompt and injection point combination:
- a. Inject the prompt into a copy of the base request.
- b. Fire the mutated request at the target and capture the response.
- c. Diff the response body against the stored baseline.
- d. The LLM Judge scores whether the response indicates goal achievement.
- e. If the score exceeds the detection threshold: execute up to 5 reliability retries, requiring 3 or more confirmations (Wilson confidence interval) to promote the candidate to a confirmed finding.
- f. If confirmed: emit the finding with prompt, technique, confidence, and evidence.
-
Report findings. Confirmed findings are printed in sqlmap-style output with parameter name, injection type, technique title, prompt text, and confidence score.
Both roles use the same LLM client configured by --provider and --model. Each call is stateless; no conversation history is carried between probes.
| Role | Purpose |
|---|---|
| Generator | Crafts a concrete prompt from a technique's style_template and the user-supplied --goal. |
| Judge | Evaluates whether the target's response satisfies the goal, returning a numeric score. |
Four YAML packs are bundled under llmmap/payloads/packs/, totalling 227 techniques across 18 attack families:
| Pack | Techniques | Focus |
|---|---|---|
pack_a_baseline.yaml |
14 | Direct prompt injection fundamentals (LLM01) |
pack_b_extended.yaml |
13 | Jailbreaks, role confusion, delimiter abuse |
pack_c_master_checklist.yaml |
109 | Broad OWASP LLM Top 10 coverage |
pack_d_full_coverage.yaml |
91 | Full-spectrum technique coverage |
Each entry defines: family, technique, template, style_template, risk, tags, stage, and obfuscations.
The DetectorHub aggregates multiple detection signals into a single confidence score:
- LLM Judge -- The primary signal. The judge detector prompts the LLM to evaluate whether the target response achieved the stated goal.
- Pattern matching -- Heuristic rules that look for known indicators in the response diff (e.g., canary tokens, data leakage markers).
- Semantic similarity -- Optional detector that measures embedding-level overlap between the response and expected goal output.
The hub computes a weighted consensus score. Candidates exceeding the --detector-threshold (default 0.6) proceed to reliability confirmation.
Every candidate finding is re-validated with --reliability-retries (default 5) independent probes. A finding is confirmed only when --confirm-threshold (default 3) of those retries also score above threshold. Confirmation statistics use the Wilson confidence interval to produce a lower-bound confidence estimate. This prevents hallucinated judge verdicts and transient server behaviour from surfacing as reported findings.
The llmmap/llm/ package exposes a unified LLMClient with adapters for four backends:
| Provider | Authentication | Default Model |
|---|---|---|
ollama |
None (local) | dolphin3:8b |
openai |
OPENAI_API_KEY |
gpt-4o-mini |
anthropic |
ANTHROPIC_API_KEY |
claude-sonnet-4-20250514 |
google |
GOOGLE_API_KEY |
gemini-2.0-flash |
All HTTP calls -- both to LLM providers and to the scan target -- use Python's stdlib urllib.request. There are no third-party HTTP dependencies.
| Setting | Default Value |
|---|---|
| Intensity | 1 |
| Detection threshold | 0.6 |
| Reliability retries | 5 |
| Confirm threshold | 3 |
| Max prompts | 25 |
| Thread workers | 1 |
| HTTP timeout | 10 s |
| Injection point types | Q, B, H, C, P |
| Safe mode | On |
The following modules are present in the codebase but are reserved for future versions and are not active in the v1.0.0 scan pipeline:
- TAP (Tree of Attacks with Pruning) --
tap.py,tap_roles.py,tap_scoring.py. Multi-turn adaptive attack strategy where the LLM iteratively refines prompts based on prior responses. Planned for a future release. - OOB (Out-of-Band) --
oob.py. Callback-based detection for blind injection scenarios where the target response does not directly reveal success. Planned for a future release.