PIF operates as a transparent HTTP reverse proxy. This document describes the request formats PIF expects, the responses it returns, and the endpoints it exposes.
GET /healthz
Response:
{"status":"ok"}
Returns HTTP 200 when PIF is running and ready to accept requests.
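A readiness probe can treat anything other than HTTP 200 with the body shown above as "not ready". The sketch below is a minimal illustration; the helper name `is_healthy` is hypothetical, and the strict body check is an assumption rather than documented PIF behavior.

```python
import json

def is_healthy(status_code, body):
    """Return True only for HTTP 200 with a JSON body of {"status": "ok"}."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Missing or non-JSON body: treat as unhealthy.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"
```

The function takes the status code and body as plain values so it can be reused with any HTTP client.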
GET /metrics
Exposes Prometheus metrics for traffic, scan latency, detections, score distribution, and rate-limit events.
Core metric names:
- pif_http_requests_total
- pif_scan_duration_seconds
- pif_injection_detections_total
- pif_detection_score
- pif_rate_limit_events_total
- pif_alert_events_total
- pif_alert_sink_deliveries_total
When alerting.enabled=true, PIF emits outbound alerts without exposing new inbound HTTP endpoints.
Initial event types:
- injection_blocked (immediate on block action)
- rate_limit_exceeded (window-aggregated per client key)
- scan_error (window-aggregated per client key)
Delivery model:
- Async queue + worker dispatcher
- Retry with exponential backoff and jitter
- Fail-open behavior (delivery failure never blocks proxy request handling)
- Sink execution order is sequential (webhook -> slack -> pagerduty when enabled)
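The retry step of the delivery model can be sketched as exponential backoff with full jitter. This is an illustration of the general technique, not PIF's dispatcher: the helper name `backoff_delays`, the base delay, the cap, and the attempt count are all assumptions chosen for the example.

```python
import random

def backoff_delays(attempts, base=0.5, cap=30.0, rng=None):
    """Return one sleep duration (seconds) per retry attempt.

    The delay ceiling doubles on each attempt up to `cap`; the actual
    delay is drawn uniformly from [0, ceiling] ("full jitter").
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0.0, ceiling))
    return delays

# Seeded generator so the example is repeatable.
delays = backoff_delays(5, rng=random.Random(42))
```

Jitter spreads retries from many queued events over time, so a recovering sink is not hit by all of them at once.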
Supported sinks:
- Generic webhook (alerting.webhook.*)
- Slack Incoming Webhook (alerting.slack.*)
- PagerDuty Events API v2 (alerting.pagerduty.*)
Generic webhook sends JSON payloads with the following contract:
{
"event_id": "evt-1741363854757000000-1",
"timestamp": "2026-03-07T12:30:54Z",
"event_type": "injection_blocked",
"action": "block",
"client_key": "203.0.113.10",
"method": "POST",
"path": "/v1/chat/completions",
"target": "https://api.openai.com",
"score": 0.92,
"threshold": 0.50,
"findings_count": 2,
"reason": "blocked_by_policy",
"sample_findings": [
{
"rule_id": "PIF-INJ-001",
"category": "prompt_injection",
"severity": 4,
"match": "ignore all previous instructions"
}
],
"aggregate_count": 1
}
Notes:
- sample_findings is capped at 3 entries.
- aggregate_count is used by aggregated events (rate_limit_exceeded, scan_error).
- When configured, the webhook sink sends Authorization: Bearer <token>.
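A receiver for the generic webhook might check the bearer token and the payload shape before acting on an alert. The sketch below assumes a hypothetical receiver; the function names and the choice of which fields to treat as required are illustrative and not part of the PIF contract.

```python
import hmac
import json

# Assumed minimal set of fields a receiver insists on (illustrative only).
REQUIRED_FIELDS = {"event_id", "timestamp", "event_type", "action", "client_key"}

def authorized(header_value, expected_token):
    """Check an 'Authorization: Bearer <token>' header in constant time."""
    prefix = "Bearer "
    if not header_value or not header_value.startswith(prefix):
        return False
    presented = header_value[len(prefix):].encode("utf-8")
    return hmac.compare_digest(presented, expected_token.encode("utf-8"))

def parse_alert(body):
    """Decode an alert payload and enforce the documented constraints."""
    event = json.loads(body)
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # sample_findings is documented as capped at 3 entries.
    if len(event.get("sample_findings", [])) > 3:
        raise ValueError("sample_findings exceeds 3 entries")
    return event
```

Comparing tokens with `hmac.compare_digest` avoids leaking how many leading characters matched.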
PagerDuty sink sends trigger-only Events API v2 payloads:
{
"routing_key": "your-routing-key",
"event_action": "trigger",
"payload": {
"summary": "pif injection_blocked action=block path=/v1/chat/completions reason=blocked_by_policy",
"source": "prompt-injection-firewall",
"severity": "critical",
"timestamp": "2026-03-08T01:02:03Z",
"component": "proxy",
"group": "pif",
"class": "security",
"custom_details": {
"event_id": "evt-1741395723000000000-1",
"event_type": "injection_blocked",
"action": "block",
"client_key": "203.0.113.10",
"method": "POST",
"path": "/v1/chat/completions",
"target": "https://api.openai.com",
"score": 0.92,
"threshold": 0.5,
"findings_count": 2,
"reason": "blocked_by_policy",
"aggregate_count": 1,
"sample_findings": []
}
}
}
PagerDuty severity mapping:
- injection_blocked -> critical
- scan_error -> error
- rate_limit_exceeded -> warning
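The mapping from a PIF alert event to a PagerDuty trigger payload can be sketched as below. The field values mirror the sample payload above; the function name `to_pagerduty`, the summary template, and the rejection of unknown event types are assumptions made for illustration.

```python
SEVERITY_BY_EVENT = {
    "injection_blocked": "critical",
    "scan_error": "error",
    "rate_limit_exceeded": "warning",
}

def to_pagerduty(event, routing_key):
    """Build a trigger-only Events API v2 body from a PIF alert event."""
    event_type = event["event_type"]
    if event_type not in SEVERITY_BY_EVENT:
        # Assumption: unknown event types are rejected rather than guessed.
        raise ValueError(f"unsupported event type: {event_type}")
    summary = "pif {} action={} path={} reason={}".format(
        event_type, event["action"], event["path"], event["reason"]
    )
    # The sample payload carries the timestamp at the top of "payload" only.
    details = {k: v for k, v in event.items() if k != "timestamp"}
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": "prompt-injection-firewall",
            "severity": SEVERITY_BY_EVENT[event_type],
            "timestamp": event["timestamp"],
            "component": "proxy",
            "group": "pif",
            "class": "security",
            "custom_details": details,
        },
    }
```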
When dashboard.enabled=true, PIF exposes a monitoring dashboard:
GET /dashboard
GET /api/dashboard/summary
GET /api/dashboard/metrics
GET /api/dashboard/rules
GET /api/dashboard/replays
GET /api/dashboard/replays/{id}
POST /api/dashboard/replays/{id}/rescan
- GET /dashboard serves embedded HTML/CSS/JS.
- GET /api/dashboard/summary returns high-level counters, uptime, p95 scan latency, and a safe runtime config snapshot.
- GET /api/dashboard/metrics returns normalized JSON metrics for UI polling (totals, label breakdowns, quantiles).
- GET /api/dashboard/rules returns loaded rule set metadata plus managed custom rules.
- GET /api/dashboard/replays returns replay event list (tenant, event_type, decision, payload_hash, findings, request metadata).
- GET /api/dashboard/replays/{id} returns full replay record.
- POST /api/dashboard/replays/{id}/rescan rescans captured prompts with the current detector (no upstream forwarding).
If dashboard.auth.enabled=true, both the UI and the dashboard API endpoints require Basic Auth. Requests without valid credentials receive:
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="pif-dashboard"
When dashboard.rule_management.enabled=true, write APIs are exposed under the dashboard rules endpoint:
POST /api/dashboard/rules
PUT /api/dashboard/rules/{id}
DELETE /api/dashboard/rules/{id}
Write policy:
- dashboard.enabled=false -> all dashboard routes return 404.
- dashboard.rule_management.enabled=false -> write routes return 404.
- dashboard.rule_management.enabled=true and dashboard.auth.enabled=false -> write routes return 403.
- dashboard.rule_management.enabled=true and valid Basic Auth -> writes allowed.
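The write policy reads as an ordered series of checks. The sketch below expresses it as a pure function; the name `write_route_status` is hypothetical, `None` stands for "request proceeds to the handler", and the 401 branch for invalid credentials is inferred from the Basic Auth behavior described earlier rather than stated in the policy itself.

```python
def write_route_status(dashboard_enabled, rule_mgmt_enabled, auth_enabled, credentials_valid):
    """Return the HTTP status that rejects a write request, or None if allowed."""
    if not dashboard_enabled:
        return 404  # all dashboard routes are hidden
    if not rule_mgmt_enabled:
        return 404  # write routes are hidden
    if not auth_enabled:
        return 403  # writes are refused when auth is off
    if not credentials_valid:
        return 401  # assumption: same challenge as other dashboard routes
    return None
```

The order matters: disabled features answer 404 before any auth check runs, so a probe cannot tell whether the feature exists.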
Payload format (POST and PUT):
{
"rule": {
"id": "PIF-CUSTOM-001",
"name": "Custom Rule",
"description": "Detects tenant-specific injection pattern",
"category": "prompt_injection",
"severity": 2,
"pattern": "(?i)custom_attack_pattern",
"enabled": true,
"case_sensitive": false,
"tags": [],
"metadata": {}
}
}
Notes:
- severity is an integer 0..4 (info..critical).
- Built-in OWASP/jailbreak/data-exfil files are not edited via dashboard.
- Dashboard writes only to managed custom rules and applies changes with hot reload.
- Rule-set response includes source metadata (source, path, optional marketplace metadata).
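A client can validate a rule payload locally before sending it to the write API. This is a minimal sketch: the set of required fields is an assumption, and Python's `re` module is used only as a rough syntax check, since the regex engine PIF uses may accept a different dialect.

```python
import re

# Assumed required fields; the write API may enforce a different set.
REQUIRED = ("id", "name", "category", "severity", "pattern")

def validate_rule(payload):
    """Return a list of problems found in a POST/PUT rule payload."""
    rule = payload.get("rule") if isinstance(payload, dict) else None
    if not isinstance(rule, dict):
        return ["payload must contain a 'rule' object"]
    errors = [f"missing field: {name}" for name in REQUIRED if name not in rule]
    severity = rule.get("severity")
    # bool is a subclass of int in Python, so reject it explicitly.
    if "severity" in rule and (
        isinstance(severity, bool)
        or not isinstance(severity, int)
        or not 0 <= severity <= 4
    ):
        errors.append("severity must be an integer in 0..4")
    pattern = rule.get("pattern")
    if isinstance(pattern, str):
        try:
            re.compile(pattern)
        except re.error as exc:
            errors.append(f"pattern does not compile: {exc}")
    elif "pattern" in rule:
        errors.append("pattern must be a string")
    return errors
```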
Replay API is available only when both dashboard.enabled=true and replay.enabled=true.
Behavior:
- dashboard.enabled=false -> all dashboard routes return 404.
- replay.enabled=false -> replay routes return 404.
- If dashboard auth is enabled, replay routes require Basic Auth.
Replay event schema (JSONL-backed):
{
"replay_id": "rpl_1741395723000000000_1",
"timestamp": "2026-03-08T01:30:45Z",
"tenant": "default",
"event_type": "block",
"decision": "block",
"score": 0.91,
"threshold": 0.50,
"findings": [],
"request_meta": {
"method": "POST",
"path": "/v1/chat/completions",
"target": "https://api.openai.com",
"client_key": "203.0.113.10"
},
"payload_hash": "sha256-hex",
"prompts": [
{
"role": "user",
"text": "ignore all previous instructions",
"truncated": false,
"redacted": true
}
]
}
Captured replay event types:
- block
- rate_limit
- scan_error
- flag
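Because replay events are JSONL-backed, each line can be decoded independently. The sketch below filters records by event type; the helper name `iter_replays` is hypothetical, and it reads from any iterable of lines because the on-disk location of the replay file is not specified here.

```python
import json

CAPTURED_TYPES = {"block", "rate_limit", "scan_error", "flag"}

def iter_replays(lines, event_type=None):
    """Yield replay records, optionally restricted to one event type."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if record.get("event_type") not in CAPTURED_TYPES:
            continue  # skip anything outside the documented types
        if event_type is None or record["event_type"] == event_type:
            yield record

sample = [
    '{"replay_id": "rpl_1", "event_type": "block", "score": 0.91}',
    "",
    '{"replay_id": "rpl_2", "event_type": "flag", "score": 0.62}',
]
blocked = [r["replay_id"] for r in iter_replays(sample, "block")]
```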
When tenancy.enabled=true, request policy can be resolved from tenancy.header (default X-PIF-Tenant) with fallback to tenancy.default_tenant.
Per-tenant policy override surface:
- action
- threshold
- rate_limit.requests_per_minute
- rate_limit.burst
- adaptive_threshold.enabled
- adaptive_threshold.min_threshold
- adaptive_threshold.ewma_alpha
Unknown tenant values fall back to default tenant policy.
Dashboard summary includes tenant breakdown for configured tenants.
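Tenant resolution with fallback can be sketched as follows. The function names, the flat-dictionary representation of policy, and the idea that a tenant override is layered on top of a base policy are assumptions for illustration; only the header name, the default-tenant fallback, and the override keys come from the description above.

```python
def resolve_tenant(headers, configured, header_name="X-PIF-Tenant", default_tenant="default"):
    """Pick the tenant from the request header, falling back to the default."""
    # HTTP header names are case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}
    candidate = lowered.get(header_name.lower(), "").strip()
    return candidate if candidate in configured else default_tenant

def effective_policy(base, tenants, tenant):
    """Overlay a tenant's overrides on the base policy (assumed merge model)."""
    policy = dict(base)
    policy.update(tenants.get(tenant, {}))
    return policy

base = {"action": "block", "threshold": 0.5, "rate_limit.requests_per_minute": 60}
tenants = {"default": {}, "acme": {"action": "flag", "threshold": 0.7}}

tenant = resolve_tenant({"x-pif-tenant": "acme"}, tenants)
policy = effective_policy(base, tenants, tenant)
```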
Marketplace is a CLI surface (no inbound HTTP routes):
pif marketplace list
pif marketplace install <id>@<version>
pif marketplace update
Contract:
- Catalog index (marketplace.index_url) exposes entries: id, name, version, download_url, sha256, categories, maintainer
- Install verifies checksum when marketplace.require_checksum=true
- Installed files are written to marketplace.install_dir and can be loaded as custom rules
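The checksum step of the contract amounts to hashing the downloaded bytes and comparing the result with the catalog entry's sha256 value. A minimal sketch, with the helper name `verify_sha256` being hypothetical and the normalization of the expected value (trimming, lowercasing) an assumption:

```python
import hashlib
import hmac

def verify_sha256(data, expected_hex):
    """Return True if the SHA-256 of `data` matches the expected hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    expected = expected_hex.strip().lower()
    # Constant-time comparison on bytes; also safe for non-ASCII input.
    return hmac.compare_digest(actual.encode("utf-8"), expected.encode("utf-8"))
```

An installer would refuse to write the file into the install directory when this returns False.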
POST /*
All POST requests are intercepted, scanned for prompt injection, and then forwarded to the upstream LLM API (or blocked). GET, PUT, DELETE, and other methods are forwarded without scanning.
PIF auto-detects the request format based on the JSON body structure and URL path.
{
"model": "gpt-4",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is the capital of France?"}
]
}
PIF extracts and scans the content field from each message in the messages array. Empty content fields are skipped.
{
"model": "claude-sonnet-4-20250514",
"max_tokens": 256,
"system": "You are a helpful assistant.",
"messages": [
{"role": "user", "content": "What is the capital of France?"}
]
}
PIF scans the optional system field (with role system) and each message's content in the messages array.
Auto-detection logic:
- If the URL path contains anthropic or messages and the body has a system field, PIF uses the Anthropic parser.
- Otherwise, PIF falls back to the OpenAI parser.
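The detection rule and the text extraction it feeds can be sketched as below. This mirrors the rule as worded above and is not PIF's parser: the function names are hypothetical, and the sketch handles only string-valued system and content fields, which is a simplifying assumption.

```python
def detect_format(path, body):
    """Apply the documented rule: path hint plus a system field means Anthropic."""
    if ("anthropic" in path or "messages" in path) and "system" in body:
        return "anthropic"
    return "openai"

def extract_texts(path, body):
    """Return (role, text) pairs that would be handed to the scanner."""
    texts = []
    if detect_format(path, body) == "anthropic":
        system = body.get("system")
        if isinstance(system, str) and system:
            texts.append(("system", system))
    for message in body.get("messages", []):
        content = message.get("content")
        # Empty content is skipped; non-string content is ignored in this sketch.
        if isinstance(content, str) and content:
            texts.append((message.get("role", ""), content))
    return texts

anthropic_body = {
    "system": "You are a helpful assistant.",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
}
pairs = extract_texts("/v1/messages", anthropic_body)
```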
When action=block and an injection is detected:
HTTP/1.1 403 Forbidden
Content-Type: application/json
{
"error": {
"message": "Request blocked by Prompt Injection Firewall",
"type": "prompt_injection_detected",
"score": 0.85,
"findings": 2
}
}
| Field | Type | Description |
|---|---|---|
| error.message | string | Human-readable error message |
| error.type | string | Always prompt_injection_detected |
| error.score | float | Threat score (0.0 - 1.0) |
| error.findings | int | Number of matched detection rules |
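A client behind PIF may want to tell a PIF block apart from a 403 issued by the upstream API. The sketch below keys on error.type for that purpose; the helper name `parse_block` is hypothetical, and treating any other 403 as "not a PIF block" is an assumption about how a client would want to behave.

```python
import json

def parse_block(status_code, body):
    """Return score and finding count for a PIF block response, else None."""
    if status_code != 403:
        return None
    try:
        error = json.loads(body).get("error", {})
    except (TypeError, ValueError, AttributeError):
        # Not JSON, or JSON that is not an object: not a PIF block response.
        return None
    if not isinstance(error, dict) or error.get("type") != "prompt_injection_detected":
        return None
    return {"score": error.get("score"), "findings": error.get("findings")}
```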
When client rate limits are enabled and exceeded:
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
{
"error": {
"message": "Request rate-limited by Prompt Injection Firewall",
"type": "rate_limit_exceeded"
}
}
When action=flag and an injection is detected, PIF forwards the request to the upstream API and adds headers to the response:
X-PIF-Flagged: true
X-PIF-Score: 0.85
| Header | Type | Description |
|---|---|---|
| X-PIF-Flagged | string | true when injection detected |
| X-PIF-Score | string | Threat score as decimal string |
The response body is the original upstream API response.
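On the client side, the flag headers can be read from any response header mapping. A minimal sketch, assuming a hypothetical helper `read_flag` and assuming that a missing or malformed score should be reported as None rather than raise:

```python
def read_flag(headers):
    """Return (flagged, score) from response headers; score is None if absent."""
    # HTTP header names are case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}
    flagged = lowered.get("x-pif-flagged", "").strip().lower() == "true"
    score = None
    if flagged:
        try:
            score = float(lowered.get("x-pif-score", ""))
        except ValueError:
            score = None  # header missing or not a decimal string
    return flagged, score
```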
When action=log, PIF forwards the request silently and logs the detection server-side. No headers are added and no modification is made to the response.
When no injection is detected, PIF forwards the request to the upstream API without modification and returns the upstream response as-is.
RESULT: INJECTION DETECTED (score: 0.85)
RULE ID          CATEGORY                 SEVERITY   MATCH
---------------- ------------------------ ---------- ----------------------------------------
PIF-INJ-001      prompt_injection         critical   "ignore all previous instructions..."
PIF-LLM07-001    system_prompt_leak       high       "reveal your system prompt"
2 finding(s) in 1.23ms
{
"clean": false,
"score": 0.85,
"findings": [
{
"rule_id": "PIF-INJ-001",
"category": "prompt_injection",
"severity": 4,
"description": "Detects attempts to override system instructions",
"matched_text": "ignore all previous instructions",
"offset": 0,
"length": 32
}
],
"detector_id": "ensemble",
"duration_ms": 1.23,
"input_hash": "a1b2c3d4..."
}
| Field | Type | Description |
|---|---|---|
| clean | bool | true if no injection detected |
| score | float | Threat score (0.0 - 1.0) |
| findings | array | List of matched rules |
| findings[].rule_id | string | Unique rule identifier |
| findings[].category | string | Attack category |
| findings[].severity | int | 0=info, 1=low, 2=medium, 3=high, 4=critical |
| findings[].matched_text | string | The text that triggered the rule |
| findings[].offset | int | Character offset in the input |
| findings[].length | int | Length of the matched text |
| detector_id | string | Detector that produced the result |
| duration_ms | float | Scan duration in milliseconds |
| input_hash | string | SHA-256 hash of the input text |
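A script consuming the JSON result can reduce it to a one-line summary using the severity scale from the table. The helper name `summarize` and the output wording are illustrative choices, not part of the tool.

```python
import json

# Index corresponds to findings[].severity (0=info .. 4=critical).
SEVERITY_NAMES = ["info", "low", "medium", "high", "critical"]

def summarize(result_json):
    """Turn a JSON scan result into a short human-readable line."""
    result = json.loads(result_json)
    if result["clean"]:
        return "clean"
    findings = result["findings"]
    worst = max((f["severity"] for f in findings), default=0)
    return "{} finding(s), worst severity: {}, score {:.2f}".format(
        len(findings), SEVERITY_NAMES[worst], result["score"]
    )
```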
| Code | Meaning |
|---|---|
| 0 | Clean -- no injection detected |
| 1 | Injection detected |
| 2 | Error (invalid input, missing rules, etc.) |
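In a CI pipeline, the exit code can drive a pass/fail gate. The sketch below assumes a fail-closed policy, where a scan error blocks the build just as a detection does; that policy and the helper names are assumptions, not behavior prescribed by the tool.

```python
EXIT_MEANINGS = {
    0: "clean",
    1: "injection detected",
    2: "error",
}

def describe_exit(code):
    """Map a documented exit code to its meaning."""
    if code not in EXIT_MEANINGS:
        raise ValueError(f"undocumented exit code: {code}")
    return EXIT_MEANINGS[code]

def should_fail_build(code):
    """Fail closed: anything other than a clean result fails the gate."""
    return code != 0
```

A more lenient gate could treat code 2 as a warning, at the cost of letting unscanned input through.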