🛡️ Real-time PII redaction proxy for MCP clients and servers: low-latency privacy for Python 3.12+, with optional Python 3.14 subinterpreter acceleration.
mcp-shield-pii is an intercepting gateway proxy that sits between your MCP client (e.g., Claude Desktop) and any downstream MCP server. It detects and masks Personally Identifiable Information (PII) in real time before it reaches the LLM's context window, supporting GDPR/HIPAA compliance with a single pip install.
When an AI agent requests data from an MCP server, the raw payload, which may contain SSNs, medical records, or credit card numbers, flows directly into the LLM. Organizations face potential GDPR/HIPAA fines that can reach hundreds of millions of dollars. mcp-shield-pii mitigates this risk at the protocol layer.
┌──────────────┐     ┌─────────────────┐     ┌──────────────────┐
│    Claude    │────▶│ mcp-shield-pii  │────▶│  Downstream MCP  │
│   Desktop    │◀────│ (PII Redaction) │◀────│      Server      │
└──────────────┘     └─────────────────┘     └──────────────────┘
                              ▲
                      PII masked before
                      reaching the LLM
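
The flow in the diagram can be sketched as a filter over JSON-RPC responses. This is a minimal illustration, not the project's implementation: `redact_text`, `redact_tool_result`, and the two patterns are hypothetical names, and the sketch assumes tool results carry their text in `result.content[*].text`, as in MCP's `CallToolResult`.

```python
import json
import re

# Hypothetical patterns for illustration only; the real engine covers more types.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_text(text: str) -> str:
    """Replace every pattern match with a fixed placeholder."""
    for pattern in PATTERNS.values():
        text = pattern.sub("<REDACTED>", text)
    return text


def redact_tool_result(message: dict) -> dict:
    """Mask text items in a JSON-RPC response; pass everything else through."""
    result = message.get("result")
    if not isinstance(result, dict):
        return message  # requests, notifications, and errors are left untouched
    for item in result.get("content", []):
        if isinstance(item, dict) and item.get("type") == "text":
            if isinstance(item.get("text"), str):
                item["text"] = redact_text(item["text"])
    return message


raw = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [
        {"type": "text", "text": "Contact john@example.com, SSN 123-45-6789"}
    ]},
})
shielded = redact_tool_result(json.loads(raw))
print(shielded["result"]["content"][0]["text"])
# Contact <REDACTED>, SSN <REDACTED>
```

Only messages with a `result` object are rewritten, so client requests travelling in the other direction pass through unchanged.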
pip install mcp-shield-pii

For NLP-based detection (names, organizations, addresses):
pip install mcp-shield-pii[nlp]
python -m spacy download en_core_web_sm

# Simple scan
mcp-shield-pii scan "Contact john@example.com, SSN 123-45-6789"
# JSON output
mcp-shield-pii scan --json "Patient MRN-123456 at 192.168.1.1"
# Different masking strategies
mcp-shield-pii scan --strategy partial "Card: 4111-1111-1111-1111"
mcp-shield-pii scan --strategy hash "Email: secret@corp.com"
mcp-shield-pii scan --strategy pseudo "Call 555-123-4567"

# Basic proxy (stdio transport)
mcp-shield-pii proxy --downstream "npx -y @modelcontextprotocol/server-postgres postgresql://localhost/mydb"
# With config file
mcp-shield-pii proxy --downstream "python my_server.py" --config shield.toml
# Dry-run mode (log detections, don't modify payloads)
mcp-shield-pii proxy --downstream "npx my-mcp-server" --dry-run

Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "my-server-shielded": {
      "command": "mcp-shield-pii",
      "args": [
        "proxy",
        "--downstream", "npx -y @modelcontextprotocol/server-postgres postgresql://localhost/mydb",
        "--config", "/path/to/shield.toml"
      ]
    }
  }
}

mcp-shield-pii generate-config --output shield.toml

mcp-shield-pii report --format markdown --output compliance_report.md

mcp-shield-pii dashboard --port 8765
# Open http://127.0.0.1:8765

| Feature | Description |
|---|---|
| Stdio Proxy | Intercepts MCP stdio transport between client and downstream server |
| Regex Engine (18 types) | Detects SSNs, credit cards, emails, phones, IBANs, API keys, JWTs, and more |
| NLP Engine | Optional spaCy NER for person names, organizations, locations, addresses |
| Masking Strategies | redact (<REDACTED>), partial (***-**-6789), hash (SHA256:a1b2...), pseudo (consistent fakes) |
| TOML Configuration | Per-entity rules, per-tool allow/deny lists, confidence thresholds |
| CallToolResult Interception | Targets JSON-RPC responses while passing non-sensitive RPCs through |
| Audit Trail | JSONL audit log with timestamps, entity types, confidence scores |
| CLI | proxy, scan, report, dashboard, generate-config, version |
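
To make the masking strategies in the table concrete, here is a rough sketch of the redact, partial, and hash variants. The function names and the choice to keep the last four characters are assumptions for illustration; the shipped strategies may format their output differently.

```python
import hashlib


def redact(value: str) -> str:
    """Replace the whole value with a fixed placeholder."""
    return "<REDACTED>"


def partial(value: str, visible: int = 4) -> str:
    """Mask every alphanumeric character except the last `visible` ones."""
    total = sum(ch.isalnum() for ch in value)
    seen = 0
    out = []
    for ch in value:
        if ch.isalnum():
            seen += 1
            out.append(ch if seen > total - visible else "*")
        else:
            out.append(ch)  # keep separators so the format stays recognizable
    return "".join(out)


def hashed(value: str, length: int = 8) -> str:
    """Replace the value with a truncated SHA-256 digest."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"SHA256:{digest[:length]}"


print(redact("123-45-6789"))           # <REDACTED>
print(partial("123-45-6789"))          # ***-**-6789
print(partial("4111-1111-1111-1111"))  # ****-****-****-1111
print(hashed("secret@corp.com"))
```

A truncated hash lets equal values be correlated across payloads without revealing them, whereas a fixed placeholder discards that link entirely.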

| Feature | Description |
|---|---|
| Context-Aware Scoring | Reduces false positives by analyzing surrounding text |
| Confidence Thresholds | Per-entity-type configurable minimum confidence |
| Tool Allow/Deny Lists | Skip trusted tools, enforce strict mode on sensitive ones |
| Dry-Run Mode | Log what would be redacted without modifying payloads |
| Hot-Reload Config | Change rules without restarting the proxy |
| Prometheus Metrics | /metrics endpoint with latency percentiles and entity counters |
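
Context-aware scoring and per-entity thresholds work together: a match is kept only if its adjusted confidence clears the threshold for its type. The sketch below illustrates that idea under stated assumptions. The `Detection` class, cue words, window size, and boost value are all hypothetical; only the threshold values (0.8 for SSN, 0.7 by default) come from the sample configuration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Detection:
    entity_type: str
    start: int
    end: int
    confidence: float


# Hypothetical cue words; a real scorer would likely use richer signals.
CONTEXT_CUES = {"SSN": ("ssn", "social security"), "PHONE": ("call", "phone", "tel")}
THRESHOLDS = {"SSN": 0.8}   # per-entity minimum confidence
DEFAULT_THRESHOLD = 0.7


def score_with_context(text: str, det: Detection, window: int = 30, boost: float = 0.2) -> Detection:
    """Raise confidence when a cue word appears shortly before the match."""
    before = text[max(0, det.start - window):det.start].lower()
    if any(cue in before for cue in CONTEXT_CUES.get(det.entity_type, ())):
        return replace(det, confidence=min(1.0, det.confidence + boost))
    return det


def passes_threshold(det: Detection) -> bool:
    return det.confidence >= THRESHOLDS.get(det.entity_type, DEFAULT_THRESHOLD)


labelled = "SSN 123-45-6789"
unlabelled = "Order 123-45-6789 shipped"
hit_a = score_with_context(labelled, Detection("SSN", 4, 15, 0.7))
hit_b = score_with_context(unlabelled, Detection("SSN", 6, 17, 0.7))
print(passes_threshold(hit_a))  # True: the "SSN" label boosts confidence
print(passes_threshold(hit_b))  # False: no cue, so 0.7 stays below 0.8
```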

| Feature | Description |
|---|---|
| Pseudo-Anonymization | Consistent fake-data mapping preserving semantic meaning |
| Reversible Redaction | AES-256 encrypted mapping; authorized key-holders can restore originals |
| Compliance Dashboard | Dark-mode web UI with real-time event table and severity badges |
| GDPR/HIPAA Reports | Auto-generated compliance reports (text, JSON, markdown) |
| Webhook Alerts | Notify Slack/Teams when high-severity PII is detected |
| Subinterpreter Pool | GIL-free parallel detection via concurrent.interpreters (3.14+) or ProcessPoolExecutor (3.12+) |
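
Consistent pseudo-anonymization means the same input always maps to the same fake value, so relationships in the data survive masking. One way to get that property is a keyed hash, sketched below. The name list, the `pseudo_email` helper, and the use of HMAC are assumptions for illustration, not a description of the project's mapping.

```python
import hashlib
import hmac

# Hypothetical pool of replacement names.
FAKE_NAMES = ["alex", "blake", "casey", "devon", "emery", "finley"]


def pseudo_email(value: str, secret: bytes) -> str:
    """Map an email to a stable fake address derived from a keyed hash."""
    digest = hmac.new(secret, value.lower().encode("utf-8"), hashlib.sha256).digest()
    name = FAKE_NAMES[digest[0] % len(FAKE_NAMES)]
    suffix = digest[1:3].hex()  # extra characters reduce collisions between inputs
    return f"{name}.{suffix}@example.com"


key = b"per-deployment secret"  # assumption: kept out of logs and payloads
first = pseudo_email("alice@corp.com", key)
second = pseudo_email("alice@corp.com", key)
print(first == second)  # True: the mapping is deterministic for a given key
```

Keying the hash means the fake values cannot be recomputed by someone who only knows the candidate inputs, which a plain unsalted hash would allow.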

| Entity | Example | Validation |
|---|---|---|
| Email | user@example.com | Regex |
| Phone | +1-555-123-4567 | Regex |
| SSN | 123-45-6789 | Regex + format validation |
| Credit Card | 4111-1111-1111-1111 | Regex + Luhn checksum |
| IBAN | DE89370400440532013000 | Regex + country-code length |
| IPv4 | 192.168.1.1 | Regex |
| IPv6 | 2001:0db8::1 | Regex |
| MAC Address | 00:1A:2B:3C:4D:5E | Regex |
| AWS API Key | AKIA... | Regex (prefix) |
| OpenAI Key | sk-... | Regex (prefix) |
| Stripe Key | sk_live_... | Regex (prefix) |
| GitHub Token | ghp_... | Regex (prefix) |
| Passport | A12345678 | Regex |
| Date of Birth | 1990-01-15 | Regex |
| Medical ID | MRN-123456 | Regex |
| Driver's License | D123-4567-8901 | Regex |
| URL with Auth | https://user:pass@host | Regex |
| JWT Token | eyJhbG... | Regex (prefix) |

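
The Luhn checksum listed for credit cards is a standard algorithm, and a regex match that fails it can be discarded as a false positive. A minimal sketch follows; the `luhn_valid` name and the 13-digit minimum length are assumptions made here, not taken from the project.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` satisfy the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:  # assumption: treat shorter digit runs as non-cards
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111-1111-1111-1111"))  # True
print(luhn_valid("4111-1111-1111-1112"))  # False
```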
| Entity | Example |
|---|---|
| Person Name | John Smith |
| Organization | Acme Corp |
| Address | 123 Main St, Springfield |
| Location | New York City |
| Medical Condition | Type 2 diabetes |
[shield]
default_masking_strategy = "redact"
default_confidence_threshold = 0.7
dry_run = false
[detection]
enable_regex = true
enable_nlp = false
enable_context_scoring = true
[entities.SSN]
masking_strategy = "redact"
confidence_threshold = 0.8
[entities.EMAIL]
masking_strategy = "pseudo"
confidence_threshold = 0.7
[tools.trusted_internal_tool]
action = "skip"
[tools.patient_records_api]
action = "strict"
masking_strategy = "redact"
[[webhooks]]
url = "https://hooks.slack.com/services/YOUR/WEBHOOK"
events = ["high_severity"]
[dashboard]
enabled = true
port = 8765
[metrics]
enabled = true
port = 9090

from mcp_shield_pii.detection.regex_engine import RegexDetectionEngine
from mcp_shield_pii.masking.strategies import get_strategy
from mcp_shield_pii.pipeline import ShieldPipeline
from mcp_shield_pii.config.loader import ShieldConfig
# Simple detection
engine = RegexDetectionEngine()
results = engine.detect("Email john@corp.com, SSN 123-45-6789")
for r in results:
    print(f"{r.entity_type.value}: '{r.text}' (confidence: {r.confidence:.0%})")
# Full pipeline
config = ShieldConfig(default_masking_strategy="partial")
pipeline = ShieldPipeline(config)
masked, summary = pipeline.process_text("Contact admin@secret.org, card 4111-1111-1111-1111")
print(masked) # "Contact a***@***.org, card ****-****-****-1111"
pipeline.close()
# Pseudo-anonymization
config = ShieldConfig(default_masking_strategy="pseudo")
pipeline = ShieldPipeline(config)
masked, _ = pipeline.process_text("Email alice@corp.com then alice@corp.com again")
print(masked) # Same fake email both times (consistent mapping)
pipeline.close()

src/mcp_shield_pii/
├── __init__.py              # Public API exports
├── cli.py                   # Typer CLI (6 commands)
├── pipeline.py              # Orchestration: detect → score → filter → mask → audit
├── compliance.py            # GDPR/HIPAA report generator
├── webhooks.py              # Async webhook alerts
├── detection/
│   ├── base.py              # EntityType enum, DetectionResult, protocols
│   ├── regex_engine.py      # 18 regex patterns + Luhn/IBAN validation
│   ├── nlp_engine.py        # spaCy NER detection (optional)
│   └── context_scorer.py    # Context-aware confidence adjustment
├── masking/
│   ├── strategies.py        # Redact, partial, hash, pseudo-anonymization
│   └── reversible.py        # AES-256 Fernet reversible redaction
├── config/
│   ├── loader.py            # TOML config parser
│   └── watcher.py           # Hot-reload file watcher
├── proxy/
│   ├── __init__.py          # MCP JSON-RPC interceptor
│   └── stdio_proxy.py       # Bidirectional stdio transport
├── concurrency/
│   └── __init__.py          # Subinterpreter pool + ProcessPool fallback
├── metrics/
│   └── __init__.py          # Prometheus metrics + HTTP server
├── audit/
│   └── __init__.py          # JSONL audit logger
└── dashboard/
    └── __init__.py          # Web UI + REST API
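
The stages named for `pipeline.py` (detect, score, filter, mask, audit) can be sketched end to end in a few lines. Everything below is a simplified stand-in: the function names, the dictionary-based hits, the confidence values, and the audit record fields are assumptions, and the real pipeline is organized across the modules listed above.

```python
import io
import json
import re
import time

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def detect(text):
    # Each hit starts with a hypothetical base confidence of 0.7.
    return [{"type": "SSN", "start": m.start(), "end": m.end(), "confidence": 0.7}
            for m in SSN.finditer(text)]


def score(text, hits):
    # Boost hits that are preceded by an explicit "SSN" label.
    for hit in hits:
        if "ssn" in text[max(0, hit["start"] - 20):hit["start"]].lower():
            hit["confidence"] = 0.95
    return hits


def process(text, audit_log, threshold=0.8):
    # Filter: keep only hits at or above the threshold.
    hits = [h for h in score(text, detect(text)) if h["confidence"] >= threshold]
    # Mask: replace from the end so earlier offsets stay valid.
    for hit in sorted(hits, key=lambda h: h["start"], reverse=True):
        text = text[:hit["start"]] + "<REDACTED>" + text[hit["end"]:]
    # Audit: one JSON object per line, never containing the raw value.
    for hit in hits:
        record = {"ts": time.time(), "entity_type": hit["type"],
                  "confidence": hit["confidence"]}
        audit_log.write(json.dumps(record) + "\n")
    return text


log = io.StringIO()
print(process("SSN 123-45-6789, order 987-65-4321", log))
# SSN <REDACTED>, order 987-65-4321
```

The second number is left alone because nothing in its preceding context marks it as an SSN, so it stays below the threshold and produces no audit record.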
See CONTRIBUTING.md
MIT. See LICENSE for details.