GovernsAI Precheck is a policy evaluation and PII redaction service that provides real-time policy decisions and data sanitization for AI tool interactions. The service supports per-tool PII access controls, allowing different tools to have specific rules for handling sensitive data.
- Payload-based policies: Policies sent by agent/client in request payload
- Tool-based policies: Different rules for different AI tools
- Scope-based policies: Network scope restrictions (e.g., `net.external`)
- PII detection: Advanced PII detection using Presidio with fallback to regex
- Real-time decisions: Fast policy evaluation with sub-second response times
- Agent-side policy management: Policies fetched and managed by agent, not precheck service
- Tool-specific allowlists: Configure which PII types each tool can access
- Transform actions: Support for `pass_through`, `tokenize`, `confirm`, and default redaction
- Stable tokenization: HMAC-based consistent token generation
- Dynamic policy configuration: Policies sent in request payload for real-time updates
- Backward compatibility: Falls back to strict PII blocking (SSN and passwords only) when no payload provided
- Presidio integration: Advanced NLP-based PII detection
- Fallback detection: Regex-based detection when Presidio unavailable
- Multiple PII types: Email, SSN, phone numbers, credit cards, API keys, JWT tokens
- False positive filtering: Context-aware filtering to reduce false positives
- Webhook events: Real-time policy decision events with HMAC authentication
- Dead Letter Queue (DLQ): Failed webhook deliveries stored in JSONL format
- Retry logic: Exponential backoff with configurable retry attempts
- Event schema: Versioned event format for backward compatibility
- Configurable error behavior: `block`, `pass`, or `best_effort` modes
- Graceful degradation: Fallback strategies when policy evaluation fails
- HTTP status codes: Proper error responses with structured reasons
- Structured JSON logs: One line per request for easy parsing
- Provable governance: Complete audit trail without database dependency
- Log shipping ready: Compatible with Loki/Datadog ingestion
- Real-time budget tracking: Track LLM and purchase costs per user
- Cost estimation: Automatic cost estimation based on model and text length
- Budget enforcement: Block requests that exceed budget limits
- Purchase detection: Extract purchase amounts from tool metadata
- Budget warnings: Require confirmation when approaching budget limits
- Monthly budget reset: Automatic budget reset at month boundaries
- Live policy updates: Reload policies without service restart
- File change detection: Automatic reload when policy file is modified
- Global defaults: Organization-wide policy stance configuration
The precheck service now supports dynamic policy evaluation where policies are sent by the agent/client in the request payload, rather than being loaded from static YAML files. This enables real-time policy updates and user-specific policy management.
{
"tool": "send_email",
"raw_text": "Send email to john.doe@company.com with SSN 123-45-6789",
"scope": "net.external",
"corr_id": "req-12345",
"policy_config": {
"version": "v1",
"defaults": {
"ingress": {"action": "redact"},
"egress": {"action": "redact"}
},
"tool_access": {
"send_email": {
"direction": "ingress",
"action": "redact",
"allow_pii": {
"PII:email_address": "pass_through",
"PII:us_ssn": "tokenize"
}
}
},
"deny_tools": ["python.exec", "bash.exec", "code.exec", "shell.exec"],
"network_scopes": ["net."],
"network_tools": ["web.", "http.", "fetch.", "request."],
"on_error": "block"
},
"tool_config": {
"tool_name": "send_email",
"scope": "net.external",
"direction": "ingress",
"metadata": {
"category": "communication",
"risk_level": "medium"
}
}
}

- Agent fetches policies from database (user/org-specific)
- Agent sends policies in request payload to precheck service
- Precheck evaluates using payload policies (no database queries)
- Precheck returns decision with transformed text
- Agent handles the response and proceeds with tool execution
- 🚀 Performance: No database queries in precheck service
- 🔄 Real-time: Policies updated instantly without service restart
- 🔒 Security: User/org-specific policies
- 📈 Scalable: Precheck service stays lightweight
- 🔄 Backward Compatible: Falls back to YAML if no payload provided
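The request flow above can be sketched from the agent's side. This is a minimal illustration, not the project's client code: `build_precheck_request` is a hypothetical helper, and `user_policy` stands in for whatever the agent loads from its own policy store. Only the field names come from the request example in this document.

```python
import json
from typing import Any, Optional


def build_precheck_request(
    tool: str,
    raw_text: str,
    scope: str,
    corr_id: str,
    policy_config: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a precheck request body (hypothetical helper)."""
    body: dict[str, Any] = {
        "tool": tool,
        "raw_text": raw_text,
        "scope": scope,
        "corr_id": corr_id,
    }
    # Leaving policy_config out produces the legacy request shape,
    # which the service handles with its fallback behavior.
    if policy_config is not None:
        body["policy_config"] = policy_config
    return body


# Stand-in for policies the agent fetched for this user/org.
user_policy = {
    "version": "v1",
    "defaults": {"ingress": {"action": "redact"}, "egress": {"action": "redact"}},
    "tool_access": {
        "send_email": {
            "direction": "ingress",
            "allow_pii": {
                "PII:email_address": "pass_through",
                "PII:us_ssn": "tokenize",
            },
        }
    },
    "deny_tools": ["python.exec", "bash.exec"],
    "on_error": "block",
}

request_body = build_precheck_request(
    tool="send_email",
    raw_text="Send email to john.doe@company.com with SSN 123-45-6789",
    scope="net.external",
    corr_id="req-12345",
    policy_config=user_policy,
)
print(json.dumps(request_body, indent=2))
```

The resulting dictionary can be posted as JSON to `/api/v1/precheck` with any HTTP client.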
The complete API specification is available in OpenAPI 3.1.0 format:
- File: `openapi.json` (included in repository)
- Interactive Docs: `http://localhost:8080/api/docs` (Swagger UI)
- Alternative Docs: `http://localhost:8080/api/redoc` (ReDoc)
- Schema: `http://localhost:8080/api/openapi.json` (JSON schema)
The precheck service returns one of four decision types:
- `allow`: Tool execution is permitted without modification
- `deny`: Tool execution is blocked (e.g., dangerous tools, policy violations)
- `transform`: Tool execution is permitted but text is modified (e.g., PII redaction, tokenization)
- `confirm`: User confirmation is required before tool execution (e.g., sensitive operations)
All API responses include a policy_id field that indicates which precedence level was applied:
- `deny-exec`: DENY_TOOLS level (highest priority)
- `tool-access`: TOOL_SPECIFIC level
- `defaults`: GLOBAL_DEFAULTS level
- `net-redact-presidio` or `net-redact-regex`: NETWORK_SCOPE level
- `strict-fallback`: STRICT_FALLBACK level (lowest priority)
GET /api/v1/health
Response:
{
"ok": true,
"service": "governsai-precheck",
"version": "0.1.0"
}

GET /api/v1/ready
Purpose: Comprehensive readiness check for Kubernetes probes and service validation
Response:
{
"ready": true,
"service": "governsai-precheck",
"version": "0.1.0",
"checks": {
"presidio": {"status": "ok", "message": "..."},
"policy": {"status": "ok", "message": "..."},
"policy_file": {"status": "ok", "message": "..."},
"environment": {"status": "ok", "message": "..."},
"dlq": {"status": "ok", "message": "..."}
},
"timestamp": 1758812000
}

GET /api/metrics
Purpose: Prometheus metrics endpoint for monitoring and alerting
Response: Prometheus text format with counters, histograms, and gauges
POST /api/v1/precheck
Purpose: Evaluate policy and sanitize raw text before tool execution
Authentication: API key passed via X-Governs-Key header (forwarded to WebSocket for authentication)
User ID Handling:
- Optional in request payload (`user_id` field)
- If not provided, extracted from webhook URL path (`/ws/v1/u/{user_id}/precheck`)
- Used for rate limiting and audit logging
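The lookup order described above can be sketched as follows. This is an illustration under assumptions, not the service's implementation: `resolve_user_id` is a hypothetical name, and the regular expression simply mirrors the `/ws/v1/u/{user_id}/precheck` path shape mentioned in the list.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Mirrors the documented path shape /ws/v1/u/{user_id}/precheck.
_USER_PATH = re.compile(r"/ws/v1/u/(?P<user_id>[^/]+)/precheck")


def resolve_user_id(payload: dict, webhook_url: Optional[str]) -> Optional[str]:
    """Prefer user_id from the payload, then fall back to the webhook URL path."""
    explicit = payload.get("user_id")
    if explicit:
        return explicit
    if not webhook_url:
        return None
    match = _USER_PATH.search(urlparse(webhook_url).path)
    return match.group("user_id") if match else None
```

Returning `None` when neither source yields an ID leaves it to the caller to decide how to rate limit, for example by API key.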
Request (with dynamic policies):
{
"tool": "verify_identity",
"scope": "net.external",
"raw_text": "User email: alice@example.com, SSN: 123-45-6789",
"corr_id": "req-123",
"user_id": "cmfzriaip0000fyp81gjfkri9",
"tags": ["urgent", "customer"],
"policy_config": {
"version": "v1",
"defaults": {
"ingress": {"action": "redact"},
"egress": {"action": "redact"}
},
"tool_access": {
"verify_identity": {
"direction": "ingress",
"allow_pii": {
"PII:email_address": "pass_through",
"PII:us_ssn": "tokenize"
}
}
},
"deny_tools": ["python.exec", "bash.exec"],
"on_error": "block"
}
}

Request (legacy - falls back to YAML):
{
"tool": "verify_identity",
"scope": "net.external",
"raw_text": "User email: alice@example.com, SSN: 123-45-6789",
"corr_id": "req-123",
"user_id": "cmfzriaip0000fyp81gjfkri9",
"tags": ["urgent", "customer"]
}

Response:
{
"decision": "transform",
"raw_text_out": "User email: alice@example.com, SSN: pii_8797942a",
"reasons": [
"pii.allowed:PII:email_address",
"pii.tokenized:PII:us_ssn"
],
"policy_id": "tool-access",
"ts": 1758745697
}

POST /api/v1/postcheck
Purpose: Validate and sanitize raw text after tool execution (egress)
Authentication: API key passed via X-Governs-Key header (forwarded to WebSocket for authentication)
User ID Handling: Same as precheck endpoint
Request/Response: Same format as precheck
GET /api/v1/health
Response:
{
"ok": true,
"service": "governsai-precheck",
"version": "0.1.0"
}

GET /api/v1/ready
Purpose: Comprehensive readiness check for Kubernetes probes and service validation
Response:
{
"ready": true,
"service": "governsai-precheck",
"version": "0.1.0",
"checks": {
"presidio": {
"status": "ok",
"message": "Presidio analyzer and anonymizer initialized"
},
"policy": {
"status": "ok",
"message": "Policy loaded with 3 sections"
},
"policy_file": {
"status": "ok",
"message": "Policy file exists: policy.tool_access.yaml"
},
"environment": {
"status": "ok",
"message": "Environment variables: {'PII_TOKEN_SALT': 'ok', 'ON_ERROR': 'ok'}"
},
"dlq": {
"status": "ok",
"message": "DLQ directory accessible: /tmp"
}
},
"timestamp": 1758812000
}

Readiness Checks:
- Presidio: Analyzer and anonymizer initialization status
- Policy: Policy file parsing and validation
- Policy File: File existence and accessibility
- Environment: Critical environment variables availability
- DLQ: Dead letter queue directory accessibility
Status Values:
- "ok": Check passed successfully
- "warning": Check passed with warnings
- "error": Check failed, service not ready
- "disabled": Check not applicable (e.g., Presidio disabled)
GET /api/metrics
Purpose: Prometheus metrics endpoint for monitoring and alerting
Response: Prometheus text format with counters, histograms, and gauges
Key Metrics:
- `precheck_requests_total{user_id, tool, decision, policy_id}` - Total precheck requests
- `postcheck_requests_total{user_id, tool, decision, policy_id}` - Total postcheck requests
- `pii_detections_total{pii_type, action}` - Total PII detections
- `policy_evaluations_total{tool, direction, policy_id}` - Total policy evaluations
- `webhook_events_total{event_type, status}` - Total webhook events emitted
- `dlq_events_total{error_type}` - Total DLQ events
- `precheck_duration_seconds{user_id, tool}` - Precheck request duration
- `postcheck_duration_seconds{user_id, tool}` - Postcheck request duration
- `policy_evaluation_duration_seconds{tool, policy_id}` - Policy evaluation duration
- `pii_detection_duration_seconds{pii_type}` - PII detection duration
- `webhook_duration_seconds{status}` - Webhook request duration
- `active_requests{endpoint}` - Number of active requests
- `policy_cache_size` - Number of policies in cache
- `dlq_size` - Number of events in DLQ
- `precheck_service_info{version, build_date, git_commit}` - Service information
Every policy decision emits a webhook event with the following schema:
{
"type": "INGEST",
"channel": "org:YOUR_ORG:decisions",
"schema": "decision.v1",
"idempotencyKey": "precheck-1735229123456-abc123def",
"data": {
"orgId": "YOUR_ORG_ID",
"direction": "precheck",
"decision": "transform",
"tool": "verify_identity",
"scope": "net.external",
"rawText": "User email: alice@example.com, SSN: 123-45-6789",
"rawTextOut": "User email: alice@example.com, SSN: pii_8797942a",
"reasons": ["pii.allowed:PII:email_address","pii.tokenized:PII:us_ssn"],
"detectorSummary": {
"reasons": ["pii.allowed:PII:email_address","pii.tokenized:PII:us_ssn"],
"confidence": 0.85,
"piiDetected": ["email_address", "us_ssn"]
},
"payloadHash": "sha256:a1b2c3d4e5f6...",
"latencyMs": 45,
"correlationId": "req-123",
"tags": ["production", "api-call"],
"ts": "2024-12-26T10:15:30.123Z",
"authentication": {
"userId": "u1",
"apiKey": "GAI_LOCAL_DEV_ABC"
}
}
}

- `type`: Event type, always `"INGEST"` for decision events
- `channel`: WebSocket channel format `"org:{ORG_ID}:decisions"`
- `schema`: Schema version, currently `"decision.v1"`
- `idempotencyKey`: Unique key for deduplication (format: `{direction}-{timestamp}-{correlation_id}`)
- `orgId`: Organization identifier (configurable via `ORG_ID` environment variable)
- `direction`: `"precheck"` for ingress, `"postcheck"` for egress
- `decision`: Policy decision (`allow`, `deny`, `transform`, `confirm`)
- `tool`: Tool name from the request
- `scope`: Network scope from the request
- `rawText`: Original raw text input from the request
- `rawTextOut`: Processed text output with redacted values replaced
- `reasons`: Array of reason codes explaining the decision
- `detectorSummary`: PII detection results and confidence
  - `reasons`: Array of reason codes explaining the decision
  - `confidence`: Calculated confidence score (0.0-1.0) based on PII detection actions
  - `piiDetected`: Array of detected PII types (e.g., `["email_address", "us_ssn"]`)
- `payloadHash`: SHA256 hash of the request payload for integrity verification
- `latencyMs`: Processing time in milliseconds
- `correlationId`: Correlation ID for request tracking
- `tags`: Array of strings for categorization (currently empty, configurable)
- `ts`: ISO 8601 timestamp of the decision
- `authentication`: Authentication information from the request
  - `userId`: User ID extracted from the URL path parameter
  - `apiKey`: API key from the request header
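Two of the derived fields, `idempotencyKey` and `payloadHash`, can be sketched as follows. The function names are hypothetical. The millisecond timestamp is an assumption based on the 13-digit value in the example event, and hashing the UTF-8 raw text is an assumption about what "request payload" covers.

```python
import hashlib
import time
from typing import Optional


def make_idempotency_key(
    direction: str, correlation_id: str, ts_ms: Optional[int] = None
) -> str:
    """Build a key in the documented {direction}-{timestamp}-{correlation_id} format."""
    if ts_ms is None:
        # Assumption: the timestamp is Unix time in milliseconds.
        ts_ms = int(time.time() * 1000)
    return f"{direction}-{ts_ms}-{correlation_id}"


def make_payload_hash(raw_text: str) -> str:
    """Return a 'sha256:'-prefixed hex digest of the raw text (assumed UTF-8)."""
    return "sha256:" + hashlib.sha256(raw_text.encode("utf-8")).hexdigest()


print(make_idempotency_key("precheck", "req-123", ts_ms=1735229123456))
print(make_payload_hash("User email: alice@example.com, SSN: 123-45-6789"))
```

A consumer can use the key to drop duplicate deliveries and recompute the hash to check that the text was not altered in transit.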
The service uses WebSocket connections to send webhook events instead of HTTP POST requests:
URL Format: ws://host:port/api/ws/gateway?key=API_KEY&org=ORG_ID&channels=CHANNEL_LIST
Example:
ws://172.16.10.59:3002/api/ws/gateway?key=gai_827eode3nxa&org=dfy&channels=org:cmfzriajm0003fyp86ocrgjoj:decisions,org:cmfzriajm0003fyp86ocrgjoj:postcheck,org:cmfzriajm0003fyp86ocrgjoj:dlq,org:cmfzriajm0003fyp86ocrgjoj:precheck,org:cmfzriajm0003fyp86ocrgjoj:approvals,user:cmfzriaip0000fyp81gjfkri9:notifications
Parsed Values:
- `orgId`: Extracted from `org` parameter (`dfy`)
- `channel`: Extracted from `channels` parameter, finds the `:decisions` channel (`org:cmfzriajm0003fyp86ocrgjoj:decisions`)
- `apiKey`: Extracted from `key` parameter (`gai_827eode3nxa`)
Dynamic Configuration: All values are dynamically extracted from the webhook URL. If parsing fails or values (orgId or channel) are not found in the URL, they will be null in the webhook event, but the event will still be emitted. The service will always process the request and emit webhook events regardless of URL parsing success.
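The parsing rules above can be sketched with the standard library. `parse_webhook_url` is a hypothetical name and the service's own parser may differ. The sketch returns `None` for anything it cannot find, matching the described behavior of emitting events with null values.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def parse_webhook_url(url: str) -> dict[str, Optional[str]]:
    """Extract orgId, the ':decisions' channel, and apiKey from the query string."""
    try:
        query = parse_qs(urlparse(url).query)
    except ValueError:
        # Malformed URL: fall through with no parameters.
        query = {}
    channels = query.get("channels", [""])[0].split(",")
    decisions = next((c for c in channels if c.endswith(":decisions")), None)
    return {
        "orgId": query.get("org", [None])[0],
        "channel": decisions,
        "apiKey": query.get("key", [None])[0],
    }


example = (
    "ws://172.16.10.59:3002/api/ws/gateway?key=gai_827eode3nxa&org=dfy"
    "&channels=org:cmfzriajm0003fyp86ocrgjoj:decisions,"
    "org:cmfzriajm0003fyp86ocrgjoj:postcheck"
)
print(parse_webhook_url(example))
```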
version: v1
defaults:
ingress:
action: redact # or deny | pass_through | tokenize
egress:
action: redact
tool_access:
verify_identity:
direction: ingress # only apply on precheck
allow_pii:
PII:email_address: pass_through # tool may receive raw email
PII:us_ssn: tokenize # tool must get token, not raw
send_marketing_email:
direction: ingress
allow_pii:
PII:email_address: pass_through
data_export:
direction: egress # only apply on postcheck
allow_pii:
PII:email_address: pass_through # allow email in export
PII:us_ssn: tokenize # tokenize SSN in export
audit_log:
direction: egress # only apply on postcheck
allow_pii:
PII:email_address: pass_through # allow email in audit logs
# SSN will be redacted (default behavior)
# default: everything else redacts/denies (your current behavior)

- `pass_through`: Allow raw PII value to pass through unchanged
- `tokenize`: Replace PII with stable token (e.g., `pii_8797942a`)
- `redact`: Apply standard redaction (e.g., `<USER_EMAIL>`, `<USER_SSN>`)
- `deny`: Block the request entirely
The policy file supports global defaults for each direction:
- `ingress`: Default action for precheck requests
- `egress`: Default action for postcheck requests
- Tool-specific rules override global defaults
- Fallback: Safe redaction if no rules apply
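How a tool's `allow_pii` map and the direction default combine can be sketched as below. This is an illustration, not the service's `apply_tool_access_text()`. `apply_actions`, the `pii.denied` reason code, and the `<REDACTED>` placeholder are hypothetical. The findings are passed in as `(pii_type, value)` pairs so that detection stays out of scope.

```python
import hashlib

# Documented default of PII_TOKEN_SALT; set a real salt in production.
TOKEN_SALT = "default-salt-change-in-production"
PLACEHOLDERS = {"PII:email_address": "<USER_EMAIL>", "PII:us_ssn": "<USER_SSN>"}


def tokenize(value: str) -> str:
    """Stable token: salted SHA-256, truncated to 8 hex characters."""
    digest = hashlib.sha256((TOKEN_SALT + value).encode()).hexdigest()
    return f"pii_{digest[:8]}"


def apply_actions(text, findings, allow_pii, default_action="redact"):
    """Apply per-type actions; types missing from allow_pii get the default."""
    reasons = []
    for pii_type, value in findings:
        action = allow_pii.get(pii_type, default_action)
        if action == "deny":
            # Hypothetical reason code; a denied request returns no text.
            return None, [f"pii.denied:{pii_type}"]
        if action == "pass_through":
            reasons.append(f"pii.allowed:{pii_type}")
        elif action == "tokenize":
            text = text.replace(value, tokenize(value))
            reasons.append(f"pii.tokenized:{pii_type}")
        else:
            text = text.replace(value, PLACEHOLDERS.get(pii_type, "<REDACTED>"))
            reasons.append(f"pii.redacted:{pii_type}")
    return text, reasons


sample = "User email: alice@example.com, SSN: 123-45-6789"
found = [("PII:email_address", "alice@example.com"), ("PII:us_ssn", "123-45-6789")]
print(apply_actions(sample, found, {"PII:email_address": "pass_through"}))
```

With only the email allowed, the SSN falls through to the default `redact` action, which matches the `send_marketing_email` behavior described in this document.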
The policy evaluation system follows a strict precedence hierarchy (highest to lowest priority):
DENY_TOOLS (highest priority):
- Purpose: Hard deny for dangerous tools
- Tools: `python.exec`, `bash.exec`, `code.exec`, `shell.exec`
- Decision: Always `deny`
- Policy ID: `deny-exec`
- Reason: `blocked tool: code/exec`

TOOL_SPECIFIC:
- Purpose: Tool-specific rules in `policy.tool_access.yaml`
- Condition: Tool exists in `tool_access` section and direction matches
- Actions: `pass_through`, `tokenize`, `redact`, `deny`
- Policy ID: `tool-access`
- Override: Takes precedence over all lower levels

GLOBAL_DEFAULTS:
- Purpose: Global defaults for direction (ingress/egress)
- Condition: No tool-specific rule applies
- Actions: `pass_through`, `tokenize`, `redact`, `deny`
- Policy ID: `defaults`
- Override: Takes precedence over network scope and fallback

NETWORK_SCOPE:
- Purpose: Network scope redaction for external tools
- Condition: Scope starts with `net.` or tool starts with `web.`, `http.`, `fetch.`, `request.`
- Actions: Always `redact` (PII detection and redaction)
- Policy ID: `net-redact-presidio` or `net-redact-regex`
- Override: Takes precedence over safe fallback

STRICT_FALLBACK (lowest priority):
- Purpose: Block only strict PII types (SSN and passwords)
- Condition: No other rules apply
- Actions: `deny` for SSN/passwords, `allow` for everything else
- Policy ID: `strict-fallback`
- Override: Final fallback for safety
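The precedence order can be read as a chain of checks that stops at the first match. The sketch below is one possible reading, not the service's `evaluate()` function. `select_policy_level` is a hypothetical name, and it assumes that the defaults level applies only when the policy defines a default for the requested direction.

```python
DENY_TOOLS = {"python.exec", "bash.exec", "code.exec", "shell.exec"}
NETWORK_TOOL_PREFIXES = ("web.", "http.", "fetch.", "request.")


def select_policy_level(tool, scope, direction, policy=None, presidio_available=True):
    """Return (level, policy_id) for the first precedence level that applies."""
    policy = policy or {}

    # 1. Hard deny for dangerous tools.
    if tool in DENY_TOOLS:
        return "DENY_TOOLS", "deny-exec"

    # 2. Tool-specific rule whose direction matches ("both" matches either).
    rule = policy.get("tool_access", {}).get(tool)
    if rule and rule.get("direction") in (direction, "both"):
        return "TOOL_SPECIFIC", "tool-access"

    # 3. Global default for this direction (assumption: only if configured).
    if direction in policy.get("defaults", {}):
        return "GLOBAL_DEFAULTS", "defaults"

    # 4. Network scope redaction; the policy ID reflects the detector in use.
    if scope.startswith("net.") or tool.startswith(NETWORK_TOOL_PREFIXES):
        policy_id = "net-redact-presidio" if presidio_available else "net-redact-regex"
        return "NETWORK_SCOPE", policy_id

    # 5. Nothing else matched.
    return "STRICT_FALLBACK", "strict-fallback"


demo_policy = {
    "defaults": {"ingress": {"action": "redact"}, "egress": {"action": "redact"}},
    "tool_access": {"verify_identity": {"direction": "ingress"}},
}
print(select_policy_level("verify_identity", "net.external", "ingress", demo_policy))
```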
| Tool | Scope | Direction | Rule Applied | Policy ID | Reason |
|---|---|---|---|---|---|
| `python.exec` | `net.external` | `ingress` | DENY_TOOLS | `deny-exec` | `blocked tool: code/exec` |
| `verify_identity` | `net.external` | `ingress` | TOOL_SPECIFIC | `tool-access` | `pii.allowed:PII:email_address` |
| `unknown_tool` | `net.external` | `ingress` | GLOBAL_DEFAULTS | `defaults` | `default.ingress.redact` |
| `web.fetch` | `internal` | `ingress` | NETWORK_SCOPE | `net-redact-presidio` | `pii.redacted:email_address` |
| `safe_tool` | `local` | `ingress` | STRICT_FALLBACK | `strict-fallback` | `strict_fallback.allow` |
| `any_tool` | `local` | `ingress` | STRICT_FALLBACK | `strict-fallback` | `strict_pii_blocked:PII:us_ssn` |
- `ingress`: Apply only on precheck (before tool execution)
- `egress`: Apply only on postcheck (after tool execution)
| PII Type | Presidio Entity | Example |
|---|---|---|
| Email | `EMAIL_ADDRESS` | `alice@example.com` |
| SSN | `US_SSN` | `123-45-6789` |
| Phone | `PHONE_NUMBER` | `+1-555-123-4567` |
| Credit Card | `CREDIT_CARD` | `4111-1111-1111-1111` |
| API Key | `API_KEY` | `sk-1234567890abcdef` |
| JWT Token | `JWT_TOKEN` | `eyJhbGciOiJIUzI1NiIs...` |
- Presidio (Primary): Advanced NLP-based detection with custom recognizers
- Regex (Fallback): Pattern-based detection when Presidio unavailable
- Context-aware filtering: Reduces false positives based on field names and patterns
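The regex fallback can be pictured as a small table of compiled patterns scanned over the text. The patterns below are simplified illustrations for two of the supported types, not the service's actual expressions, and `detect_pii_regex` is a hypothetical name.

```python
import re

# Simplified example patterns; real-world patterns need more care.
FALLBACK_PATTERNS = {
    "PII:email_address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PII:us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def detect_pii_regex(text):
    """Return (pii_type, value, start, end) tuples ordered by position."""
    findings = []
    for pii_type, pattern in FALLBACK_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((pii_type, match.group(), match.start(), match.end()))
    return sorted(findings, key=lambda finding: finding[2])


for finding in detect_pii_regex("User email: alice@example.com, SSN: 123-45-6789"):
    print(finding)
```

Keeping the character offsets makes it possible to replace each match in place rather than by string search, which matters when the same value appears more than once.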
import hashlib
import os

TOKEN_SALT = os.environ.get("PII_TOKEN_SALT", "default-salt-change-in-production")

def tokenize(value: str) -> str:
    """Create a stable token for PII values"""
    return f"pii_{hashlib.sha256((TOKEN_SALT + value).encode()).hexdigest()[:8]}"

- Format: `pii_{8-char-hash}`
- Stable: Same input always produces same token
- Configurable salt: Set via `PII_TOKEN_SALT` environment variable
- Privacy-preserving: Cannot reverse-engineer original value without salt
| Original Value | Token |
|---|---|
| `alice@example.com` | `pii_a70ae1e6` |
| `123-45-6789` | `pii_8797942a` |
| `+1-555-123-4567` | `pii_b82c4f1d` |
app/
├── main.py # FastAPI application entry point
├── api.py # API endpoints and webhook handling
├── models.py # Pydantic models for requests/responses
├── policies.py # Policy evaluation and PII processing
├── events.py # Event emission and DLQ handling
├── log.py # Structured audit logging
├── auth.py # API key authentication
├── rate_limit.py # Rate limiting implementation
├── storage.py # Data persistence layer
└── settings.py # Configuration management
graph TD
A[Request] --> B[Rate Limiting]
B --> C[Authentication]
C --> D[Policy Evaluation]
D --> E{Tool in Policy?}
E -->|Yes| F[Apply Tool Access Rules]
E -->|No| G[Apply Default Rules]
F --> H[PII Detection]
G --> H
H --> I[Transform Payload]
I --> J[Return Response]
J --> K[Emit Webhook Event]
K --> L{Webhook Success?}
L -->|Yes| M[Event Delivered]
L -->|No| N[Retry with Backoff]
N --> O{Max Retries?}
O -->|No| N
O -->|Yes| P[Write to DLQ]
J --> Q[Audit Log]
Q --> R[Return Response]
graph TD
A[Payload] --> B[PII Detection]
B --> C{Found PII?}
C -->|No| D[Pass Through]
C -->|Yes| E[Check Tool Policy]
E --> F{Action?}
F -->|pass_through| G[Keep Original]
F -->|tokenize| H[Generate Token]
F -->|default| I[Apply Redaction]
G --> J[Return Transformed Payload]
H --> J
I --> J
The precheck service includes comprehensive budget management to track and control costs for both LLM usage and purchases. Budget tracking is integrated into the policy evaluation process and can block or warn about requests that exceed budget limits.
- LLM Cost Estimation: Automatic cost calculation based on model type and text length
- Purchase Detection: Extract purchase amounts from tool metadata fields
- Model Support: Support for major LLM providers (GPT-4, Claude, etc.)
- Token Estimation: Rough token estimation based on text character count
- User-level Budgets: Individual budget limits per user
- Monthly Reset: Automatic budget reset at month boundaries
- Dual Tracking: Separate tracking for LLM costs and purchase costs
- Transaction History: Complete audit trail of all budget transactions
- Hard Limits: Block requests that would exceed budget
- Warning Thresholds: Require confirmation when approaching budget limits
- Graceful Degradation: Fallback behavior when budget checking fails
CREATE TABLE budgets (
user_id VARCHAR PRIMARY KEY,
monthly_limit FLOAT DEFAULT 10.0,
current_spend FLOAT DEFAULT 0.0,
llm_spend FLOAT DEFAULT 0.0,
purchase_spend FLOAT DEFAULT 0.0,
budget_type VARCHAR DEFAULT 'user',
last_reset DATETIME DEFAULT CURRENT_TIMESTAMP,
is_active BOOLEAN DEFAULT TRUE
);

CREATE TABLE budget_transactions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id VARCHAR NOT NULL,
transaction_type VARCHAR NOT NULL, -- 'llm' or 'purchase'
amount FLOAT NOT NULL,
description VARCHAR,
tool VARCHAR,
correlation_id VARCHAR,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

The precheck service now accepts enhanced request formats with budget information:
{
"tool": "weather.current",
"raw_text": "Get weather for Berlin",
"scope": "precheck",
"corr_id": "unique_id",
"tool_config": {
"tool_name": "weather.current",
"scope": "net.external",
"direction": "ingress",
"metadata": {
"purchase_amount": 25.99,
"amount": 25.99,
"price": 25.99,
"cost": 25.99,
"vendor": "Shopify",
"category": "software",
"description": "Premium plan upgrade",
"currency": "USD",
"model": "gpt-4"
}
},
"policy_config": {
"version": "v1",
"model": "gpt-4",
"defaults": {
"ingress": {"action": "redact"},
"egress": {"action": "redact"}
},
"tool_access": {
"weather.current": {
"direction": "both",
"action": "allow"
}
},
"deny_tools": ["python.exec", "bash.exec"],
"on_error": "block"
}
}

The precheck service now returns enhanced responses with budget information:
{
"decision": "allow",
"raw_text_out": "processed_text",
"reasons": ["budget_check_passed"],
"policy_id": "tool-access",
"ts": 1234567890,
"budget_status": {
"allowed": true,
"currentSpend": 3.50,
"limit": 10.00,
"remaining": 6.50,
"percentUsed": 35.0,
"reason": "budget_ok"
},
"budget_info": {
"monthly_limit": 10.00,
"current_spend": 3.50,
"llm_spend": 2.00,
"purchase_spend": 1.50,
"remaining_budget": 6.50,
"estimated_cost": 0.05,
"estimated_purchase": 25.99,
"projected_total": 29.54,
"percent_used": 35.0,
"budget_type": "user"
}
}

The budget system can return three types of decisions:
- `allow`: Budget is within limits, request can proceed
- `confirm`: Budget warning - user confirmation required
- `deny`: Budget exceeded - request blocked
The service supports cost estimation for major LLM providers:
| Model | Input Cost (per token) | Output Cost (per token) |
|---|---|---|
| GPT-4 | $0.00003 | $0.00006 |
| GPT-4 Turbo | $0.00001 | $0.00003 |
| GPT-3.5 Turbo | $0.0000015 | $0.000002 |
| Claude-3 Opus | $0.000015 | $0.000075 |
| Claude-3 Sonnet | $0.000003 | $0.000015 |
| Claude-3 Haiku | $0.00000025 | $0.00000125 |
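A rough cost estimate can be derived from the table above. The sketch below is illustrative only: the dictionary keys other than `gpt-4` are hypothetical identifiers, and the four-characters-per-token ratio is a common rule of thumb assumed here, not a documented constant of the service.

```python
import math

# (input, output) USD per token, copied from the table above.
MODEL_PRICES = {
    "gpt-4": (0.00003, 0.00006),
    "gpt-4-turbo": (0.00001, 0.00003),
    "gpt-3.5-turbo": (0.0000015, 0.000002),
    "claude-3-opus": (0.000015, 0.000075),
    "claude-3-sonnet": (0.000003, 0.000015),
    "claude-3-haiku": (0.00000025, 0.00000125),
}
CHARS_PER_TOKEN = 4  # assumption: rough average for English text


def estimate_tokens(text: str) -> int:
    """Estimate token count from character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_llm_cost(model: str, input_text: str, expected_output_tokens: int = 0) -> float:
    """Estimate request cost in USD for a known model."""
    if model not in MODEL_PRICES:
        raise ValueError(f"no pricing for model: {model}")
    input_price, output_price = MODEL_PRICES[model]
    return estimate_tokens(input_text) * input_price + expected_output_tokens * output_price


print(estimate_llm_cost("gpt-4", "Get weather for Berlin", expected_output_tokens=50))
```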
Budget limits can be configured per user:
- Default Monthly Limit: $10.00
- Budget Type: `user` or `organization`
- Reset Schedule: Monthly (automatic)
- Warning Threshold: 90% of budget used
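The three budget decisions can be sketched as a comparison of projected spend against the limit and the warning threshold. `check_budget` is a hypothetical function. Applying the 90% threshold to projected rather than current spend is an assumption, and the `budget_warning` and `budget_exceeded` reason strings are illustrative (only `budget_ok` appears in this document).

```python
WARNING_THRESHOLD = 0.90  # documented warning threshold: 90% of budget used


def check_budget(current_spend: float, monthly_limit: float, estimated_cost: float) -> dict:
    """Map projected spend to allow / confirm / deny."""
    projected = current_spend + estimated_cost
    if projected > monthly_limit:
        decision, reason = "deny", "budget_exceeded"
    elif projected >= monthly_limit * WARNING_THRESHOLD:
        decision, reason = "confirm", "budget_warning"
    else:
        decision, reason = "allow", "budget_ok"
    return {
        "decision": decision,
        "reason": reason,
        "currentSpend": current_spend,
        "limit": monthly_limit,
        "remaining": round(monthly_limit - current_spend, 2),
        "percentUsed": round(current_spend / monthly_limit * 100, 2),
    }


print(check_budget(current_spend=3.50, monthly_limit=10.00, estimated_cost=0.05))
```

A caller would need to guard against a zero `monthly_limit` before using this sketch, since the percentage calculation divides by it.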
| Variable | Description | Default |
|---|---|---|
| `PII_TOKEN_SALT` | Salt for token generation | `default-salt-change-in-production` |
| `PRECHECK_DLQ` | Dead letter queue path | `/tmp/precheck.dlq.jsonl` |
| `WEBHOOK_URL` | Webhook URL for events | None |
| `WEBHOOK_SECRET` | Secret for HMAC signing | `dev-secret` |
| `WEBHOOK_TIMEOUT_S` | Webhook request timeout | `2.5` |
| `WEBHOOK_MAX_RETRIES` | Maximum retry attempts | `3` |
| `WEBHOOK_BACKOFF_BASE_MS` | Base backoff delay in ms | `150` |
| `ON_ERROR` | Error handling behavior | `block` |
| `POLICY_FILE` | Policy file path | `policy.tool_access.yaml` |
| `USE_PRESIDIO` | Enable Presidio PII detection | `true` |
| `PRESIDIO_MODEL` | spaCy model for Presidio | `en_core_web_sm` |
| Variable | Description | Default |
|---|---|---|
| `DEMO_API_KEY` | Demo API key for testing | `GAI_LOCAL_DEV_ABC` |
| `API_KEY_HEADER` | Header name for API key | `X-Governs-Key` |
- API key passed to WebSocket for authentication handling
- No API-level authentication enforcement
- API key extracted from `X-Governs-Key` header and forwarded to webhook events
- 100 requests per minute per user
- Configurable limits and windows
- Redis-based rate limiting (optional)
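A per-user limit of 100 requests per minute can be sketched with an in-memory sliding window. This is only an illustration of the idea, not the contents of `rate_limit.py`. The class name is hypothetical, and a shared store such as Redis would be needed once more than one service instance is running.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within the last `window_s` seconds."""

    def __init__(self, limit: int = 100, window_s: float = 60.0):
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=100, window_s=60.0)
print(limiter.allow("cmfzriaip0000fyp81gjfkri9"))
```

The key can be the user ID or, when none is available, the API key, in line with the rate limiting behavior described in this document.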
- Multiple redaction strategies
- Stable tokenization for consistent processing
- Configurable salt for token generation
- False positive filtering
- HMAC authentication: SHA-256 based signature verification
- Retry logic: Exponential backoff with configurable attempts
- Dead letter queue: Failed deliveries stored in JSONL format
- Configurable timeouts: Customizable request timeouts and retry delays
- Fire-and-forget: Non-blocking event emission to maintain response times
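Retry with exponential backoff followed by a JSONL dead letter queue can be sketched as below. The defaults mirror `WEBHOOK_MAX_RETRIES` (3) and `WEBHOOK_BACKOFF_BASE_MS` (150). The doubling schedule, the function names, and the shape of the DLQ record are assumptions, and `send` is any callable that raises on failure.

```python
import json
import time


def backoff_delays(max_retries: int = 3, base_ms: int = 150) -> list:
    """Delay in seconds before each retry: base, 2x base, 4x base, ..."""
    return [base_ms * (2 ** attempt) / 1000.0 for attempt in range(max_retries)]


def deliver_with_retry(event, send, dlq_path, max_retries=3, base_ms=150, sleep=time.sleep):
    """Call send(event); after the final failure append the event to the DLQ file."""
    last_error = None
    for attempt in range(max_retries + 1):  # one initial try plus max_retries retries
        try:
            send(event)
            return True
        except Exception as exc:  # any delivery error triggers a retry
            last_error = exc
            if attempt < max_retries:
                sleep(base_ms * (2 ** attempt) / 1000.0)
    # One JSON object per line keeps the DLQ easy to replay.
    with open(dlq_path, "a", encoding="utf-8") as dlq:
        dlq.write(json.dumps({"event": event, "error": str(last_error)}) + "\n")
    return False


print(backoff_delays())
```

Running the delivery in a background task keeps it off the request path, which is what the fire-and-forget item above refers to.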
dependencies = [
"fastapi>=0.104.0",
"uvicorn[standard]>=0.24.0",
"pydantic>=2.5.0",
"pydantic-settings>=2.1.0",
"presidio-analyzer>=2.2.0",
"presidio-anonymizer>=2.2.0",
"spacy>=3.7.0",
"phonenumbers>=8.13.0",
"sqlalchemy>=2.0.0",
"psycopg2-binary>=2.9.0",
"redis>=5.0.0",
"python-multipart>=0.0.6",
"pyyaml>=6.0.0",
"httpx>=0.25.0",
"websockets>=12.0",
"prometheus-client>=0.19.0",
]

# Development
python -m uvicorn app.main:app --host 0.0.0.0 --port 8080 --reload
# Production
python -m uvicorn app.main:app --host 0.0.0.0 --port 8080

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080"]

curl -X POST http://localhost:8080/api/v1/precheck \
-H "X-Governs-Key: GAI_LOCAL_DEV_ABC" \
-H "Content-Type: application/json" \
-d '{
"tool": "verify_identity",
"scope": "net.external",
"raw_text": "User email: alice@example.com, SSN: 123-45-6789",
"corr_id": "req-123"
}'

Expected Response:
{
"decision": "transform",
"raw_text_out": "User email: alice@example.com, SSN: pii_8797942a",
"reasons": [
"pii.allowed:PII:email_address",
"pii.tokenized:PII:us_ssn"
],
"policy_id": "tool-access",
"ts": 1758745697
}

curl -X POST http://localhost:8080/api/v1/precheck \
-H "X-Governs-Key: GAI_LOCAL_DEV_ABC" \
-H "Content-Type: application/json" \
-d '{
"tool": "send_marketing_email",
"scope": "net.external",
"raw_text": "Send email to alice@example.com, SSN: 123-45-6789",
"corr_id": "req-124"
}'

Expected Response:
{
"decision": "transform",
"raw_text_out": "Send email to alice@example.com, SSN: <USER_SSN>",
"reasons": [
"pii.allowed:PII:email_address",
"pii.redacted:PII:us_ssn"
],
"policy_id": "tool-access",
"ts": 1758745697
}

curl -X POST http://localhost:8080/api/v1/postcheck \
-H "X-Governs-Key: GAI_LOCAL_DEV_ABC" \
-H "Content-Type: application/json" \
-d '{
"tool": "data_export",
"scope": "net.external",
"raw_text": "Export data for alice@example.com, SSN: 123456789",
"corr_id": "req-125"
}'

Expected Response:
{
"decision": "transform",
"raw_text_out": "Export data for alice@example.com, SSN: pii_a70ae1e6",
"reasons": [
"pii.allowed:PII:email_address",
"pii.tokenized:PII:us_ssn"
],
"policy_id": "tool-access",
"ts": 1758748082
}

curl -X POST http://localhost:8080/api/v1/postcheck \
-H "X-Governs-Key: GAI_LOCAL_DEV_ABC" \
-H "Content-Type: application/json" \
-d '{
"tool": "audit_log",
"scope": "net.external",
"raw_text": "Audit log for alice@example.com, SSN: 123456789",
"corr_id": "req-126"
}'

Expected Response:
{
"decision": "transform",
"raw_text_out": "Audit log for alice@example.com, SSN: <USER_SSN>",
"reasons": [
"pii.allowed:PII:email_address",
"pii.redacted:PII:us_ssn"
],
"policy_id": "tool-access",
"ts": 1758748185
}

- 2025-01-14: API Route Updates: Simplified API routes and improved user_id handling
  - Route Changes: Added `/api` prefix to all routes (`/api/v1/precheck`, `/api/v1/postcheck`, etc.)
  - User ID Handling: Made `user_id` optional in request payload, with fallback extraction from webhook URL
  - Rate Limiting: Updated to work with either user_id or API key for rate limiting
  - WebSocket Integration: Improved user_id extraction from webhook URL path for WebSocket authentication
  - Documentation: Updated all API examples and test cases to use new route format
- 2024-12-26: BREAKING CHANGE: Migrated API from payload-based to raw text-based processing
  - Request Model: Changed `payload` field to `raw_text` (string) in `PrePostCheckRequest`
  - Response Model: Changed `payload_out` field to `raw_text_out` (string) in `DecisionResponse`
  - Policy Evaluation: Updated `evaluate()` function to process raw text instead of JSON payloads
  - PII Processing: Added `apply_tool_access_text()` function for text-based PII transformations
  - API Endpoints: Updated precheck and postcheck endpoints to handle raw text input/output
  - Webhook Events: Updated payload hash calculation to use raw text instead of JSON serialization
  - Documentation: Updated all API examples and test cases to use raw text format
  - Backward Compatibility: Maintained legacy model aliases for gradual migration
  - WebSocket Authentication: Added `authentication` object to webhook events containing `userId` and `apiKey`
  - Removed API Authentication: Removed `X-Governs-Key` authentication enforcement from API endpoints, now passed to WebSocket for handling
  - Removed Webhook Validation: Removed validation that prevented webhook emission when `orgId` or `channel` are `None`; webhook events are now always sent
  - Enhanced Webhook Events: Added `rawText`, `rawTextOut`, and `reasons` fields to webhook events for complete request/response context
- 2024-12-26: Updated webhook payload structure to match new API documentation format
  - Changed from flat event structure to nested structure with `type`, `channel`, `schema`, `idempotencyKey`, and `data` fields
  - Updated direction mapping from `ingress`/`egress` to `precheck`/`postcheck`
  - Added PII detection extraction with confidence scoring
  - Added payload hash calculation using SHA256
  - Added automatic parsing of organization ID, channel, and API key from webhook URL parameters
  - Migrated all environment variable access to use settings module for proper .env file loading
  - Removed all hardcoded fallback values - everything is now dynamically derived from webhook URL
  - Added graceful handling when webhook URL parsing fails (skips webhook emission with warning)
  - Fixed webhook protocol: Changed from HTTP POST to WebSocket connections
  - Removed HTTP headers: WebSocket connections don't use Content-Type or X-Governs-Signature headers
  - Updated timestamp format to ISO 8601
- Policy Hot-reload: Reload policies without service restart
- Advanced Transformations: Support for `mask`, `hash`, `remove` actions
- Policy Versioning: Support for multiple policy versions
- Audit Logging: Comprehensive audit trail for policy decisions
- Policy Templates: Reusable policy templates for common patterns
- Bidirectional Policies: Tools that need different rules for ingress vs egress
- Database Integration: Store policies in database for dynamic updates
- External Policy Service: Integration with external policy management systems
- Real-time Monitoring: Integration with monitoring and alerting systems
- Policy Analytics: Analytics and reporting on policy decisions
- Clone the repository
- Install dependencies: `pip install -e .[dev]`
- Install spaCy model: `python -m spacy download en_core_web_sm`
- Run tests: `pytest`
- Start development server: `python -m uvicorn app.main:app --reload`
- Black: Code formatting
- isort: Import sorting
- flake8: Linting
- mypy: Type checking
- pytest: Test framework
- pytest-asyncio: Async test support
- httpx: HTTP client for API testing
MIT License - see LICENSE file for details.
For questions, issues, or contributions, please refer to the project repository or contact the GovernsAI team.