
Commit aed4868

docs(vault): comprehensive update for v0.14.0 security hardening
8 docs updated to cover all v0.14.0 changes:
- security.md: ReDoS protection, Unicode NFC, path traversal, CLI error sanitization, timeout cancellation, FastAPI validation bounds
- multi-tenancy.md: tenant lock enforcement behavior, atomic quota check mechanism (no TOCTOU)
- plugins.md: manifest.json now required by default, generation example, security rules for verification
- fastapi.md: limit/offset/content validation bounds, response caching section with TTL docs
- membrane.md: 500KB content limit for regex scanning
- api-reference.md: version bump, constructor behavior docs
- architecture.md: security architecture diagram, config field updates
- cli.md: error sanitization note with structured codes
1 parent e9260dd commit aed4868


8 files changed: +126 -27 lines changed


docs/api-reference.md

Lines changed: 6 additions & 2 deletions
@@ -1,6 +1,6 @@
11
# API Reference
22

3-
Complete Python SDK for qp-vault v0.13.0.
3+
Complete Python SDK for qp-vault v0.14.0.
44

55
## Constructor
66

@@ -20,7 +20,11 @@ Vault(
2020
)
2121
```
2222

23-
<!-- VERIFIED: vault.py:132-145 -->
23+
When `tenant_id` is set, the vault enforces tenant isolation: operations auto-inject the locked tenant, and operations with a mismatched `tenant_id` raise `VaultError`.
24+
25+
When `role` is set, all operations are checked against the RBAC permission matrix. Operations exceeding the role's permissions raise `VaultError` with code `VAULT_700`.
26+
27+
<!-- VERIFIED: vault.py:132-145, 257-277 — constructor + _resolve_tenant -->
2428

2529
### Factory Methods
2630

docs/architecture.md

Lines changed: 24 additions & 4 deletions
@@ -269,14 +269,34 @@ config = VaultConfig(
269269
vault = Vault("./knowledge", config=config)
270270
```
271271

272-
Additional config fields added in v0.6-v0.13:
272+
Additional config fields added in v0.6-v0.14:
273273

274274
```python
275275
config = VaultConfig(
276-
max_resources_per_tenant=1000, # Per-tenant quota (v0.11)
277-
query_timeout_ms=30000, # 30s query timeout (v0.13)
278-
health_cache_ttl_seconds=30, # Cache health responses (v0.13)
276+
max_resources_per_tenant=1000, # Per-tenant quota, atomic count enforcement (v0.11)
277+
query_timeout_ms=30000, # Query timeout with task cancellation (v0.14)
278+
health_cache_ttl_seconds=30, # TTL cache for health/status responses (v0.14)
279279
)
280280
```
281281

282282
<!-- VERIFIED: config.py:18-86 — VaultConfig fields and defaults -->
283+
284+
## Security Architecture (v0.14.0)
285+
286+
```
287+
+-------------------------------------------------------------------+
288+
| TENANT ISOLATION _resolve_tenant: lock, auto-inject, reject |
289+
+-------------------------------------------------------------------+
290+
| RBAC _check_permission: 30+ operations gated |
291+
+-------------------------------------------------------------------+
292+
| INPUT VALIDATION Unicode NFC, path traversal, null bytes, limits|
293+
+-------------------------------------------------------------------+
294+
| MEMBRANE Innate scan (500KB regex) + Release gate |
295+
+-------------------------------------------------------------------+
296+
| TIMEOUT _with_timeout: asyncio cancel + cleanup |
297+
+-------------------------------------------------------------------+
298+
| AUDIT VaultEvent for every mutation + Capsule chain |
299+
+-------------------------------------------------------------------+
300+
```
301+
302+
<!-- VERIFIED: vault.py:215-277 — permission, tenant, timeout methods -->

docs/cli.md

Lines changed: 4 additions & 0 deletions
@@ -6,6 +6,10 @@ The `vault` CLI provides 15 commands for managing governed knowledge stores.
66
pip install qp-vault[cli]
77
```
88

9+
Error messages display structured vault error codes (e.g., `[VAULT_300] Invalid lifecycle transition`) without exposing internal paths, SQL, or stack traces.
10+
11+
<!-- VERIFIED: cli/main.py:28-35 — _safe_error_message -->
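The sanitization described above can be pictured roughly as follows. This is a minimal sketch, not the code in `cli/main.py`: the `VaultError` class here is a stand-in, the `safe_error_message` helper and the `VAULT_000` fallback code are hypothetical, and the sketch assumes that vault error messages are written to be safe for display.

```python
class VaultError(Exception):
    """Stand-in for the library's error type (illustrative only)."""

    def __init__(self, message: str, code: str = "VAULT_000") -> None:
        super().__init__(message)
        self.code = code


def safe_error_message(exc: Exception) -> str:
    """Format an exception for CLI output without leaking internals."""
    if isinstance(exc, VaultError):
        # Structured errors carry a code and a message meant for users.
        return f"[{exc.code}] {exc}"
    # Anything else may contain paths, SQL, or other internals,
    # so only a generic message is shown (hypothetical fallback code).
    return "[VAULT_000] Unexpected error"
```

With this shape, an unexpected `OSError` mentioning a database path is reduced to the generic message, while a `VaultError` keeps its code and text.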
12+
913
## Commands
1014

1115
### vault init

docs/fastapi.md

Lines changed: 25 additions & 8 deletions
@@ -82,14 +82,31 @@ app.include_router(router, prefix="/v1/vault")
8282

8383
## Input Validation
8484

85-
| Field | Constraint |
86-
|-------|-----------|
87-
| `query` | Max 10,000 characters |
88-
| `top_k` | 1-1,000 |
89-
| `threshold` | 0.0-1.0 |
90-
| Batch sources | Max 100 items |
91-
92-
<!-- VERIFIED: integrations/fastapi_routes.py:50-53 — SearchRequest validators -->
85+
All endpoints validate inputs at the API boundary before reaching vault logic.
86+
87+
| Field | Constraint | Endpoint |
88+
|-------|-----------|----------|
89+
| `content` | Max 500MB | `POST /resources` |
90+
| `query` | Max 10,000 characters | `POST /search` |
91+
| `top_k` | 1-1,000 | `POST /search` |
92+
| `threshold` | 0.0-1.0 | `POST /search` |
93+
| `limit` | 1-1,000 | `GET /resources` |
94+
| `offset` | 0-1,000,000 | `GET /resources` |
95+
| Batch sources | Max 100 items | `POST /batch` |
96+
| `as_of` | Valid ISO date | `POST /search` |
97+
98+
<!-- VERIFIED: integrations/fastapi_routes.py:40 — content max_length -->
99+
<!-- VERIFIED: integrations/fastapi_routes.py:51-53 — SearchRequest validators -->
100+
<!-- VERIFIED: integrations/fastapi_routes.py:140-141 — limit/offset Query validators -->
101+
102+
## Response Caching
103+
104+
`GET /health` and `GET /status` responses are cached with a configurable TTL (default 30 seconds). The cache is invalidated on any write operation (add, update, delete).
105+
106+
Configure via `VaultConfig(health_cache_ttl_seconds=60)`.
107+
108+
<!-- VERIFIED: vault.py:947-955 — health cache -->
109+
<!-- VERIFIED: vault.py:1026-1031 — status cache -->
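One way to picture the caching behavior is a single-entry TTL cache that is cleared on writes. The sketch below is an illustration under assumptions, not the code in `vault.py`: the `TTLCache` class and its `clock` parameter are hypothetical names introduced here.

```python
import time
from typing import Any, Callable, Optional, Tuple


class TTLCache:
    """Single-value cache that expires after ttl_seconds (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entry: Optional[Tuple[float, Any]] = None  # (stored_at, value)

    def get(self, compute: Callable[[], Any]) -> Any:
        """Return the cached value, recomputing it once the TTL has elapsed."""
        now = self._clock()
        if self._entry is not None and now - self._entry[0] < self._ttl:
            return self._entry[1]
        value = compute()
        self._entry = (now, value)
        return value

    def invalidate(self) -> None:
        """Drop the cached value; intended to be called on add/update/delete."""
        self._entry = None
```

A monotonic clock is used so that wall-clock adjustments cannot extend or shorten the TTL.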
93110

94111
## Error Codes
95112

docs/membrane.md

Lines changed: 7 additions & 1 deletion
@@ -38,7 +38,13 @@ Default patterns detect:
3838
- XSS: `<script>`, `javascript:`
3939
- Code injection: `eval()`, `exec()`, `__import__()`, `subprocess.`, `os.system()`
4040

41-
<!-- VERIFIED: membrane/innate_scan.py:20-33 — DEFAULT_BLOCKLIST -->
41+
<!-- VERIFIED: membrane/innate_scan.py:20-35 — DEFAULT_BLOCKLIST -->
42+
43+
## Security Limits
44+
45+
Content is truncated to **500KB** before regex scanning, which bounds the worst-case cost of catastrophic backtracking (ReDoS). Full content is still stored and indexed; only the scan input is bounded. Patterns are compiled up front so that invalid patterns are caught before use.
46+
47+
<!-- VERIFIED: membrane/innate_scan.py:69 — 500KB scan_content limit -->
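The bounded scan can be sketched as below. This is an illustration only, not the code in `membrane/innate_scan.py`: `MAX_SCAN_CHARS`, `compile_blocklist`, and `scan` are hypothetical names, the limit is applied in characters rather than bytes for simplicity, and skipping invalid patterns is an assumed policy.

```python
from __future__ import annotations

import re

MAX_SCAN_CHARS = 500 * 1024  # assumption: limit counted in characters


def compile_blocklist(patterns: list[str]) -> list[re.Pattern[str]]:
    """Compile patterns up front; invalid ones are skipped, not run."""
    compiled = []
    for pattern in patterns:
        try:
            compiled.append(re.compile(pattern, re.IGNORECASE))
        except re.error:
            continue  # assumed policy: drop patterns that fail to compile
    return compiled


def scan(content: str, blocklist: list[re.Pattern[str]]) -> list[str]:
    """Return the patterns that match within the bounded scan window."""
    window = content[:MAX_SCAN_CHARS]  # only the scan input is truncated
    return [rx.pattern for rx in blocklist if rx.search(window)]
```

A consequence of this design is that a match located entirely beyond the scan window is not detected by the regex pass.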
4248

4349
## Custom Blocklist
4450

docs/multi-tenancy.md

Lines changed: 18 additions & 3 deletions
@@ -30,11 +30,21 @@ For stricter isolation, lock the vault to a single tenant at construction:
3030

3131
```python
3232
vault = Vault("./knowledge", tenant_id="site-123")
33-
# All operations are now scoped to site-123
34-
# No need to pass tenant_id on every call
33+
34+
# All operations auto-inject tenant_id="site-123"
35+
vault.add("doc") # tenant_id="site-123" applied automatically
36+
vault.search("query") # scoped to site-123
37+
38+
# Mismatched tenant_id is rejected
39+
vault.add("doc", tenant_id="site-456") # Raises VaultError: tenant mismatch
3540
```
3641

37-
<!-- VERIFIED: vault.py:143-148 — _locked_tenant_id -->
42+
When a vault is tenant-locked:
43+
- Operations with no `tenant_id` auto-inject the locked tenant
44+
- Operations with a matching `tenant_id` proceed normally
45+
- Operations with a different `tenant_id` raise `VaultError`
46+
47+
<!-- VERIFIED: vault.py:257-277 — _resolve_tenant enforcement -->
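The three rules above amount to a small decision function. The sketch below is a hedged illustration of that logic, not the actual `_resolve_tenant` method: `resolve_tenant` and the local `VaultError` class are stand-ins, and passing the request through unchanged for an unlocked vault is an assumption.

```python
from typing import Optional


class VaultError(Exception):
    """Stand-in for the library's VaultError (illustrative only)."""


def resolve_tenant(locked: Optional[str], requested: Optional[str]) -> Optional[str]:
    """Return the tenant an operation should run under."""
    if locked is None:
        # Vault is not tenant-locked: use whatever the caller passed.
        return requested
    if requested is None or requested == locked:
        # Auto-inject the locked tenant, or accept a matching one.
        return locked
    raise VaultError(f"tenant mismatch: vault is locked to {locked!r}")
```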
3848

3949
## Per-Tenant Quotas
4050

@@ -47,9 +57,14 @@ config = VaultConfig(max_resources_per_tenant=1000)
4757
vault = Vault("./knowledge", config=config)
4858

4959
vault.add("doc", tenant_id="site-123") # OK until quota reached
60+
# After 1000 resources: raises VaultError("Tenant site-123 has reached the resource limit")
5061
```
5162

63+
Quotas are enforced with an atomic `COUNT(*)` query at the storage layer, which leaves no TOCTOU (time-of-check to time-of-use) race window.
64+
5265
<!-- VERIFIED: config.py:68 — max_resources_per_tenant -->
66+
<!-- VERIFIED: vault.py:406-416 — atomic count_resources check -->
67+
<!-- VERIFIED: storage/sqlite.py:595-601 — count_resources implementation -->
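To show why atomicity matters here, the sketch below performs the count and the insert inside one SQLite write transaction, so no other writer can add a row between the check and the write. It is an assumption-laden illustration, not the code in `storage/sqlite.py`: the `resources` table layout, `make_db`, `add_resource`, and `QuotaExceeded` are all hypothetical.

```python
import sqlite3


class QuotaExceeded(Exception):
    """Hypothetical stand-in for the error raised at the quota."""


def make_db() -> sqlite3.Connection:
    # isolation_level=None: transactions are controlled explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE resources (tenant_id TEXT NOT NULL, name TEXT NOT NULL)"
    )
    return conn


def add_resource(conn: sqlite3.Connection, tenant_id: str, name: str,
                 max_per_tenant: int) -> None:
    """Count and insert under one write lock so the check cannot go stale."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before counting
    try:
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM resources WHERE tenant_id = ?", (tenant_id,)
        ).fetchone()
        if count >= max_per_tenant:
            raise QuotaExceeded(
                f"Tenant {tenant_id} has reached the resource limit"
            )
        conn.execute(
            "INSERT INTO resources (tenant_id, name) VALUES (?, ?)",
            (tenant_id, name),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Without the enclosing transaction, two concurrent writers could both read a count of 999 and both insert, ending at 1001 resources.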
5368

5469
## Storage
5570

docs/plugins.md

Lines changed: 23 additions & 2 deletions
@@ -111,16 +111,37 @@ vault = Vault("./knowledge", plugins_dir="/opt/qp/plugins/")
111111

112112
Drop `.py` files in the plugins directory. Any class decorated with `@embedder`, `@parser`, or `@policy` is auto-discovered.
113113

114+
**A `manifest.json` is required.** This file maps each plugin filename to its SHA3-256 hash. Without it, the entire directory is skipped for security.
115+
114116
```
115117
/opt/qp/plugins/
118+
manifest.json # Required: SHA3-256 hashes for each .py file
116119
my_embedder.py # Contains @embedder("local-model") class
117120
dicom_parser.py # Contains @parser("dicom") class
118121
itar_policy.py # Contains @policy("itar") class
119122
```
120123

121-
<!-- VERIFIED: plugins/registry.py:126-161 — discover_plugins_dir() loads .py files -->
124+
Generate the manifest:
125+
126+
```python
127+
import hashlib, json, pathlib
128+
129+
plugins_dir = pathlib.Path("/opt/qp/plugins")
130+
manifest = {}
131+
for f in sorted(plugins_dir.glob("*.py")):
132+
if not f.name.startswith("_"):
133+
manifest[f.name] = hashlib.sha3_256(f.read_bytes()).hexdigest()
134+
(plugins_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
135+
```
136+
137+
Security rules:
138+
- Files not listed in the manifest are rejected
139+
- Hash mismatches are logged and the file is skipped
140+
- Files starting with `_` are always skipped
141+
- Broken files log a warning and are skipped
142+
- To disable hash verification: `discover_plugins_dir(path, verify_hashes=False)` (not recommended)
122143

123-
Files starting with `_` are skipped. Broken files log a warning and are skipped.
144+
<!-- VERIFIED: plugins/registry.py:131-189 — manifest required, hash verification -->
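The verification rules can be checked offline with a short script that mirrors them. This is a sketch under assumptions, not the loader in `plugins/registry.py`: `verified_plugin_files` is a hypothetical helper, and it only reports which files would pass, without importing anything.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path


def verified_plugin_files(plugins_dir: Path) -> list[Path]:
    """Return the .py files whose SHA3-256 hash matches manifest.json."""
    manifest_path = plugins_dir / "manifest.json"
    if not manifest_path.is_file():
        return []  # no manifest: the whole directory is skipped
    manifest = json.loads(manifest_path.read_text())

    accepted = []
    for path in sorted(plugins_dir.glob("*.py")):
        if path.name.startswith("_"):
            continue  # underscore-prefixed files are always skipped
        expected = manifest.get(path.name)
        if expected is None:
            continue  # not listed in the manifest: rejected
        actual = hashlib.sha3_256(path.read_bytes()).hexdigest()
        if actual == expected:
            accepted.append(path)  # a hash mismatch is skipped silently here
    return accepted
```

Running this after regenerating the manifest is a quick way to confirm that every intended plugin file is listed and unmodified.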
124145

125146
## Discovery Order
126147

docs/security.md

Lines changed: 19 additions & 7 deletions
@@ -1,6 +1,6 @@
11
# Security Model
22

3-
qp-vault's security architecture for v0.13.0.
3+
qp-vault's security architecture for v0.14.0.
44

55
## Cryptographic Inventory
66

@@ -44,7 +44,8 @@ Quarantined resources get `ResourceStatus.QUARANTINED` and are excluded from sea
4444
| Input | Validation |
4545
|-------|-----------|
4646
| Trust tier | Must be valid enum value |
47-
| Resource name | Path traversal stripped, null bytes removed, 255 char limit |
47+
| Resource name | Unicode NFC normalized, path traversal stripped, null bytes removed, 255 char limit |
48+
| Source path | Resolved and rejected if `..` detected (path traversal protection) |
4849
| Tags | Max 50 tags, 100 chars each, control chars stripped |
4950
| Metadata keys | Alphanumeric + dash/underscore/dot, max 100 keys |
5051
| Metadata values | Max 10,000 bytes per value |
@@ -60,7 +61,9 @@ All queries use parameterized placeholders (`?` for SQLite, `$N` for PostgreSQL)
6061

6162
## Plugin Security
6263

63-
Plugins loaded from `plugins_dir` are verified against a `manifest.json` containing SHA3-256 hashes before execution. Hash mismatches are logged and the plugin is skipped.
64+
Plugins loaded from `plugins_dir` require a `manifest.json` (SHA3-256 hashes) by default. Without a manifest, the entire directory is skipped. Files not listed in the manifest are rejected. Hash mismatches are logged and the plugin is skipped.
65+
66+
<!-- VERIFIED: plugins/registry.py:131-176 — manifest required, unlisted files rejected -->
6467

6568
## Key Management
6669

@@ -98,14 +101,21 @@ Every chunk: SHA3-256 CID. Every resource: Merkle root over sorted chunk CIDs. E
98101

99102
| Vector | Protection |
100103
|--------|------------|
101-
| Large upload | max_file_size_mb (default 500MB) |
104+
| Large upload | max_file_size_mb (default 500MB), content max_length 500MB |
102105
| Unbounded queries | Paginated with 50K hard cap |
103106
| Batch flooding | Max 100 items per request |
104107
| Search params | top_k max 1000, query max 10K chars |
108+
| List params | limit 1-1000, offset 0-1M (FastAPI validated) |
105109
| Chain cycles | Max 1000 links |
106110
| FTS5 complexity | Special operators stripped |
107-
| Tenant flooding | Per-tenant resource quotas |
108-
| Query timeout | Configurable query_timeout_ms (default 30s) |
111+
| Tenant flooding | Per-tenant quotas (atomic count, no TOCTOU) |
112+
| Query timeout | Configurable query_timeout_ms (default 30s), task cancelled on timeout |
113+
| Health/status abuse | TTL-cached responses (default 30s) |
114+
| Membrane ReDoS | Content truncated to 500KB for regex scanning |
115+
116+
<!-- VERIFIED: vault.py:247-265 — _with_timeout with task cancellation -->
117+
<!-- VERIFIED: membrane/innate_scan.py:69 — 500KB scan limit -->
118+
<!-- VERIFIED: integrations/fastapi_routes.py:140-141 — limit/offset validation -->
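For the query timeout row, the effect of "task cancelled on timeout" can be sketched with `asyncio.wait_for`, which cancels the awaited task when the deadline passes. This is an illustration only, not the `_with_timeout` method in `vault.py`: `with_timeout`, `slow_query`, and `demo` are hypothetical names.

```python
import asyncio


async def with_timeout(coro, timeout_ms: int):
    """Await coro, cancelling its task if it exceeds timeout_ms."""
    task = asyncio.ensure_future(coro)
    try:
        # On timeout, wait_for cancels the task before raising.
        return await asyncio.wait_for(task, timeout=timeout_ms / 1000)
    except asyncio.TimeoutError:
        raise TimeoutError(f"query exceeded {timeout_ms} ms") from None


async def slow_query(state: dict) -> None:
    """Simulated long query that records whether it was cancelled."""
    try:
        await asyncio.sleep(10)
        state["finished"] = True
    except asyncio.CancelledError:
        state["cancelled"] = True  # cleanup hook for the cancelled work
        raise


def demo() -> dict:
    state: dict = {}

    async def main() -> None:
        try:
            await with_timeout(slow_query(state), 50)
        except TimeoutError:
            state["timed_out"] = True

    asyncio.run(main())
    return state
```

The point of cancelling, rather than merely abandoning the await, is that the slow work stops consuming resources once the caller has given up.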
109119

110120
## Threat Model
111121

@@ -118,7 +128,9 @@ Every chunk: SHA3-256 CID. Every resource: Merkle root over sorted chunk CIDs. E
118128
| Data exfiltration | DataClassification blocks CONFIDENTIAL/RESTRICTED |
119129
| Prompt injection | Membrane innate scan + quarantine |
120130
| SQL injection | Parameterized queries only |
121-
| Path traversal | Name sanitization strips path components |
131+
| Path traversal | Name sanitization + source path resolve rejects `..` |
132+
| Unicode homographs | NFC normalization maps canonically equivalent sequences (e.g., composed vs. decomposed accents) to one form, preventing visually identical name collisions |
133+
| CLI information leakage | Structured error codes, no raw exception output |
122134
| FTS5 injection | Query sanitizer strips operators |
123135
| Audit manipulation | Capsule hash-chains are append-only |
124136
| Key compromise | ML-KEM-768 (quantum-resistant) + zeroization |
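
The name and path protections in the tables above can be illustrated with the standard library. The sketch below is hedged: `sanitize_name` and `reject_traversal` are hypothetical helpers, not qp-vault functions, and the order of the steps is an assumption.

```python
import unicodedata
from pathlib import PurePosixPath


def sanitize_name(name: str, max_len: int = 255) -> str:
    """Normalize a resource name and strip path components."""
    name = unicodedata.normalize("NFC", name)  # one form per equivalent string
    name = name.replace("\x00", "")            # remove null bytes
    name = name.replace("\\", "/")             # treat backslashes as separators
    name = PurePosixPath(name).name            # keep only the last component
    if name in ("", ".", ".."):
        raise ValueError("resource name is empty after sanitization")
    return name[:max_len]


def reject_traversal(source: str) -> None:
    """Raise if any component of the source path is '..'."""
    parts = PurePosixPath(source.replace("\\", "/")).parts
    if ".." in parts:
        raise ValueError("path traversal detected in source path")
```

Note that NFC only unifies canonically equivalent sequences; lookalike characters from different scripts remain distinct after normalization.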
