All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Knowledge Graph: New `qp_vault.graph` subpackage. Access via `vault.graph` on any AsyncVault or Vault instance.
- GraphEngine: Typed async CRUD for nodes, edges, mentions, traversal, merge, and scan. Every mutation fires a VaultEvent for capsule audit.
- GraphStorageBackend Protocol: 20-method storage contract. PostgreSQL (pg_trgm similarity, recursive CTE traversal) and SQLite (FTS5 search, Python BFS traversal) backends.
- KnowledgeExtractor: LLM-based entity/relationship extraction with membrane sanitization (NFKC normalization, HTML escaping, XML wrapping). Validation caps: 200 entities, 500 relationships.
- EntityResolver: Three-stage dedup cascade (exact match, FTS/trigram search, create-on-miss).
- EntityDetector: In-memory name matching (10k entity index, 50k text cap). Optional fuzzy mode via EntityResolver.
- EntityMaterializer: Generates `profile.md` (wikilinks, properties, relationships, mentions) and `manifest.json` per entity.
- WikilinkResolver: Parse and resolve `[[Entity Name]]` and `[[Entity Name|Display Text]]` syntax. Code fence exclusion. Case-insensitive dedup.
- Graph-Augmented Search: `vault.search(query, graph_boost=True)` detects entities in queries and boosts matching documents (15% relevance boost).
- Membrane Sanitization: `sanitize_for_extraction()` for LLM extraction input.
- 10 Graph EventTypes: `ENTITY_CREATE`, `ENTITY_UPDATE`, `ENTITY_DELETE`, `EDGE_CREATE`, `EDGE_DELETE`, `ENTITY_MERGE`, `MENTION_TRACK`, `SCAN_START`, `SCAN_COMPLETE`, `SCAN_FAIL`.
- `graph` optional extra in pyproject.toml (no additional deps required).
- 213 new graph tests (storage, capsule, intelligence, models, edge cases, materialization, extraction errors, security).
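
The wikilink behavior described above (two link forms, code fence exclusion, case-insensitive dedup) can be illustrated with a minimal sketch. This is not the WikilinkResolver implementation: `parse_wikilinks` and both regular expressions are hypothetical, and it assumes fences are delimited by triple backticks.

```python
import re

FENCE = "`" * 3  # triple backtick, built indirectly so this sketch nests cleanly in markdown

# Assumed patterns (not taken from qp_vault): fenced blocks and [[Target]] / [[Target|Display]].
FENCE_RE = re.compile(FENCE + r".*?" + FENCE, re.DOTALL)
WIKILINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def parse_wikilinks(text: str) -> list[tuple[str, str]]:
    """Return (target, display) pairs, skipping fenced code and case-insensitive duplicates."""
    without_fences = FENCE_RE.sub("", text)
    seen: set[str] = set()
    links: list[tuple[str, str]] = []
    for match in WIKILINK_RE.finditer(without_fences):
        target = match.group(1).strip()
        display = (match.group(2) or target).strip()
        key = target.casefold()  # case-insensitive dedup key
        if key in seen:
            continue
        seen.add(key)
        links.append((target, display))
    return links
```

Only the first occurrence of a target is kept, so a later `[[entity name|Alias]]` does not override an earlier `[[Entity Name]]`; the real resolver may choose differently.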
- Input validation at every boundary: name (500 chars), entity_type (50 chars), relation_type (100 chars), properties (50KB, 2000 chars/value), tags (50 max, 100 chars each), weight (0.0-1.0), null byte stripping.
- Self-edge rejection, self-merge rejection, direction enum validation, limit capping (10,000), context_for ID cap (50).
- `graph_schema` parameter validated as SQL identifier (`^[a-zA-Z_][a-zA-Z0-9_]*$`).
- `source_label` in sanitization validated as alpha-only.
- LLM extraction output: property key cap (20/entity, 100 chars), property value cap (500 chars).
- `PostgresBackend.__init__()` accepts `graph_schema` parameter (default `"qp_vault"`, set to `"quantumpipes"` for Core migration compatibility).
- `vault.search()` accepts `graph_boost: bool = False` parameter.
- `__init__.py` exports `GraphStorageBackend`, `GraphEngine`, `GraphNode`, `GraphEdge` (lazy-loaded).
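
As a rough illustration of the identifier check and the membrane sanitization steps listed above (NFKC normalization, HTML escaping, XML wrapping, alpha-only label), the sketch below uses only the standard library. The function names `validate_schema_name` and `sanitize_sketch` are hypothetical, and the order of the steps is an assumption rather than the behavior of `sanitize_for_extraction()`.

```python
import html
import re
import unicodedata

# The identifier pattern quoted in the changelog entry above.
IDENTIFIER_RE = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")
# Assumption: "alpha-only" means ASCII letters.
LABEL_RE = re.compile(r"[A-Za-z]+")


def validate_schema_name(name: str) -> str:
    """Reject anything that is not a plain SQL identifier before it reaches a query."""
    if not IDENTIFIER_RE.fullmatch(name):
        raise ValueError(f"invalid schema name: {name!r}")
    return name


def sanitize_sketch(text: str, source_label: str = "document") -> str:
    """Normalize, escape, and wrap untrusted text before it is shown to an LLM."""
    if not LABEL_RE.fullmatch(source_label):
        raise ValueError(f"invalid source label: {source_label!r}")
    normalized = unicodedata.normalize("NFKC", text)  # fold compatibility characters
    escaped = html.escape(normalized)  # neutralize markup-like content
    return f"<{source_label}>{escaped}</{source_label}>"
```

`fullmatch` is used instead of `match` so that a trailing newline cannot slip past the `$` anchor.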
- PostgresBackend SSL: `sslmode=prefer` by default. Previously defaulted to `ssl=True`, which forced SSL negotiation, so servers without TLS certificates (Docker, local dev, private cloud networks) rejected the connection. Now defaults to `ssl="prefer"`: try SSL first, fall back to plaintext on rejection. Matches the standard `libpq` default behavior.
- `PostgresBackend.__init__()` `ssl` parameter changed from `bool` (default `True`) to `str` (default `"prefer"`). Accepts `"prefer"`, `"require"`, `"disable"`, `"verify-full"`, `"verify-ca"`. `True`/`False` still accepted for backward compat.
- DSN-level `sslmode=` always takes precedence over the constructor `ssl` argument.
- 5 new tests for SSL mode normalization and backward compatibility.
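
A minimal sketch of how the SSL mode normalization and DSN precedence described above might look. The helper names are hypothetical, and the mapping of the legacy booleans (`True` to `"require"`, `False` to `"disable"`) is an assumption; the changelog only states that both are still accepted.

```python
from typing import Union
from urllib.parse import parse_qs, urlsplit

VALID_SSL_MODES = {"prefer", "require", "disable", "verify-full", "verify-ca"}


def normalize_ssl_mode(ssl: Union[bool, str] = "prefer") -> str:
    """Map legacy booleans and string modes onto one of the accepted sslmode values."""
    if ssl is True:
        return "require"  # assumed legacy mapping
    if ssl is False:
        return "disable"  # assumed legacy mapping
    mode = str(ssl).lower()
    if mode not in VALID_SSL_MODES:
        raise ValueError(f"unsupported ssl mode: {ssl!r}")
    return mode


def effective_ssl_mode(dsn: str, ssl: Union[bool, str] = "prefer") -> str:
    """A DSN-level sslmode= wins over the constructor argument."""
    params = parse_qs(urlsplit(dsn).query)
    if "sslmode" in params:
        return normalize_ssl_mode(params["sslmode"][0])
    return normalize_ssl_mode(ssl)
```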
- FIX-4: Adversarial status persistence: `AdversarialVerifier.get_status()` now falls back to the storage backend when the in-memory cache misses. Adversarial status (`verified`, `suspicious`, `unverified`) survives process restarts. Previously, all resources reverted to `UNVERIFIED` on restart.
- `AdversarialVerifier` constructor accepts optional `storage` parameter for database-backed status reads
- 4 new tests for adversarial persistence (set/get roundtrip, restart survival, default status, status transitions)
- FIX-1 (quarantine block), FIX-2 (classification enforcement), FIX-3 (freshness decay), FIX-5 (PostgreSQL SSL), FIX-6 (export/import content), FIX-7 (provenance auto-verify) were already resolved in prior releases. FIX-4 was the only remaining gap from the Wizard security audit.
- Event Subscription: `vault.subscribe(callback)` registers sync or async callbacks for mutation events (CREATE, UPDATE, DELETE, LIFECYCLE_TRANSITION). Returns unsubscribe function. 5-second timeout on async callbacks. 100-subscriber cap. Error isolation: failing callbacks never block vault operations.
- Reprocess: `vault.reprocess(resource_id)` re-chunks and re-embeds existing resources. Useful when embedding models change or chunking parameters are updated. Emits UPDATE subscriber event with `reprocessed=True`.
- Text-Only Search Fallback: `vault.search()` automatically degrades to text-only mode (`vector_weight=0.0`, `text_weight=1.0`) when no embedder is configured. Search works on day one without requiring an embedding model.
- Find By Name: `vault.find_by_name(name)` case-insensitive resource lookup. Returns first matching non-deleted resource or None.
- 8 New REST Endpoints: `POST /resources/{id}/reprocess`, `POST /grep`, `GET /resources/by-name`, `GET /resources/{old_id}/diff/{new_id}`, `POST /resources/multiple`, `PATCH /resources/{id}/adversarial`, `POST /import`, `GET /resources/by-name` (30 total endpoints).
- 117 new tests across 6 test files (subscribe: 12, reprocess: 6, grep compat: 9, text fallback: 5, find_by_name: 5, integration: 46, security: 6, FastAPI: 15, cross-feature: 7, RBAC: 3, edge cases: 3)
- Replaced `assert` statements with `if/raise VaultError` (survives optimized bytecode)
- Subscriber callback timeout (5s via `asyncio.wait_for`) prevents event loop blocking
- Subscriber cap (`_MAX_SUBSCRIBERS=100`) prevents memory exhaustion
- Snapshot-copy of subscriber list before iteration prevents mutation during notify
- Adversarial status endpoint validates against allowlist (`unverified`, `verified`, `suspicious`, `quarantined`)
- Import endpoint rejects path traversal (`..` in path components)
- `get_multiple` endpoint coerces all IDs to strings (prevents type confusion)
- `find_by_name` enforces RBAC `search` permission
- Security score: 100/100 (bandit clean, pip-audit clean)
- Total tests: 871+ (up from 520 at v1.3.0)
- REST endpoints: 30 (up from 22)
- Documentation updated: api-reference.md, fastapi.md, streaming-and-telemetry.md
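
The subscriber safeguards listed above (cap, per-callback timeout, snapshot copy, error isolation) can be sketched as a small standalone event bus. `EventBus` and its method names are hypothetical stand-ins and do not reproduce the vault's internals; only the 100-subscriber cap, the 5-second timeout, and the use of `asyncio.wait_for` come from the entries above.

```python
import asyncio
import inspect
from typing import Any, Callable, Dict, List

MAX_SUBSCRIBERS = 100       # cap from the changelog entry
CALLBACK_TIMEOUT_S = 5.0    # async callback timeout from the changelog entry


class EventBus:
    """Hypothetical sketch: sync or async callbacks, isolated from the caller."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Dict[str, Any]], Any]] = []

    def subscribe(self, callback: Callable[[Dict[str, Any]], Any]) -> Callable[[], None]:
        if len(self._subscribers) >= MAX_SUBSCRIBERS:
            raise RuntimeError("subscriber cap reached")
        self._subscribers.append(callback)

        def unsubscribe() -> None:
            if callback in self._subscribers:
                self._subscribers.remove(callback)

        return unsubscribe

    async def notify(self, event: Dict[str, Any]) -> None:
        # Iterate over a snapshot so callbacks may subscribe or unsubscribe safely.
        for callback in list(self._subscribers):
            try:
                result = callback(event)
                if inspect.isawaitable(result):
                    await asyncio.wait_for(result, timeout=CALLBACK_TIMEOUT_S)
            except Exception:
                # Error isolation: a failing or slow callback never blocks the caller.
                continue
```

A production version would likely log the swallowed exception; it is dropped here to keep the sketch short.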
- World-class grep engine: Complete rewrite of `AsyncVault.grep()` with three-signal blended scoring
- Single-pass FTS5 OR query (SQLite): one database round-trip regardless of keyword count, replacing the previous N+1 per-keyword search loop
- Single-pass ILIKE + trigram query (PostgreSQL): per-keyword CASE expressions with `GREATEST(similarity(...))` scoring
- Three-signal scoring: keyword coverage (Lucene coord factor as multiplier), native text rank (FTS5 bm25 / pg_trgm), term proximity (cover density ranking)
- Keyword highlighting: `explain_metadata.snippet` with configurable markers
- Scoring breakdown: `explain_metadata` includes `matched_keywords`, `hit_density`, `text_rank`, `proximity`, and `snippet`
- `StorageBackend.grep()` protocol method: dedicated storage-level grep for both SQLite and PostgreSQL backends
- `grep_utils.py`: shared utilities for FTS5 query building, keyword sanitization, snippet generation, keyword matching, and proximity scoring
- `GrepMatch` dataclass: lightweight intermediate result type for storage-to-vault layer communication
- `VaultConfig.grep_rank_weight` and `VaultConfig.grep_proximity_weight`: configurable scoring weights
- 82 new grep tests across `test_grep.py` (51 tests) and `test_grep_utils.py` (31 tests)
- Encryption test skip guards: `test_v1_features.py`, `test_coverage_gaps.py`, `test_encryption.py` now correctly skip when `[encryption]` extra is not installed
- Grep scoring formula: `coverage * (rank_weight * text_rank + proximity_weight * proximity)` replaces flat density-only scoring
- Coverage acts as a multiplier (Lucene coord factor pattern): 3/3 keywords = full score, 1/3 = 33% score
1.0.0 - 2026-04-07
- `vault.upsert(source, name=...)`: add-or-replace atomically. Supersedes existing resource with same name + tenant
- `vault.get_multiple(resource_ids)`: batch retrieval in a single query (added to Protocol, both backends)
- Migration guide: `docs/migration.md`
- Deployment guide: `docs/deployment.md` (PostgreSQL, SSL, encryption, scaling)
- Troubleshooting guide: `docs/troubleshooting.md` (error codes, common issues)
- BREAKING: `trust` parameter renamed to `trust_tier` on `add()`, `list()`, `update()`, `add_batch()`, `replace()`
- BREAKING: `trust_min` parameter renamed to `min_trust_tier` on `search()`, `search_with_facets()`
- BREAKING: `LayerDefaults.trust` field renamed to `LayerDefaults.trust_tier`
- Classifier upgraded from "Alpha" to "Production/Stable"
0.16.0 - 2026-04-06
- Membrane ADAPTIVE_SCAN: LLM-based semantic content screening (Stage 3 of 8). Detects obfuscated prompt injection, encoded payloads, social engineering, and semantic attacks that regex patterns miss
- LLMScreener Protocol: Pluggable interface for any LLM backend. Implements structural subtyping (same pattern as EmbeddingProvider)
- OllamaScreener: Air-gap-safe screener using local Ollama instance. Hardened system prompt isolates content-under-review from instructions
- ScreeningResult dataclass: Structured result with risk_score (0.0-1.0), reasoning, and flags list
- `llm_screener` parameter on `Vault()` and `AsyncVault()` constructors
- Aggregate risk scoring in MembranePipelineStatus (max of non-skipped stages)
- Adaptive scan is optional: without an `llm_screener`, the stage SKIPs (no LLM dependency required)
- Content truncated to configurable max (default 4000 chars) before LLM evaluation
- LLM errors are caught and result in SKIP (never blocks ingestion due to LLM failure)
- System prompt hardened: content placed in `<document>` block, explicit instruction not to follow commands within it
0.15.0 - 2026-04-06
- Membrane blocks dangerous content: FAIL result rejects outright (raises VaultError). Quarantined resources excluded from `get_content()`. Adversarial status set to SUSPICIOUS on quarantine
- PostgreSQL SSL by default: `asyncpg` pool uses SSL unless `sslmode=disable` in DSN. New config: `postgres_ssl` (default True), `postgres_ssl_verify` (default False)
- SQLite file permissions: New databases created with `0600` (owner-only rw). WAL and SHM files also restricted
- ML-KEM-768 FIPS KAT: Roundtrip + tampered-ciphertext Known Answer Test added to `run_all_kat()`
- Provenance self-sign trusted: Self-signed attestations now `signature_verified=True` (was False when no verify_fn)
0.14.0 - 2026-04-06
- Tenant lock enforcement: `Vault(path, tenant_id="x")` now actively rejects operations with mismatched `tenant_id` and auto-injects the locked tenant when none is provided
- Query timeouts: `_with_timeout()` wraps storage search with `asyncio.wait_for` and proper task cancellation on timeout. PostgreSQL pool gets `command_timeout` parameter
- Health/status response caching: TTL-based cache (default 30s via `health_cache_ttl_seconds`) avoids full vault scans on repeated calls; cache invalidated on add/update/delete
- Atomic tenant quotas: `count_resources()` Protocol method replaces the previous list+offset approach, eliminating TOCTOU race condition
- Plugin manifest required: `manifest.json` is now mandatory when `verify_hashes=True` (default). Files not listed in manifest are rejected. Entire directory skipped if manifest missing
- FastAPI validation: `limit` (1-1000), `offset` (0-1M), `content` max_length (500MB) validated at API boundary
- Path traversal protection: `add()` resolves paths and rejects those containing `..`
- ReDoS protection: Membrane innate scan truncates content to 500KB before regex matching
- CLI error sanitization: `_safe_error_message()` returns structured error codes, never raw exception details
- Unicode normalization: `_sanitize_name()` applies NFC normalization to prevent homograph collisions
- Timeout cancellation: Timed-out tasks are cancelled (not left running in background)
- Sync Vault missing tenant_id/role: `Vault.__init__` now accepts and passes `tenant_id` and `role` to `AsyncVault` (was silently ignoring both)
- mypy strict compliance: 0 errors across 54 source files without disabling checks
- Abstraction leak: `create_collection()` and `list_collections()` now use Protocol methods instead of directly accessing `_get_conn()`
- None-safety: Added null checks before `.value` access in resource_manager and search_engine
- All magic numbers extracted to named constants
- All 16 StorageBackend Protocol methods have docstrings
- Error message punctuation normalized
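
Two of the 0.14.0 mechanisms, query timeouts with task cancellation and TTL-based response caching, are sketched below under stated assumptions. `with_timeout` and `TTLCache` are hypothetical names; the vault's own `_with_timeout()` and health cache may differ in signature and detail.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable


async def with_timeout(awaitable: Awaitable[Any], timeout_s: float) -> Any:
    """Await with a deadline; asyncio.wait_for cancels the inner task when it expires."""
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout_s)
    except asyncio.TimeoutError:
        raise TimeoutError(f"query exceeded {timeout_s}s") from None


class TTLCache:
    """Cache one computed value for ttl_seconds; invalidate() forces a recompute."""

    def __init__(self, ttl_seconds: float = 30.0, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._value: Any = None
        self._expires_at = 0.0
        self._has_value = False

    def get(self, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        if not self._has_value or now >= self._expires_at:
            self._value = compute()
            self._expires_at = now + self._ttl
            self._has_value = True
        return self._value

    def invalidate(self) -> None:
        # Called after add/update/delete so the next read is recomputed.
        self._has_value = False
```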
0.13.0 - 2026-04-06
- RBAC framework: Role enum (READER, WRITER, ADMIN) with permission matrix. Enforced at Vault API boundary.
- Key zeroization: `zeroize()` function using ctypes memset for secure key erasure
- FIPS Known Answer Tests: `run_all_kat()` for SHA3-256 and AES-256-GCM self-testing
- Structured error codes: All exceptions have machine-readable codes (VAULT_000 through VAULT_700)
- Query timeout config: `query_timeout_ms` in VaultConfig (default 30s)
- Health response caching: `health_cache_ttl_seconds` in VaultConfig (default 30s)
- RBAC permission checks on all Vault methods
- PermissionError (VAULT_700) for unauthorized operations
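
For readers unfamiliar with the ctypes memset approach mentioned under key zeroization, here is a minimal sketch. `zeroize_sketch` is a hypothetical function, not the library's `zeroize()`, and it assumes key material is held in a mutable `bytearray`, since immutable `bytes` objects cannot be safely overwritten in place.

```python
import ctypes


def zeroize_sketch(buffer: bytearray) -> None:
    """Overwrite a mutable buffer with zero bytes in place."""
    if not isinstance(buffer, bytearray):
        raise TypeError("only mutable bytearray buffers can be wiped in place")
    length = len(buffer)
    if length == 0:
        return
    # Map a ctypes array onto the bytearray's memory, then memset it to zero.
    view = (ctypes.c_char * length).from_buffer(buffer)
    ctypes.memset(ctypes.addressof(view), 0, length)
```

This clears the buffer the caller holds; copies made elsewhere by the interpreter or by other libraries are outside its reach.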
0.12.0 - 2026-04-06
- Post-quantum cryptography (delivered):
- `MLKEMKeyManager`: ML-KEM-768 key encapsulation (FIPS 203)
- `MLDSASigner`: ML-DSA-65 digital signatures (FIPS 204)
- `HybridEncryptor`: ML-KEM-768 + AES-256-GCM hybrid encryption
- `[pq]` installation extra: `pip install qp-vault[pq]`
- Input bounds: `top_k` capped at 1000, `threshold` range 0-1, query max 10K chars
- Batch limits: max 100 items per `/batch` request
- Plugin hash verification: `manifest.json` with SHA3-256 hashes in plugins_dir
- Tenant-locked vault: `Vault(path, tenant_id="x")` enforces single-tenant scope
- SearchRequest Pydantic validators prevent unbounded parameter attacks
- Plugin files verified against manifest before execution
0.11.0 - 2026-04-06
- Complete CLI: 8 new commands (content, replace, supersede, collections, provenance, export, health, list, delete, transition, expiring)
- Search faceting: `vault.search_with_facets()` returns results + facet counts by trust tier, resource type, classification
- FastAPI parity: 7 new endpoints (content, provenance, collections CRUD, faceted search, batch, export)
- Per-tenant quotas: `config.max_resources_per_tenant` enforced in `vault.add()`
- Missing storage indexes: `data_classification`, `resource_type` in SQLite and PostgreSQL
- CLI now has 15 commands (complete surface)
- FastAPI now has 22+ endpoints (complete surface)
0.10.0 - 2026-04-06
- Search intelligence: deduplication (one result per resource), pagination offset, explain mode (scoring breakdown)
- Knowledge self-healing: semantic near-duplicate detection, contradiction detection (trust/lifecycle conflicts)
- Real-time event streaming: VaultEventStream for subscribing to vault mutations
- Telemetry: VaultTelemetry with operation counters, latency, error rates
- Per-resource health: vault.health(resource_id) for individual quality assessment
- Import/export: vault.export_vault(path) and vault.import_vault(path) for portable vaults
- `[atlas]` extra (no implementation; removed to avoid confusion)
0.9.0 - 2026-04-06
- Content Immune System (CIS): Multi-stage content screening pipeline
- Innate scan: pattern-based detection (prompt injection, jailbreak, XSS blocklists)
- Release gate: risk-proportionate gating (pass/quarantine/reject)
- Wired into `vault.add()`: content screened before indexing
- Quarantined resources get `ResourceStatus.QUARANTINED`
- New CLI commands: `vault health`, `vault list`, `vault delete`, `vault transition`, `vault expiring`
- `vault.add_batch(sources)` for bulk import
- PostgreSQL schema parity: `adversarial_status`, `tenant_id`, `provenance` table, missing indexes
0.8.0 - 2026-04-06
- Encryption at rest: `AESGCMEncryptor` class (AES-256-GCM, FIPS 197). Install: `pip install qp-vault[encryption]`
- Built-in embedding providers:
- `NoopEmbedder` for explicit text-only search
- `SentenceTransformerEmbedder` for local/air-gap embedding (`pip install qp-vault[local]`)
- `OpenAIEmbedder` for cloud embedding (`pip install qp-vault[openai]`)
- Docling parser: 25+ format document processing (PDF, DOCX, PPTX, etc.). Install: `pip install qp-vault[docling]`
- `PluginRegistry.fire_hooks()`: plugin lifecycle hooks are now invoked
- `[local]` and `[openai]` installation extras
- README updated: encryption and docling marked as delivered (were "planned")
0.7.0 - 2026-04-06
- Multi-tenancy: `tenant_id` parameter on `add()`, `list()`, `search()`, and all public methods
- `tenant_id` column in SQLite and PostgreSQL storage schemas with index
- Tenant-scoped search: queries filter by `tenant_id` when provided
- `vault.create_collection()` and `vault.list_collections()`: Collection CRUD
- Auto-detection of qp-capsule: if installed, `CapsuleAuditor` is used automatically (no manual wiring)
0.6.0 - 2026-04-06
- `vault.get_content(resource_id)`: retrieve full text content (reassembles chunks)
- `vault.replace(resource_id, new_content)`: atomic content replacement with auto-supersession
- `vault.get_provenance(resource_id)`: retrieve provenance records for a resource
- `vault.set_adversarial_status(resource_id, status)`: persist adversarial verification status
- `adversarial_status` column in storage schemas (persisted, was RAM-only)
- `provenance` table in storage schemas (persisted, was RAM-only)
- `updated_at`, `resource_type`, `data_classification` fields on `SearchResult` model
- Layer `search_boost` applied in ranking (OPERATIONAL 1.5x, STRATEGIC 1.0x)
- Freshness decay: was hardcoded to 1.0, now computed from `updated_at` with per-tier half-life
- Layer `search_boost`: was defined per layer but never applied in `apply_trust_weighting()`
- README badges corrected: removed undelivered encryption/FIPS claims, fixed test count
- Encryption (`[encryption]`) and docling (`[docling]`) extras marked as "planned v0.8"
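
The freshness decay fix above says only that freshness is computed from `updated_at` with a per-tier half-life. One common way to do that is exponential decay, sketched below; the exact curve, the function name `freshness_sketch`, and the example half-life are assumptions, not values taken from the library.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def freshness_sketch(
    updated_at: datetime,
    half_life_days: float,
    now: Optional[datetime] = None,
) -> float:
    """Exponential decay: 1.0 when just updated, 0.5 after one half-life."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    current = now if now is not None else datetime.now(timezone.utc)
    # Clamp negative ages (clock skew, future timestamps) to zero.
    age_days = max((current - updated_at).total_seconds(), 0.0) / 86400.0
    return 0.5 ** (age_days / half_life_days)
```

With a hypothetical 30-day half-life, a resource last updated 60 days ago would score 0.25 under this curve.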
0.5.0 - 2026-04-06
- Plugin system with `@embedder`, `@parser`, `@policy` decorators
- Air-gap plugin loading via `--plugins-dir` (drop .py files)
- Entry point discovery for installed plugin packages
- FastAPI routes via `create_vault_router()` (`[fastapi]` extra)
- All REST endpoints: resources CRUD, search, verify, health, lifecycle, proof
0.4.0 - 2026-04-06
- Memory layers: OPERATIONAL, STRATEGIC, COMPLIANCE with per-layer defaults
- `vault.layer(MemoryLayer.OPERATIONAL)` returns scoped LayerView
- COMPLIANCE layer audits every read operation
- Integrity detection: staleness scoring, duplicate detection, orphan detection
- `vault.health()` composite score (0-100): coherence, freshness, uniqueness, connectivity
- `vault.status()` includes `layer_details` breakdown
0.3.0 - 2026-04-06
- Knowledge lifecycle state machine: DRAFT, REVIEW, ACTIVE, SUPERSEDED, EXPIRED, ARCHIVED
- `vault.transition()`, `vault.supersede()`, `vault.chain()`, `vault.expiring()`
- Temporal validity: `valid_from`, `valid_until` on resources
- `vault.export_proof()` for Merkle proof export (auditor-verifiable)
- Supersession chain cycle protection (max_length=1000)
0.2.0 - 2026-04-06
- `vault` CLI tool: init, add, search, inspect, status, verify
- Capsule audit integration (`[capsule]` extra)
- PostgreSQL + pgvector + pg_trgm storage backend (`[postgres]` extra)
- WebVTT and SRT transcript parsers with speaker attribution
- `Vault.from_postgres()` and `Vault.from_config()` factory methods
- FTS5 query sanitization (prevents injection via special characters)
- Parameterized SQL queries in PostgreSQL backend (no string interpolation)
0.1.0 - 2026-04-05
- Initial release
- `Vault` (sync) and `AsyncVault` (async) main classes
- 8 Pydantic domain models: Resource, Chunk, Collection, SearchResult, VaultEvent, VerificationResult, VaultVerificationResult, MerkleProof, HealthScore
- 10 enumerations: TrustTier, DataClassification, ResourceType, ResourceStatus, Lifecycle, MemoryLayer, EventType
- 5 Protocol interfaces: StorageBackend, EmbeddingProvider, AuditProvider, ParserProvider, PolicyProvider
- SQLite storage backend with FTS5 full-text search (zero-config default)
- Trust-weighted hybrid search: `relevance = (0.7 * vector + 0.3 * text) * trust_weight * freshness`
- SHA3-256 content-addressed storage (CID per chunk, Merkle root per resource)
- Semantic text chunker (token-aware, overlap, section detection)
- Built-in text parser (30+ file extensions, zero deps)
- JSON lines audit fallback (LogAuditor)
- VaultConfig with TOML loading
- Input validation: enum values, resource names, tags, metadata
- Path traversal protection (name sanitization, null byte stripping)
- Max file size enforcement (configurable)
- Content null byte stripping on ingest
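
The trust-weighted relevance formula from the initial release can be written out as a small function. This is an illustrative sketch with a hypothetical name; the 0.7 and 0.3 defaults come from the formula quoted above, while the trust weight used in the usage note is a made-up example value.

```python
def relevance_sketch(
    vector_score: float,
    text_score: float,
    trust_weight: float,
    freshness: float,
    vector_weight: float = 0.7,
    text_weight: float = 0.3,
) -> float:
    """(vector_weight * vector + text_weight * text) * trust_weight * freshness."""
    blended = vector_weight * vector_score + text_weight * text_score
    return blended * trust_weight * freshness
```

For example, with a vector score of 0.8, a text score of 0.5, a hypothetical trust weight of 1.5, and freshness of 1.0, the blended score is 0.71 and the final relevance is 1.065. Passing `vector_weight=0.0` and `text_weight=1.0` reduces the same formula to text-only ranking.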