Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 15 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [0.8.0] - 2026-04-06

### Added
- **Encryption at rest**: `AESGCMEncryptor` class (AES-256-GCM, FIPS 197). Install: `pip install qp-vault[encryption]`
- **Built-in embedding providers**:
- `NoopEmbedder` for explicit text-only search
- `SentenceTransformerEmbedder` for local/air-gap embedding (`pip install qp-vault[local]`)
- `OpenAIEmbedder` for cloud embedding (`pip install qp-vault[openai]`)
- **Docling parser**: 25+ format document processing (PDF, DOCX, PPTX, etc.). Install: `pip install qp-vault[docling]`
- `PluginRegistry.fire_hooks()` — plugin lifecycle hooks are now invoked
- `[local]` and `[openai]` installation extras

### Changed
- README updated: encryption and docling marked as delivered (were "planned")

## [0.7.0] - 2026-04-06

### Added
Expand Down
8 changes: 5 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -137,8 +137,10 @@ pip install qp-vault
| `pip install qp-vault` | SQLite, trust search, CAS, Merkle, lifecycle | **1** (pydantic) |
| `pip install qp-vault[postgres]` | + PostgreSQL + pgvector hybrid search | + sqlalchemy, asyncpg, pgvector |
| `pip install qp-vault[capsule]` | + Cryptographic audit trail | + [qp-capsule](https://github.com/quantumpipes/capsule) |
| `pip install qp-vault[docling]` | + 25+ format document processing (planned v0.8) | + docling |
| `pip install qp-vault[encryption]` | + AES-256-GCM encryption at rest (planned v0.8) | + cryptography, pynacl |
| `pip install qp-vault[docling]` | + 25+ format document processing (PDF, DOCX, etc.) | + docling |
| `pip install qp-vault[encryption]` | + AES-256-GCM encryption at rest | + cryptography, pynacl |
| `pip install qp-vault[local]` | + Local embeddings (sentence-transformers, air-gap safe) | + sentence-transformers |
| `pip install qp-vault[openai]` | + OpenAI embeddings (cloud) | + openai |
| `pip install qp-vault[fastapi]` | + REST API (15+ endpoints) | + fastapi |
| `pip install qp-vault[cli]` | + `vault` command-line tool | + typer, rich |
| `pip install qp-vault[all]` | Everything | All of the above |
Expand Down Expand Up @@ -246,7 +248,7 @@ app.include_router(router, prefix="/v1/vault")
|---|---|---|---|
| Content integrity | SHA3-256 | FIPS 202 | Tamper-evident CIDs and Merkle roots |
| Audit signatures | Ed25519 + ML-DSA-65 | FIPS 186-5, FIPS 204 | Via [qp-capsule](https://github.com/quantumpipes/capsule) (optional) |
| Encryption at rest | AES-256-GCM (planned v0.8) | FIPS 197 | Post-quantum key exchange (planned) |
| Encryption at rest | AES-256-GCM | FIPS 197 | `pip install qp-vault[encryption]` |
| Search integrity | Parameterized SQL | -- | No string interpolation, FTS5 sanitized |
| Input validation | Pydantic + custom | -- | Enum checks, name/tag/metadata limits |

Expand Down
8 changes: 7 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "qp-vault"
version = "0.7.0"
version = "0.8.0"
description = "Governed knowledge store for autonomous organizations. Trust tiers, cryptographic audit trails, content-addressed storage, air-gap native."
readme = "README.md"
license = "Apache-2.0"
Expand Down Expand Up @@ -55,6 +55,12 @@ postgres = [
docling = [
"docling>=2.73",
]
local = [
"sentence-transformers>=3.0",
]
openai = [
"openai>=1.0",
]
capsule = [
"qp-capsule>=1.5",
]
Expand Down
2 changes: 1 addition & 1 deletion src/qp_vault/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@
Docs: https://github.com/quantumpipes/vault
"""

__version__ = "0.7.0"
__version__ = "0.8.0"
__author__ = "Quantum Pipes Technologies, LLC"
__license__ = "Apache-2.0"

Expand Down
8 changes: 8 additions & 0 deletions src/qp_vault/embeddings/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# Copyright 2026 Quantum Pipes Technologies, LLC
# SPDX-License-Identifier: Apache-2.0

"""Built-in embedding providers for qp-vault."""

from qp_vault.embeddings.noop import NoopEmbedder

__all__ = ["NoopEmbedder"]
26 changes: 26 additions & 0 deletions src/qp_vault/embeddings/noop.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# Copyright 2026 Quantum Pipes Technologies, LLC
# SPDX-License-Identifier: Apache-2.0

"""Noop embedding provider: explicit text-only search.

Use this when you intentionally want text-only search (FTS5/pg_trgm)
without vector similarity. Makes the choice explicit instead of silent.
"""

from __future__ import annotations


class NoopEmbedder:
"""Embedding provider that returns zero vectors.

Makes text-only search an explicit choice. When used, the search
formula degrades to: relevance = text_rank * trust_weight * freshness.
"""

@property
def dimensions(self) -> int:
return 0

async def embed(self, texts: list[str]) -> list[list[float]]:
"""Return empty embeddings (text-only mode)."""
return [[] for _ in texts]
51 changes: 51 additions & 0 deletions src/qp_vault/embeddings/openai.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
# Copyright 2026 Quantum Pipes Technologies, LLC
# SPDX-License-Identifier: Apache-2.0

"""OpenAI embedding provider (cloud).

Requires: pip install openai
"""

from __future__ import annotations

try:
from openai import AsyncOpenAI
HAS_OPENAI = True
except ImportError:
HAS_OPENAI = False


class OpenAIEmbedder:
"""OpenAI text-embedding-3-small (1536 dimensions).

Requires OPENAI_API_KEY environment variable or explicit api_key.

Args:
model: OpenAI embedding model name.
api_key: Optional API key (defaults to OPENAI_API_KEY env var).
"""

def __init__(
self,
model: str = "text-embedding-3-small",
api_key: str | None = None,
) -> None:
if not HAS_OPENAI:
raise ImportError(
"openai is required. Install with: pip install openai"
)
self._model = model
self._client = AsyncOpenAI(api_key=api_key) if api_key else AsyncOpenAI()
self._dimensions = 1536 if "small" in model else 3072

@property
def dimensions(self) -> int:
return self._dimensions

async def embed(self, texts: list[str]) -> list[list[float]]:
"""Generate embeddings via OpenAI API."""
response = await self._client.embeddings.create(
model=self._model,
input=texts,
)
return [d.embedding for d in response.data]
44 changes: 44 additions & 0 deletions src/qp_vault/embeddings/sentence.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
# Copyright 2026 Quantum Pipes Technologies, LLC
# SPDX-License-Identifier: Apache-2.0

"""Sentence Transformers embedding provider (local, air-gap safe).

Requires: pip install sentence-transformers
"""

from __future__ import annotations

try:
from sentence_transformers import SentenceTransformer
HAS_ST = True
except ImportError:
HAS_ST = False


class SentenceTransformerEmbedder:
"""Local embedding using sentence-transformers.

Default model: all-MiniLM-L6-v2 (384 dimensions, fast, good quality).
Air-gap safe: runs entirely on CPU, no internet after initial download.

Args:
model_name: HuggingFace model name. Default: all-MiniLM-L6-v2.
"""

def __init__(self, model_name: str = "all-MiniLM-L6-v2") -> None:
if not HAS_ST:
raise ImportError(
"sentence-transformers is required. "
"Install with: pip install sentence-transformers"
)
self._model = SentenceTransformer(model_name)
self._dimensions = self._model.get_sentence_embedding_dimension()

@property
def dimensions(self) -> int:
return self._dimensions # type: ignore[return-value]

async def embed(self, texts: list[str]) -> list[list[float]]:
"""Generate embeddings for a batch of texts."""
embeddings = self._model.encode(texts, convert_to_numpy=True)
return embeddings.tolist()
12 changes: 12 additions & 0 deletions src/qp_vault/encryption/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
# Copyright 2026 Quantum Pipes Technologies, LLC
# SPDX-License-Identifier: Apache-2.0

"""Encryption at rest for qp-vault.

Provides AES-256-GCM symmetric encryption for chunk content.
Requires: pip install qp-vault[encryption]
"""

from qp_vault.encryption.aes_gcm import AESGCMEncryptor

__all__ = ["AESGCMEncryptor"]
95 changes: 95 additions & 0 deletions src/qp_vault/encryption/aes_gcm.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
# Copyright 2026 Quantum Pipes Technologies, LLC
# SPDX-License-Identifier: Apache-2.0

"""AES-256-GCM encryption for vault content.

Each encrypt call generates a unique nonce. Ciphertext format:
nonce (12 bytes) || ciphertext || tag (16 bytes)

Requires: pip install qp-vault[encryption]
"""

from __future__ import annotations

import os

try:
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
HAS_CRYPTO = True
except ImportError:
HAS_CRYPTO = False


class AESGCMEncryptor:
"""AES-256-GCM symmetric encryption.

Args:
key: 32-byte encryption key. If None, generates a random key.

Usage:
enc = AESGCMEncryptor()
ciphertext = enc.encrypt(b"secret data")
plaintext = enc.decrypt(ciphertext)
"""

def __init__(self, key: bytes | None = None) -> None:
if not HAS_CRYPTO:
raise ImportError(
"cryptography is required for encryption. "
"Install with: pip install qp-vault[encryption]"
)
if key is None:
key = AESGCM.generate_key(bit_length=256)
if len(key) != 32:
raise ValueError("Key must be exactly 32 bytes (256 bits)")
self._key = key
self._aesgcm = AESGCM(key)

@property
def key(self) -> bytes:
"""The encryption key (32 bytes)."""
return self._key

def encrypt(self, plaintext: bytes, associated_data: bytes | None = None) -> bytes:
"""Encrypt data with AES-256-GCM.

Args:
plaintext: Data to encrypt.
associated_data: Optional authenticated but unencrypted data.

Returns:
nonce (12 bytes) || ciphertext || tag (16 bytes)
"""
nonce = os.urandom(12)
ciphertext = self._aesgcm.encrypt(nonce, plaintext, associated_data)
return nonce + ciphertext

def decrypt(self, data: bytes, associated_data: bytes | None = None) -> bytes:
"""Decrypt AES-256-GCM encrypted data.

Args:
data: nonce (12 bytes) || ciphertext || tag (16 bytes)
associated_data: Must match what was passed to encrypt().

Returns:
Decrypted plaintext.

Raises:
ValueError: If decryption fails (tampered data or wrong key).
"""
if len(data) < 28: # 12 nonce + 16 tag minimum
raise ValueError("Encrypted data too short")
nonce = data[:12]
ciphertext = data[12:]
try:
return self._aesgcm.decrypt(nonce, ciphertext, associated_data)
except Exception as e:
raise ValueError(f"Decryption failed: {e}") from e

def encrypt_text(self, text: str, associated_data: bytes | None = None) -> bytes:
"""Convenience: encrypt a UTF-8 string."""
return self.encrypt(text.encode("utf-8"), associated_data)

def decrypt_text(self, data: bytes, associated_data: bytes | None = None) -> str:
"""Convenience: decrypt to a UTF-8 string."""
return self.decrypt(data, associated_data).decode("utf-8")
12 changes: 12 additions & 0 deletions src/qp_vault/plugins/registry.py
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,18 @@ def register_hook(self, event: str, callback: Any) -> None:
"""Register a lifecycle hook callback."""
self._hooks.setdefault(event, []).append(callback)

async def fire_hooks(self, event: str, **kwargs: Any) -> None:
"""Invoke all registered hooks for an event."""
for callback in self._hooks.get(event, []):
try:
import asyncio
if asyncio.iscoroutinefunction(callback):
await callback(**kwargs)
else:
callback(**kwargs)
except Exception as e:
logger.warning("Hook %s failed: %s", event, e)

# --- Retrieval ---

def get_embedder(self, name: str) -> Any | None:
Expand Down
Loading
Loading