From 66498c6f16bf8c6c4eb805fdc3dec32d6486b2da Mon Sep 17 00:00:00 2001 From: "J. Kirby Ross" Date: Mon, 10 Nov 2025 05:43:17 -0800 Subject: [PATCH 01/25] refactor: Refine ADR-0004 documentation based on feedback This commit addresses 8 specific issues raised during review of the initial ADR-0004 documentation. The changes improve clarity, consistency, and technical precision across the specification. Fixes include: - **FEATURES.md**: - Rewrote acceptance criteria for F9-US-DEV to be specific and verifiable. - Removed the superseded F6 feature to avoid confusion with F9. - **SPEC.md**: - Corrected Mermaid diagram type casing ( -> ) to align with JSON standards. - Clarified that private blob decryption during pointer resolution is conditional, not assumed. - **TECH-SPEC.md**: - Removed redundant participant aliases in the projection sequence diagram. - Replaced the unconventional endpoint with a more RESTful URL. - Improved diagram precision by changing 'node' to 'field path'. - Specified JWS with a detached payload as the required authentication mechanism for pointer resolution. 
--- docs/FEATURES.md | 62 +++-- docs/SPEC.md | 60 +++-- docs/TECH-SPEC.md | 53 ++++- docs/USE-CASES.md | 10 + docs/decisions/ADR-0004/DECISION.md | 225 ++++++++++++++++++ docs/decisions/README.md | 1 + schemas/v1/privacy/opaque_pointer.schema.json | 46 ++++ 7 files changed, 407 insertions(+), 50 deletions(-) create mode 100644 docs/decisions/ADR-0004/DECISION.md create mode 100644 schemas/v1/privacy/opaque_pointer.schema.json diff --git a/docs/FEATURES.md b/docs/FEATURES.md index 9b655fc2..68068e7e 100644 --- a/docs/FEATURES.md +++ b/docs/FEATURES.md @@ -141,29 +141,6 @@ Each feature includes user stories per relevant stakeholders (format requested), --- -## F6 — Opaque Pointers & CAS - -### F6-US-DML - -| | | -|--|--| -| **As a...** | Data/ML Engineer | -| **I want..** | encrypted artifacts with verifiable pointers | -| **So that...** | I can ship models across untrusted storage | - -#### Acceptance Criteria - -- [ ] Pointer includes plaintext hash, ciphertext hash, cipher meta -- [ ] Rekey operation available - -#### Test Plan - -- [ ] Golden: decrypt with correct key → match plaintext hash -- [ ] Edge: wrong bytes → hash mismatch -- [ ] Failure: rekey without authorization → deny - ---- - ## F7 — Epochs & Compaction ### F7-US-PENG @@ -208,3 +185,42 @@ Each feature includes user stories per relevant stakeholders (format requested), - [ ] Golden: metrics show non-zero counters post workload - [ ] Edge: cache stale → doctor recommends rebuild - [ ] Failure: FF-only violation → doctor flags critical +--- + +## F9 — Hybrid Privacy Model + +See also: [ADR-0004](./decisions/ADR-0004/DECISION.md). 
+ +### F9-US-DEV + +| | | +|--|--| +| **As a...** | App Developer | +| **I want..** | to store sensitive data (PII, secrets) in a private store | +| **So that...** | my public, verifiable state does not contain confidential information | + +#### Acceptance Criteria + +- [ ] Given a `policy.yaml` file with a rule to `pointerize` the path `sensitive.field`, when the state is folded, the resulting public state tree MUST replace the value of `sensitive.field` with a canonical Opaque Pointer. +- [ ] Given a `policy.yaml` file with a rule to `pointerize` a field, the public state MUST contain a canonical Opaque Pointer at the specified path, with its `digest`, `location`, and `capability` fields correctly populated. +- [ ] When the Client SDK attempts to resolve an Opaque Pointer using the specified `location` and `capability`, and the client possesses the necessary authorization, the SDK MUST successfully retrieve and decrypt the original private data. + +### F9-US-SEC + +| | | +|--|--| +| **As a...** | Security/Compliance | +| **I want..** | to audit the separation of public and private data | +| **So that...** | I can verify that sensitive data is properly isolated and access is controlled | + +#### Acceptance Criteria + +- [ ] Opaque Pointer resolution fails without a valid capability. +- [ ] Private blob digest matches the digest in the public pointer. +- [ ] Commit trailers accurately report the number of redactions/pointers. + +#### Test Plan + +- [ ] Golden: project a unified state, resolve pointer, and verify content matches original. +- [ ] Edge: attempt to resolve a pointer with an invalid capability URI → DENY. +- [ ] Failure: tamper with a private blob → digest mismatch on resolution. 
diff --git a/docs/SPEC.md b/docs/SPEC.md index 041cf0bd..b9cc5310 100644 --- a/docs/SPEC.md +++ b/docs/SPEC.md @@ -128,6 +128,7 @@ graph TD A1 --> B6(audit) A1 --> B7(cache) A1 --> B8(epoch) + A1 --> B9(private) C(notes) --> C1(gatos) end subgraph Workspace @@ -148,6 +149,8 @@ The normative layout is as follows: │ └── gatos/ │ ├── journal/ │ ├── state/ +│ ├── private/ +│ │ └── / │ ├── mbus/ │ ├── mbus-ack/ │ ├── jobs/ @@ -282,28 +285,57 @@ On **DENY**, the gate **MUST** append an audit decision to `refs/gatos/audit/pol --- -## 7. Blob Pointers & Opaque Storage +## 7. Privacy and Opaque Pointers -Large or sensitive data is stored out-of-band in a content-addressed store and referenced via pointers. +See also: [ADR‑0004](./decisions/ADR-0004/DECISION.md). + +GATOS supports a hybrid privacy model where state can be separated into a verifiable public projection and a confidential private overlay. This is achieved by applying a deterministic **Projection Functor** during the state fold process, which replaces sensitive or large data with **Opaque Pointers**. + +### 7.1 Projection Model + +The State Engine (`gatos-echo`) can be configured with privacy rules. When folding history, it first computes a `UnifiedState` containing all data. It then applies the privacy rules to produce a `PublicState` and a set of `PrivateBlobs`. + +- **`PublicState`**: Contains only public data and Opaque Pointers. This is committed to the public `refs/gatos/state/public/...` namespace and is globally verifiable. +- **`PrivateBlobs`**: The raw data that was redacted or pointerized. This data is stored in a separate, private store (e.g., a local directory, a private object store) and is addressed by its content hash. + +Any commit that is the result of a privacy projection **MUST** include trailers indicating the number of redactions and pointers created. 
+ +```text +Privacy-Redactions: 5 +Privacy-Pointers: 2 +``` + +### 7.2 Opaque Pointers + +An Opaque Pointer is a canonical JSON object that acts as a verifiable, addressable link to a private blob. It replaces the sensitive data in the `PublicState`. ```mermaid classDiagram - class BlobPointer { - +String kind: "blobptr" - +String algo - +String hash - +Number size - } class OpaquePointer { - +String kind: "opaque" - +String algo - +String hash - +String ciphertext_hash - +Object cipher_meta + +string kind: "opaque_pointer" + +string algo: "blake3" + +string digest: "blake3:" + +number size + +string location + +string capability } ``` -Pointers **MUST** refer to bytes in `gatos/objects//`. For opaque objects, no plaintext **MAY** be stored in Git. +- `digest`: The **REQUIRED** `blake3` hash of the raw private data. This ensures the integrity of the private blob. +- `location`: A **REQUIRED** URI indicating where the blob can be fetched (e.g., `gatos-node://ed25519:`, `s3://...`). +- `capability`: A **REQUIRED** URI defining the auth/authz and decryption mechanism needed to access the blob (e.g., `gatos-key://...`, `kms://...`). + +The pointer itself is canonicalized and its `content_id` can be computed for verification purposes. + +### 7.3 Pointer Resolution + +A client resolving an Opaque Pointer **MUST** perform the following steps: +1. Fetch the private blob from the `location` URI, authenticating if required by the endpoint protocol. +2. Acquire the necessary authorization and/or decryption keys by interacting with the `capability` URI's system. +3. If the blob is encrypted, decrypt it. +4. Verify that the `blake3` hash of the resulting plaintext exactly matches the `digest` in the pointer. The resolution **MUST** fail if the hashes do not match. + +This process guarantees that even though the data is stored privately, its integrity is verifiable against the public ledger. 
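The verification in step 4 of the resolution steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the GATOS SDK: `verify_plaintext`, `stand_in_hash`, and the `DigestMismatch` class are hypothetical names, and the field and digest checks mirror `schemas/v1/privacy/opaque_pointer.schema.json`. The hash function is passed in because BLAKE3 is not in the Python standard library; the demo uses BLAKE2b purely as a stand-in, and a real client would supply a BLAKE3 implementation instead.

```python
import hashlib
import re
from typing import Callable

# Shape checks taken from the Opaque Pointer schema in this patch.
DIGEST_PATTERN = re.compile(r"blake3:[a-f0-9]{64}")
REQUIRED_FIELDS = ("kind", "algo", "digest", "location", "capability")


class DigestMismatch(Exception):
    """Hypothetical error type: the plaintext does not hash to the pointer digest."""


def stand_in_hash(data: bytes) -> str:
    # ASSUMPTION: BLAKE2b is only a placeholder that yields 64 hex characters.
    # The spec requires BLAKE3, which needs a third-party library.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def verify_plaintext(pointer: dict, plaintext: bytes,
                     hash_hex: Callable[[bytes], str]) -> bytes:
    """Return the plaintext only if it matches the digest in the pointer."""
    missing = [name for name in REQUIRED_FIELDS if name not in pointer]
    if missing:
        raise ValueError(f"pointer is missing required fields: {missing}")
    if not DIGEST_PATTERN.fullmatch(pointer["digest"]):
        raise ValueError("digest must look like 'blake3:' plus 64 hex characters")
    expected_hex = pointer["digest"].split(":", 1)[1]
    if hash_hex(plaintext) != expected_hex:
        # Resolution must fail here; a mismatch may indicate tampering.
        raise DigestMismatch("plaintext does not match the pointer digest")
    return plaintext
```

Passing the hash function as a parameter keeps the sketch runnable without extra dependencies while making it obvious where the real BLAKE3 call belongs. Fetching the blob and decrypting it (steps 1 to 3) are deliberately left out.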
--- diff --git a/docs/TECH-SPEC.md b/docs/TECH-SPEC.md index 4f991b54..0050740c 100644 --- a/docs/TECH-SPEC.md +++ b/docs/TECH-SPEC.md @@ -101,10 +101,10 @@ graph TD | `gatos-ledger-git` | `std`-dependent storage backend using `libgit2`. | | `gatos-ledger` | Composes ledger components via feature flags. | | `gatos-mind` | Asynchronous, commit-backed message bus (pub/sub). | -| `gatos-echo` | Deterministic state engine for processing events ("folds"). | -| `gatos-policy` | Deterministic policy engine for executing compiled rules and managing the Consensus Governance lifecycle. | +| `gatos-echo` | Deterministic state engine for processing events ("folds"). Privacy projection logic. | +| `gatos-policy` | Deterministic policy engine for executing compiled rules, managing Consensus Governance, and privacy rule evaluation. | | `gatos-kv` | Git-backed key-value state cache. | -| `gatosd` | Main binary for the CLI and the JSONL RPC daemon. | +| `gatosd` | Main binary for the CLI, JSONL RPC daemon, and Opaque Pointer resolution endpoint. | | `gatos-compute` | Worker that discovers and executes jobs from the Job Plane. | | `gatos-wasm-bindings`| WASM bindings for browser and Node.js environments. | | `gatos-ffi-bindings` | C-compatible FFI for integration with other languages. | @@ -160,22 +160,48 @@ sequenceDiagram --- -## 6. Opaque Pointers +## 6. Privacy Projection and Resolution -The `rekey` command allows updating the encryption key for an opaque blob. +See also: [ADR‑0004](./decisions/ADR-0004/DECISION.md). + +The implementation of the hybrid privacy model involves a coordinated effort between the state, policy, and daemon components. + +### 6.1 Projection Implementation + +The projection from a `UnifiedState` to a `PublicState` is handled by `gatos-echo` with rules supplied by `gatos-policy`. 
```mermaid sequenceDiagram - participant User - participant GATOS - - User->>GATOS: gatos blob rekey --to - GATOS->>GATOS: Create new Opaque Pointer - GATOS->>GATOS: Encrypt data with new pubkey - GATOS->>GATOS: Store new ciphertext in CAS - GATOS->>GATOS: Atomically update references + participant gatos-echo + participant gatos-policy + participant gatos-ledger + participant PrivateStore + + Echo->>Echo: 1. Fold event history to produce UnifiedState + Echo->>Policy: 2. Request privacy rules for the current context + Policy-->>Echo: 3. Return `select` and `action` rules +loop for each field path in the UnifiedState tree + gatos-echo->>gatos-echo: 4. Match field path against rules + alt rule matches (e.g., "pointerize") + Echo->>Echo: 5. Generate Opaque Pointer envelope + Echo->>PrivateStore: 6. Store original node value as private blob, keyed by its blake3 digest + Echo->>Echo: 7. Replace node in state tree with pointer + end + end + Echo->>Ledger: 8. Commit the final PublicState tree ``` +The `PrivateStore` is a pluggable trait, allowing for backends like a local filesystem, S3, or another GATOS node. + +### 6.2 Resolution Implementation + +The `gatosd` daemon exposes a secure endpoint for resolving Opaque Pointers. + +- **Endpoint**: `gatosd` will listen for authenticated requests, for example at `/gatos/private/blobs/{digest}`. +- **Authentication**: The client SDK **MUST** send an `Authorization` header containing a JSON Web Signature (JWS) with a detached payload. The JWS payload **MUST** be the BLAKE3 hash of the request body. `gatosd` verifies the signature against the actor's public key. +- **Authorization**: Upon receiving a valid request, `gatosd` queries `gatos-policy` to determine if the requesting actor has the capability to access the blob identified by `{digest}`. +- **Response**: If authorized, `gatosd` fetches the (likely encrypted) blob from its configured `PrivateStore` and returns it to the client. 
The client is then responsible for decryption via the `capability` URI. + --- ## 7. JSONL Protocol @@ -236,6 +262,7 @@ graph TD C --> C1(Golden Vectors); C --> C2(Torture Tests); C --> C3(Reconcile Harness); + C --> C4(Projection Determinism); ``` --- diff --git a/docs/USE-CASES.md b/docs/USE-CASES.md index f3c52cb7..26e7100c 100644 --- a/docs/USE-CASES.md +++ b/docs/USE-CASES.md @@ -83,3 +83,13 @@ This document illustrates practical scenarios where GATOS provides unique value. |**Goal** | Signed toggles with audit and rollbacks. | | **How** | KV‑style events + index refs; push‑gate for enforcement. | | **Why GATOS** | Auditable configuration without a new database. | + +--- + +## 9) Verifiable, Compliant PII Management + +| | | +|---|---| +|**Goal** | Manage customer data (PII) in a way that is both auditable and privacy-preserving. | +| **How** | A privacy policy projects the unified state into a public state with PII replaced by Opaque Pointers. The private data lives in an actor-anchored, encrypted blob store. | +| **Why GATOS** | Provides a verifiable public audit trail ("a user's data was accessed") without ever exposing the private data ("the user's address is...") to the public ledger. Access is gated by cryptographic capabilities. | \ No newline at end of file diff --git a/docs/decisions/ADR-0004/DECISION.md b/docs/decisions/ADR-0004/DECISION.md new file mode 100644 index 00000000..501016e7 --- /dev/null +++ b/docs/decisions/ADR-0004/DECISION.md @@ -0,0 +1,225 @@ +--- +Status: Accepted +Date: 2025-11-10 +ADR: ADR-0004 +Authors: [flyingrobots, gemini-agent] +Requires: [ADR-0001] +Related: [ADR-0002, ADR-0003] +Tags: [Privacy, Projection, Opaque Pointers, Morphology Calculus] +Schemas: + - schemas/v1/privacy/opaque_pointer.schema.json +--- + +# ADR‑0004: Hybrid Privacy Model (Public Projection + Private Overlay) + +## Scope + +This ADR defines a **hybrid privacy model** for the GATOS operating surface. 
It formalizes the separation of state into a public, verifiable component and a private, actor-anchored overlay. This is achieved by introducing a **Projection Functor** that transforms a unified state into a public projection, leaving sensitive data in a private store referenced by **Opaque Pointers**. + +## Rationale + +GATOS's core value proposition is its verifiable, deterministic public ledger. However, many real-world applications require storing sensitive or large data (PII, secrets, large binaries) without committing it to the public history. The previous ad-hoc approach of using local, out-of-repo storage lacks the formal guarantees required by the GATOS Morphology Calculus. + +This ADR makes the hybrid model **normative, deterministic, and provable**. It ensures that public state remains globally verifiable while private data is securely addressable, auditable, and tied to the GATOS identity and policy model. + +## Mathematical Foundation (Morphology Calculus) + +This model is a direct application of the GATOS Morphology Calculus. + +1. **Shape Categories**: We define three categories of shapes: + * `Sh_Unified`: The category of shapes containing both public and private data. + * `Sh_Public`: The category of shapes containing only public data and opaque pointers. + * `Sh_Private`: The category of shapes containing only the private data blobs. + +2. **Projection as a Functor**: The privacy model is implemented as a functor, `Proj`, which maps shapes and morphisms from the unified category to the public category. + `Proj: Sh_Unified -> Sh_Public` + + This functor applies the privacy policy rules (`redact`, `pointerize`) to transform a unified shape into its public projection. The private data is extracted into `Sh_Private` during this process. 
+ + ```mermaid + graph TD + subgraph Sh_Unified + U1("Unified Shape 1") + U2("Unified Shape 2") + U1 -- "Commit c" --> U2 + end + + subgraph Sh_Public + P1("Public Shape 1") + P2("Public Shape 2") + P1 -- "Proj(c)" --> P2 + end + + subgraph Sh_Private + B1("Private Blobs 1") + B2("Private Blobs 2") + end + + U1 -- "Proj" --> P1 + U2 -- "Proj" --> P2 + + U1 -- "Extract" --> B1 + U2 -- "Extract" --> B2 + + style P1 fill:#cde,stroke:#333 + style P2 fill:#cde,stroke:#333 + ``` + +This ensures that the transformation is structure-preserving and that the public history remains a valid, deterministic projection of the complete history. + +## Decision + +### 1. Actor-Anchored Private Namespace (Normative) + +Private data overlays are fundamentally tied to an actor's identity, not an ephemeral session. This anchors private data within the GATOS trust graph. + +- **Actor ID:** The canonical identifier for an actor, e.g., `ed25519:`. +- **Private Refs:** Private data is stored under refs namespaced by the actor ID. + ``` + refs/gatos/private/// + ``` +- **Public Refs:** The corresponding public projection lives in the main state namespace. + ``` + refs/gatos/state/public// + ``` + +### 2. Opaque Pointers (Normative) + +When private data is elided from the `PublicState`, a canonical JSON **Opaque Pointer** envelope is inserted in its place. + +```mermaid +classDiagram + class OpaquePointer { + +String kind: "opaque_pointer" + +String algo: "blake3" + +String digest: "blake3:" + +Number size + +String location + +String capability + } +``` + +- **`digest`**: The content-address of the private blob (`blake3(private_bytes)`). This is the immutable link between the public and private worlds. +- **`location`**: A URI indicating where to resolve the blob. Supported schemes include: + - `gatos-node://ed25519:`: Resolve via the GATOS trust graph. + - `https://...`, `s3://...`, `ipfs://...`: Standard distributed storage. + - `file:///...`: For local development and testing. 
+- **`capability`**: A URI defining the authorization and decryption mechanism required to access the blob. + - `gatos-key://v1/aes-256-gcm/`: A symmetric key managed by a GATOS-aware key service. + - `kms://...`, `age://...`, `sops://...`: Integration with standard secret management tools. + +The canonical `content_id` of the pointer itself is `blake3(canonical_json_bytes)`. + +**Schema:** `schemas/v1/privacy/opaque_pointer.schema.json` + +### 3. The Projection Function (Normative) + +The State Engine (`gatos-echo`) is responsible for executing the projection. + +1. It computes a **UnifiedState** by folding the complete event history. +2. It consults the **Privacy Policy** (`.gatos/policy.yaml`). +3. It traverses the `UnifiedState` tree, applying `redact` or `pointerize` rules. + - `redact`: The field is removed from the public state. + - `pointerize`: The field's value is stored as a private blob, and an Opaque Pointer is substituted in the public state. +4. The resulting `PublicState` is committed to the public refs, and the `Private Blobs` are persisted to their specified `location`. + +```mermaid +sequenceDiagram + participant E as State Engine (gatos-echo) + participant Pol as Policy Engine + participant L as Ledger (Git) + participant PS as Private Store + + E->>E: 1. Fold history into UnifiedState + E->>Pol: 2. Fetch privacy rules + Pol-->>E: 3. Return rules (redact/pointerize) + E->>E: 4. Apply rules to create PublicState + PrivateBlobs + E->>L: 5. Commit PublicState to public refs + E->>PS: 6. Store PrivateBlobs by digest +``` + +### 4. Pointer Resolution Protocol (Normative) + +A client resolving an Opaque Pointer **MUST** follow this protocol: + +1. **Parse Pointer**: Extract `digest`, `location`, and `capability`. +2. **Fetch Blob**: + - If `gatos-node://`, resolve the actor's endpoint from the trust graph. + - The client **MUST** send an authenticated request to the node (e.g., with a JWT or a signed challenge). 
+ - The node's endpoint (e.g., `GET /.well-known/gatos/private/{digest}`) **MUST** verify the client's authorization against its policy before returning the blob. +3. **Acquire Capability**: + - Parse the `capability` URI. + - Interact with the specified system (KMS, key server) to get the decryption key. This step will have its own auth/authz protocol. +4. **Decrypt and Verify**: + - Decrypt the fetched blob using the key. + - Compute `blake3(decrypted_bytes)`. + - The operation **MUST FAIL** if the computed hash does not exactly match the `digest` in the pointer. + +```mermaid +sequenceDiagram + participant C as Client + participant PN as Private GATOS Node + participant KMS as Key Management Service + + C->>C: 1. Read OpaquePointer + C->>PN: 2. GET /private/{digest} (Authenticated) + PN->>PN: 3. Check policy (is C allowed?) + alt Authorized + PN-->>C: 4. Return encrypted blob + C->>KMS: 5. Request key for {capability} + KMS-->>C: 6. Return decryption key + C->>C: 7. Decrypt blob + C->>C: 8. Verify blake3(decrypted) == digest + else Unauthorized + PN-->>C: 4. Return 403 Forbidden + end +``` + +### 5. Policy Hooks (Normative) + +The privacy policy is defined in `.gatos/policy.yaml` and extends the policy engine's domain. + +```yaml +privacy: + rules: + - select: "path.to.sensitive.data" + action: "pointerize" + capability: "gatos-key://v1/aes-256-gcm/ops-key-01" + location: "gatos-node://ed25519:" + - select: "path.to.transient.data" + action: "redact" +``` + +The `select` syntax will use a simple path-matching language (e.g., glob patterns) defined by the policy engine. + +### 6. Auditability and Trailers (Normative) + +To make privacy operations transparent and auditable, any commit that creates a `PublicState` from a projection **MUST** include the following trailers: + +``` +Privacy-Redactions: 3 +Privacy-Pointers: 12 +``` + +This provides a simple, top-level indicator that a projection has occurred, prompting auditors to look deeper if necessary. 
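The policy rules and commit trailers described in sections 5 and 6 above can be sketched together. This is a rough illustration under several assumptions, not the `gatos-echo` implementation: `project` and `stand_in_digest` are hypothetical names, `select` is treated as an exact dotted path (the ADR leaves the real matching language to the policy engine), values are serialized with sorted-key JSON as a placeholder for canonical encoding, and BLAKE2b stands in for BLAKE3, which is not in the Python standard library.

```python
import copy
import hashlib
import json


def stand_in_digest(data: bytes) -> str:
    # ASSUMPTION: placeholder hash only. The ADR requires BLAKE3.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def project(unified: dict, rules: list[dict], digest) -> tuple[dict, dict, dict]:
    """Apply redact/pointerize rules; return (public_state, private_blobs, trailers)."""
    public = copy.deepcopy(unified)  # never mutate the unified state
    private_blobs: dict[str, bytes] = {}
    trailers = {"Privacy-Redactions": 0, "Privacy-Pointers": 0}

    for rule in rules:
        *parents, leaf = rule["select"].split(".")
        node = public
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if not isinstance(node, dict) or leaf not in node:
            continue  # rule selects nothing in this state

        if rule["action"] == "redact":
            del node[leaf]
            trailers["Privacy-Redactions"] += 1
        elif rule["action"] == "pointerize":
            # Placeholder canonical form: compact JSON with sorted keys.
            blob = json.dumps(node[leaf], sort_keys=True,
                              separators=(",", ":")).encode("utf-8")
            hex_digest = digest(blob)
            private_blobs[hex_digest] = blob  # stored by digest
            node[leaf] = {
                "kind": "opaque_pointer",
                "algo": "blake3",
                "digest": f"blake3:{hex_digest}",
                "size": len(blob),
                "location": rule["location"],
                "capability": rule["capability"],
            }
            trailers["Privacy-Pointers"] += 1

    return public, private_blobs, trailers
```

The returned `trailers` mapping corresponds to the `Privacy-Redactions` and `Privacy-Pointers` lines a projection commit must carry. Encrypting the blob and writing it to the rule's `location` are outside this sketch.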
+ +## Consequences + +### Pros + +- **Provable Privacy**: The model is grounded in the Morphology Calculus, making it verifiable. +- **Decoupled Storage**: Private data can live in any storage system (S3, IPFS, local disk) without affecting the public ledger's logic. +- **Integrated Auth/Authz**: By tying pointers to actor identities and capabilities, access to private data is governed by the existing GATOS trust and policy model. +- **Preserves Verifiability**: The `PublicState` remains globally verifiable, as pointers are just content-addressed links. + +### Cons + +- **Increased Complexity**: Resolution requires network requests and interaction with key management systems, adding latency and potential points of failure. +- **Operational Overhead**: Operators must manage the private blob stores and ensure their availability and security. + +## Feature Payoff + +- **Secure PII/Secret Storage**: Store sensitive data off-chain while retaining an auditable link to it. +- **Large Artifact Management**: Handle large binaries (ML models, videos) without bloating the Git repository. +- **Compliant Data Sharing**: Share a public, redacted dataset with third parties while retaining private access to the full, unified view. +- **Federated Learning**: Different actors can hold private models locally, referenced by pointers in a public "training plan" shape. 
diff --git a/docs/decisions/README.md b/docs/decisions/README.md index 94b0994e..b61db3c4 100644 --- a/docs/decisions/README.md +++ b/docs/decisions/README.md @@ -20,3 +20,4 @@ Each ADR will have a status, typically one of the following: | [ADR-0001](./ADR-0001/DECISION.md) | Split gatos-ledger into no_std Core and std Backends | Accepted | 2025-11-08 | | [ADR-0002](./ADR-0002/DECISION.md) | Distributed Compute via a Job Plane | Accepted | 2025-11-08 | | [ADR-0003](./ADR-0003/DECISION.md) | Consensus Governance for Gated Actions | Accepted | 2025-11-08 | +| [ADR-0004](./ADR-0004/DECISION.md) | Hybrid Privacy Model (Public Projection + Private Overlay) | Accepted | 2025-11-10 | diff --git a/schemas/v1/privacy/opaque_pointer.schema.json b/schemas/v1/privacy/opaque_pointer.schema.json new file mode 100644 index 00000000..78a4d61b --- /dev/null +++ b/schemas/v1/privacy/opaque_pointer.schema.json @@ -0,0 +1,46 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "title": "GATOS Opaque Pointer", + "description": "A canonical pointer to a private data blob, used to replace sensitive or large data in a public state projection.", + "type": "object", + "properties": { + "kind": { + "description": "The object kind, MUST be 'opaque_pointer'.", + "type": "string", + "const": "opaque_pointer" + }, + "algo": { + "description": "The hashing algorithm used for the digest, MUST be 'blake3'.", + "type": "string", + "const": "blake3" + }, + "digest": { + "description": "The content-address of the private blob, prefixed with the algorithm.", + "type": "string", + "pattern": "^blake3:[a-f0-9]{64}$" + }, + "size": { + "description": "Optional: The size of the private blob in bytes.", + "type": "integer", + "minimum": 0 + }, + "location": { + "description": "A URI indicating where the private blob can be resolved.", + "type": "string", + "format": "uri" + }, + "capability": { + "description": "A URI defining the authorization and/or decryption mechanism for the blob.", + 
"type": "string", + "format": "uri" + } + }, + "required": [ + "kind", + "algo", + "digest", + "location", + "capability" + ], + "additionalProperties": false +} From 46ec339ffa46af25bc00c04741a68be0759d9569 Mon Sep 17 00:00:00 2001 From: "J. Kirby Ross" Date: Mon, 10 Nov 2025 05:55:53 -0800 Subject: [PATCH 02/25] docs: Clarify commit trailer format in F9-US-SEC acceptance criteria Updated the acceptance criteria for F9-US-SEC in FEATURES.md to explicitly mention the and commit trailers, improving clarity for developers. --- docs/FEATURES.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/FEATURES.md b/docs/FEATURES.md index 68068e7e..28cf4d7b 100644 --- a/docs/FEATURES.md +++ b/docs/FEATURES.md @@ -217,7 +217,7 @@ See also: [ADR-0004](./decisions/ADR-0004/DECISION.md). - [ ] Opaque Pointer resolution fails without a valid capability. - [ ] Private blob digest matches the digest in the public pointer. -- [ ] Commit trailers accurately report the number of redactions/pointers. +- [ ] Commit trailers (`Privacy-Redactions`, `Privacy-Pointers`) accurately report the number of redactions/pointers. #### Test Plan From 676b6fa02a194e27f162425510502bc999335d22 Mon Sep 17 00:00:00 2001 From: "J. Kirby Ross" Date: Mon, 10 Nov 2025 05:58:20 -0800 Subject: [PATCH 03/25] docs: Clarify actor-id format in SPEC.md --- docs/SPEC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/SPEC.md b/docs/SPEC.md index b9cc5310..f0c5b079 100644 --- a/docs/SPEC.md +++ b/docs/SPEC.md @@ -150,7 +150,7 @@ The normative layout is as follows: │ ├── journal/ │ ├── state/ │ ├── private/ -│ │ └── / +│ │ └── / # e.g., the actor's ed25519 public key │ ├── mbus/ │ ├── mbus-ack/ │ ├── jobs/ From 7c35665b2d7e06b74e34e2176eadf7639aa6da87 Mon Sep 17 00:00:00 2001 From: "J. 
Kirby Ross" Date: Mon, 10 Nov 2025 06:00:34 -0800 Subject: [PATCH 04/25] docs: Clarify size unit for OpaquePointer in SPEC.md --- docs/SPEC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/SPEC.md b/docs/SPEC.md index f0c5b079..4379fd12 100644 --- a/docs/SPEC.md +++ b/docs/SPEC.md @@ -321,7 +321,7 @@ classDiagram } ``` -- `digest`: The **REQUIRED** `blake3` hash of the raw private data. This ensures the integrity of the private blob. +- `size`: The size of the private blob in bytes. - `location`: A **REQUIRED** URI indicating where the blob can be fetched (e.g., `gatos-node://ed25519:`, `s3://...`). - `capability`: A **REQUIRED** URI defining the auth/authz and decryption mechanism needed to access the blob (e.g., `gatos-key://...`, `kms://...`). From fce8bb1e11d666d479bd21df90b97dc261592461 Mon Sep 17 00:00:00 2001 From: "J. Kirby Ross" Date: Mon, 10 Nov 2025 06:01:30 -0800 Subject: [PATCH 05/25] docs: Clarify error handling for digest mismatch in SPEC.md Pointer Resolution --- docs/SPEC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/SPEC.md b/docs/SPEC.md index 4379fd12..cc49e685 100644 --- a/docs/SPEC.md +++ b/docs/SPEC.md @@ -333,7 +333,7 @@ A client resolving an Opaque Pointer **MUST** perform the following steps: 1. Fetch the private blob from the `location` URI, authenticating if required by the endpoint protocol. 2. Acquire the necessary authorization and/or decryption keys by interacting with the `capability` URI's system. 3. If the blob is encrypted, decrypt it. -4. Verify that the `blake3` hash of the resulting plaintext exactly matches the `digest` in the pointer. The resolution **MUST** fail if the hashes do not match. +4. Verify that the `blake3` hash of the resulting plaintext exactly matches the `digest` in the pointer. If the hashes do not match, the resolution **MUST** fail with a `DigestMismatch` error, and the client **SHOULD** log a security warning, as this may indicate data tampering. 
This process guarantees that even though the data is stored privately, its integrity is verifiable against the public ledger. From 3d6bcbdac9b48caae4d4a05eaa6d6717f85481bc Mon Sep 17 00:00:00 2001 From: "J. Kirby Ross" Date: Mon, 10 Nov 2025 06:02:25 -0800 Subject: [PATCH 06/25] docs: Clarify PrivateStore participant as interface in TECH-SPEC.md diagram --- docs/TECH-SPEC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/TECH-SPEC.md b/docs/TECH-SPEC.md index 0050740c..716ffef0 100644 --- a/docs/TECH-SPEC.md +++ b/docs/TECH-SPEC.md @@ -175,7 +175,7 @@ sequenceDiagram participant gatos-echo participant gatos-policy participant gatos-ledger - participant PrivateStore + participant "PrivateStore (Interface)" as "Storage Backend" Echo->>Echo: 1. Fold event history to produce UnifiedState Echo->>Policy: 2. Request privacy rules for the current context From 935ee36ac4c64181c1837c6287577d878ab628df Mon Sep 17 00:00:00 2001 From: "J. Kirby Ross" Date: Mon, 10 Nov 2025 06:02:55 -0800 Subject: [PATCH 07/25] docs: Add details for Projection Determinism test suite in TECH-SPEC.md --- docs/TECH-SPEC.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/TECH-SPEC.md b/docs/TECH-SPEC.md index 716ffef0..b99236af 100644 --- a/docs/TECH-SPEC.md +++ b/docs/TECH-SPEC.md @@ -265,6 +265,8 @@ graph TD C --> C4(Projection Determinism); ``` +- **Projection Determinism**: Verifies that applying the same privacy policy to the same `UnifiedState` on different platforms (Linux, macOS, Windows) produces a byte-for-byte identical `PublicState` and the same set of private blobs. + --- ## 10. Security From f643aead23916f5c1dabc6423d8c288e3b3be6a2 Mon Sep 17 00:00:00 2001 From: "J. 
Kirby Ross" Date: Mon, 10 Nov 2025 06:03:23 -0800 Subject: [PATCH 08/25] docs: Update F9-US-DEV user story to BDD format in FEATURES.md --- docs/FEATURES.md | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/docs/FEATURES.md b/docs/FEATURES.md index 28cf4d7b..96ef4945 100644 --- a/docs/FEATURES.md +++ b/docs/FEATURES.md @@ -193,11 +193,9 @@ See also: [ADR-0004](./decisions/ADR-0004/DECISION.md). ### F9-US-DEV -| | | -|--|--| -| **As a...** | App Developer | -| **I want..** | to store sensitive data (PII, secrets) in a private store | -| **So that...** | my public, verifiable state does not contain confidential information | +**Given** I am an App Developer +**When** I define a piece of state as sensitive in my policy +**Then** that state should be stored in a private store and replaced with an Opaque Pointer in the public state. #### Acceptance Criteria From b39373cc519fca2cf003be51b44aefc687d6239a Mon Sep 17 00:00:00 2001 From: "J. Kirby Ross" Date: Mon, 10 Nov 2025 06:03:38 -0800 Subject: [PATCH 09/25] docs: Mark gatos-compute as planned in SPEC.md system diagram --- docs/SPEC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/SPEC.md b/docs/SPEC.md index cc49e685..1f86ce4d 100644 --- a/docs/SPEC.md +++ b/docs/SPEC.md @@ -71,7 +71,7 @@ graph TD end subgraph "Job Plane" - Compute("gatos-compute"); + Compute("gatos-compute (planned)"); end subgraph "Ledger Plane" From 412b9b91083f4d3d82d28979876630e61b47ea36 Mon Sep 17 00:00:00 2001 From: "J. Kirby Ross" Date: Mon, 10 Nov 2025 06:03:57 -0800 Subject: [PATCH 10/25] docs: Change PoC envelope storage requirement from SHOULD to MUST in SPEC.md --- docs/SPEC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/SPEC.md b/docs/SPEC.md index 1f86ce4d..0bd47dfa 100644 --- a/docs/SPEC.md +++ b/docs/SPEC.md @@ -675,7 +675,7 @@ Proposal → Approvals (N‑of‑M) → Grant. 
Quorum groups (e.g., `@leads`) MU
 - A sorted list (by `Signer`) of all valid approvals used to reach quorum (by value or `Approval-Id`).
 - The governance rule id (`Policy-Rule`) and effective quorum parameters.

-PoC envelope SHOULD be stored canonically under `refs/gatos/audit/proofs/governance/`; the Grant’s `Proof-Of-Consensus` trailer MUST equal `blake3(envelope_bytes)`.
+PoC envelope MUST be stored canonically under `refs/gatos/audit/proofs/governance/`; the Grant’s `Proof-Of-Consensus` trailer MUST equal `blake3(envelope_bytes)`.

 ### 20.4 Lifecycle States

From c63785ee06c7dd6f46f1c697b3e49d383f62cdd7 Mon Sep 17 00:00:00 2001
From: "J. Kirby Ross"
Date: Mon, 10 Nov 2025 06:04:31 -0800
Subject: [PATCH 11/25] docs: Clarify purpose of gatos-kv crate in TECH-SPEC.md

---
 docs/TECH-SPEC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/TECH-SPEC.md b/docs/TECH-SPEC.md
index b99236af..31a359bc 100644
--- a/docs/TECH-SPEC.md
+++ b/docs/TECH-SPEC.md
@@ -103,7 +103,7 @@ graph TD
 | `gatos-mind` | Asynchronous, commit-backed message bus (pub/sub). |
 | `gatos-echo` | Deterministic state engine for processing events ("folds"). Privacy projection logic. |
 | `gatos-policy` | Deterministic policy engine for executing compiled rules, managing Consensus Governance, and privacy rule evaluation. |
-| `gatos-kv` | Git-backed key-value state cache. |
+| `gatos-kv` | Git-backed key-value state cache, used for materializing and indexing queryable views of folded state. |
 | `gatosd` | Main binary for the CLI, JSONL RPC daemon, and Opaque Pointer resolution endpoint. |
 | `gatos-compute` | Worker that discovers and executes jobs from the Job Plane. |
 | `gatos-wasm-bindings`| WASM bindings for browser and Node.js environments. |

From 1ca5ec6b2a0c70da087e324be116558407f03531 Mon Sep 17 00:00:00 2001
From: "J.
Kirby Ross"
Date: Mon, 10 Nov 2025 06:04:46 -0800
Subject: [PATCH 12/25] docs: Mark chart data as illustrative in TECH-SPEC.md Performance Guidance

---
 docs/TECH-SPEC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/TECH-SPEC.md b/docs/TECH-SPEC.md
index 31a359bc..58947d41 100644
--- a/docs/TECH-SPEC.md
+++ b/docs/TECH-SPEC.md
@@ -310,7 +310,7 @@ Tuning batch size is a trade-off between latency and commit churn.

 ```mermaid
 xychart-beta
-    title "Batch Size Trade-off"
+    title "Batch Size Trade-off (Illustrative)"
     x-axis "Batch Size"
     y-axis "Metric"
     line "Latency" [50, 40, 35, 32, 30]

From b5235c41de09ab328fab0446f5dffeb5c6c728b6 Mon Sep 17 00:00:00 2001
From: "J. Kirby Ross"
Date: Mon, 10 Nov 2025 06:10:23 -0800
Subject: [PATCH 13/25] docs: Refine F9-US-DEV acceptance criteria to be more granular

---
 docs/FEATURES.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/docs/FEATURES.md b/docs/FEATURES.md
index 96ef4945..7bb9c3b6 100644
--- a/docs/FEATURES.md
+++ b/docs/FEATURES.md
@@ -199,9 +199,11 @@ See also: [ADR-0004](./decisions/ADR-0004/DECISION.md).

 #### Acceptance Criteria

-- [ ] Given a `policy.yaml` file with a rule to `pointerize` the path `sensitive.field`, when the state is folded, the resulting public state tree MUST replace the value of `sensitive.field` with a canonical Opaque Pointer.
-- [ ] Given a `policy.yaml` file with a rule to `pointerize` a field, the public state MUST contain a canonical Opaque Pointer at the specified path, with its `digest`, `location`, and `capability` fields correctly populated.
-- [ ] When the Client SDK attempts to resolve an Opaque Pointer using the specified `location` and `capability`, and the client possesses the necessary authorization, the SDK MUST successfully retrieve and decrypt the original private data.
+- [ ] Given a `policy.yaml` with a rule to `pointerize` the path `sensitive.field`, when the state is folded, the public state tree MUST NOT contain the original value of `sensitive.field`.
+- [ ] Given the same scenario, the public state tree MUST contain a canonical Opaque Pointer object at the `sensitive.field` path.
+- [ ] The generated Opaque Pointer's `digest` field MUST match the BLAKE3 hash of the original, private value.
+- [ ] The generated Opaque Pointer's `location` and `capability` fields MUST match the values specified in the `policy.yaml` rule.
+- [ ] When the Client SDK resolves the pointer with correct authorization, the returned data MUST be byte-for-byte identical to the original `sensitive.field` value.

 ### F9-US-SEC

From 8805e1b3b8776a2718950b8dc80b4c645d6319a4 Mon Sep 17 00:00:00 2001
From: "J. Kirby Ross"
Date: Mon, 10 Nov 2025 06:10:44 -0800
Subject: [PATCH 14/25] docs: Simplify PrivateStore participant name in TECH-SPEC.md diagram

---
 docs/TECH-SPEC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/TECH-SPEC.md b/docs/TECH-SPEC.md
index 58947d41..8f927467 100644
--- a/docs/TECH-SPEC.md
+++ b/docs/TECH-SPEC.md
@@ -175,7 +175,7 @@ sequenceDiagram
     participant gatos-echo
     participant gatos-policy
     participant gatos-ledger
-    participant "PrivateStore (Interface)" as "Storage Backend"
+    participant "StorageBackend (Interface)"

     Echo->>Echo: 1. Fold event history to produce UnifiedState
     Echo->>Policy: 2. Request privacy rules for the current context

From cc8d49e0e6f414dd898d728db9b8bbdf74daf73d Mon Sep 17 00:00:00 2001
From: "J. Kirby Ross"
Date: Mon, 10 Nov 2025 06:11:57 -0800
Subject: [PATCH 15/25] docs: Remove redundant acceptance criteria for F9-US-DEV

---
 docs/FEATURES.md | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/docs/FEATURES.md b/docs/FEATURES.md
index 7bb9c3b6..d5dd607d 100644
--- a/docs/FEATURES.md
+++ b/docs/FEATURES.md
@@ -197,14 +197,6 @@ See also: [ADR-0004](./decisions/ADR-0004/DECISION.md).
 **Given** I am an App Developer
 **When** I define a piece of state as sensitive in my policy
 **Then** that state should be stored in a private store and replaced with an Opaque Pointer in the public state.
-#### Acceptance Criteria
-
-- [ ] Given a `policy.yaml` with a rule to `pointerize` the path `sensitive.field`, when the state is folded, the public state tree MUST NOT contain the original value of `sensitive.field`.
-- [ ] Given the same scenario, the public state tree MUST contain a canonical Opaque Pointer object at the `sensitive.field` path.
-- [ ] The generated Opaque Pointer's `digest` field MUST match the BLAKE3 hash of the original, private value.
-- [ ] The generated Opaque Pointer's `location` and `capability` fields MUST match the values specified in the `policy.yaml` rule.
-- [ ] When the Client SDK resolves the pointer with correct authorization, the returned data MUST be byte-for-byte identical to the original `sensitive.field` value.
-
 ### F9-US-SEC

 | | |

From f1e6f8b22a19902e0c6db7001ffa0e22a27a9d76 Mon Sep 17 00:00:00 2001
From: "J. Kirby Ross"
Date: Mon, 10 Nov 2025 06:12:14 -0800
Subject: [PATCH 16/25] docs: Add missing field description for digest in OpaquePointer

---
 docs/SPEC.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/SPEC.md b/docs/SPEC.md
index 0bd47dfa..b9ff073f 100644
--- a/docs/SPEC.md
+++ b/docs/SPEC.md
@@ -321,6 +321,7 @@ classDiagram
     }
 ```

+- `digest`: The **REQUIRED** `blake3` hash of the raw private data. This ensures the integrity of the private blob.
 - `size`: The size of the private blob in bytes.
 - `location`: A **REQUIRED** URI indicating where the blob can be fetched (e.g., `gatos-node://ed25519:`, `s3://...`).
 - `capability`: A **REQUIRED** URI defining the auth/authz and decryption mechanism needed to access the blob (e.g., `gatos-key://...`, `kms://...`).

From 8cc609110e656d5c5ffe32e7909aeaf665edf755 Mon Sep 17 00:00:00 2001
From: "J.
Kirby Ross"
Date: Mon, 10 Nov 2025 06:12:30 -0800
Subject: [PATCH 17/25] docs: Clarify interaction of expiration dates in Consensus Governance

---
 docs/SPEC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/SPEC.md b/docs/SPEC.md
index b9ff073f..8ced39ae 100644
--- a/docs/SPEC.md
+++ b/docs/SPEC.md
@@ -657,7 +657,7 @@ Proposal → Approvals (N‑of‑M) → Grant. Quorum groups (e.g., `@leads`) MU
   Proposal-Id: blake3:
   Approval-Id: blake3:
   Signer: ed25519:
-  Expires-At: # OPTIONAL
+  Expires-At: # OPTIONAL. If present, the approval is only valid until this time. It cannot extend the proposal's expiration.
   ```

 - Grant (at `refs/gatos/grants/…`):

From f8de6f234eaa39151251a595ae18390aec388219 Mon Sep 17 00:00:00 2001
From: "J. Kirby Ross"
Date: Mon, 10 Nov 2025 06:12:45 -0800
Subject: [PATCH 18/25] docs: Specify HTTP GET method for pointer resolution endpoint

---
 docs/TECH-SPEC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/TECH-SPEC.md b/docs/TECH-SPEC.md
index 8f927467..74c0da5b 100644
--- a/docs/TECH-SPEC.md
+++ b/docs/TECH-SPEC.md
@@ -197,7 +197,7 @@ The `PrivateStore` is a pluggable trait, allowing for backends like a local file

 The `gatosd` daemon exposes a secure endpoint for resolving Opaque Pointers.

-- **Endpoint**: `gatosd` will listen for authenticated requests, for example at `/gatos/private/blobs/{digest}`.
+- **Endpoint**: `gatosd` will listen for authenticated `GET` requests at `/gatos/private/blobs/{digest}`.
 - **Authentication**: The client SDK **MUST** send a `Authorization` header containing a JSON Web Signature (JWS) with a detached payload. The JWS payload **MUST** be the BLAKE3 hash of the request body. `gatosd` verifies the signature against the actor's public key.
 - **Authorization**: Upon receiving a valid request, `gatosd` queries `gatos-policy` to determine if the requesting actor has the capability to access the blob identified by `{digest}`.
 - **Response**: If authorized, `gatosd` fetches the (likely encrypted) blob from its configured `PrivateStore` and returns it to the client. The client is then responsible for decryption via the `capability` URI.

From e00b5780f0245b1fa1e03fc94336867274ce5cb3 Mon Sep 17 00:00:00 2001
From: "J. Kirby Ross"
Date: Mon, 10 Nov 2025 06:13:05 -0800
Subject: [PATCH 19/25] docs: Correct type casing in ADR-0004 OpaquePointer diagram

---
 docs/decisions/ADR-0004/DECISION.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/decisions/ADR-0004/DECISION.md b/docs/decisions/ADR-0004/DECISION.md
index 501016e7..de7a356d 100644
--- a/docs/decisions/ADR-0004/DECISION.md
+++ b/docs/decisions/ADR-0004/DECISION.md
@@ -90,12 +90,12 @@ When private data is elided from the `PublicState`, a canonical JSON **Opaque Po
 ```mermaid
 classDiagram
     class OpaquePointer {
-        +String kind: "opaque_pointer"
-        +String algo: "blake3"
-        +String digest: "blake3:"
-        +Number size
-        +String location
-        +String capability
+        +string kind: "opaque_pointer"
+        +string algo: "blake3"
+        +string digest: "blake3:"
+        +number size
+        +string location
+        +string capability
     }
 ```

From e48b7f9ab372a4ed401b3e301d47bcc2a039683b Mon Sep 17 00:00:00 2001
From: "J. Kirby Ross"
Date: Mon, 10 Nov 2025 06:17:48 -0800
Subject: [PATCH 20/25] docs: Further refine F9-US-DEV acceptance criteria with BDD-style granularity

---
 docs/FEATURES.md | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/docs/FEATURES.md b/docs/FEATURES.md
index d5dd607d..a12ca4ae 100644
--- a/docs/FEATURES.md
+++ b/docs/FEATURES.md
@@ -193,9 +193,13 @@ See also: [ADR-0004](./decisions/ADR-0004/DECISION.md).

 ### F9-US-DEV

-**Given** I am an App Developer
-**When** I define a piece of state as sensitive in my policy
-**Then** that state should be stored in a private store and replaced with an Opaque Pointer in the public state.
+#### Acceptance Criteria
+
+- [ ] **Given** a `policy.yaml` with a rule to `pointerize` the path `sensitive.field`, **when** the state is folded, **then** the resulting public state tree MUST NOT contain the original value of `sensitive.field`.
+- [ ] **Given** the same scenario, **when** the state is folded, **then** the public state tree MUST contain a canonical Opaque Pointer object at the `sensitive.field` path.
+- [ ] **Given** a `pointerized` field, **when** the Opaque Pointer is generated, **then** its `digest` field MUST match the BLAKE3 hash of the original, private value.
+- [ ] **Given** a `pointerized` field, **when** the Opaque Pointer is generated, **then** its `location` and `capability` fields MUST match the values specified in the `policy.yaml` rule.
+- [ ] **Given** a valid Opaque Pointer, **when** the Client SDK resolves it with correct authorization, **then** the returned data MUST be byte-for-byte identical to the original private data.

 ### F9-US-SEC

From b8b950458f7483b31621b3bfe01403670dec58c0 Mon Sep 17 00:00:00 2001
From: "J. Kirby Ross"
Date: Mon, 10 Nov 2025 06:18:14 -0800
Subject: [PATCH 21/25] docs: Remove redundant acceptance criteria for F9-US-DEV in FEATURES.md

---
 docs/FEATURES.md | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/docs/FEATURES.md b/docs/FEATURES.md
index a12ca4ae..5706e3ee 100644
--- a/docs/FEATURES.md
+++ b/docs/FEATURES.md
@@ -193,14 +193,6 @@ See also: [ADR-0004](./decisions/ADR-0004/DECISION.md).

 ### F9-US-DEV

-#### Acceptance Criteria
-
-- [ ] **Given** a `policy.yaml` with a rule to `pointerize` the path `sensitive.field`, **when** the state is folded, **then** the resulting public state tree MUST NOT contain the original value of `sensitive.field`.
-- [ ] **Given** the same scenario, **when** the state is folded, **then** the public state tree MUST contain a canonical Opaque Pointer object at the `sensitive.field` path.
-- [ ] **Given** a `pointerized` field, **when** the Opaque Pointer is generated, **then** its `digest` field MUST match the BLAKE3 hash of the original, private value.
-- [ ] **Given** a `pointerized` field, **when** the Opaque Pointer is generated, **then** its `location` and `capability` fields MUST match the values specified in the `policy.yaml` rule.
-- [ ] **Given** a valid Opaque Pointer, **when** the Client SDK resolves it with correct authorization, **then** the returned data MUST be byte-for-byte identical to the original private data.
-
 ### F9-US-SEC

 | | |

From 5dca809b723d9b173127684afa7a490d1963ec18 Mon Sep 17 00:00:00 2001
From: "J. Kirby Ross"
Date: Mon, 10 Nov 2025 06:18:44 -0800
Subject: [PATCH 22/25] docs: Clarify sorting order for approvals in SPEC.md PoC section

---
 docs/SPEC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/SPEC.md b/docs/SPEC.md
index 8ced39ae..3eb8523f 100644
--- a/docs/SPEC.md
+++ b/docs/SPEC.md
@@ -673,7 +673,7 @@ Proposal → Approvals (N‑of‑M) → Grant. Quorum groups (e.g., `@leads`) MU
 `Proof-Of-Consensus` is the BLAKE3 of a canonical JSON envelope containing:

 - The canonical proposal envelope (by value or `Proposal-Id`).
-- A sorted list (by `Signer`) of all valid approvals used to reach quorum (by value or `Approval-Id`).
+- A lexicographically sorted list (by Signer's public key) of all valid approvals used to reach quorum (each by value or `Approval-Id`).
 - The governance rule id (`Policy-Rule`) and effective quorum parameters.

 PoC envelope MUST be stored canonically under `refs/gatos/audit/proofs/governance/`; the Grant’s `Proof-Of-Consensus` trailer MUST equal `blake3(envelope_bytes)`.

From a7b1a215c9381ac558e3e1f01de175bc45216b9c Mon Sep 17 00:00:00 2001
From: "J.
Kirby Ross"
Date: Mon, 10 Nov 2025 06:19:49 -0800
Subject: [PATCH 23/25] docs: Correct inconsistent endpoint URL in ADR-0004

---
 docs/decisions/ADR-0004/DECISION.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/decisions/ADR-0004/DECISION.md b/docs/decisions/ADR-0004/DECISION.md
index de7a356d..3fd6786a 100644
--- a/docs/decisions/ADR-0004/DECISION.md
+++ b/docs/decisions/ADR-0004/DECISION.md
@@ -146,7 +146,7 @@ A client resolving an Opaque Pointer **MUST** follow this protocol:
 2. **Fetch Blob**:
     - If `gatos-node://`, resolve the actor's endpoint from the trust graph.
     - The client **MUST** send an authenticated request to the node (e.g., with a JWT or a signed challenge).
-    - The node's endpoint (e.g., `GET /.well-known/gatos/private/{digest}`) **MUST** verify the client's authorization against its policy before returning the blob.
+    - The node's endpoint (e.g., `GET /gatos/private/blobs/{digest}`) **MUST** verify the client's authorization against its policy before returning the blob.
 3. **Acquire Capability**:
     - Parse the `capability` URI.
     - Interact with the specified system (KMS, key server) to get the decryption key. This step will have its own auth/authz protocol.

From d3636881691c1d29143dfbc627ef3be91bf22739 Mon Sep 17 00:00:00 2001
From: "J.
Kirby Ross"
Date: Mon, 10 Nov 2025 08:04:23 -0800
Subject: [PATCH 24/25] docs(ADR-0004,SPEC,TECH-SPEC): switch resolver to POST + JWT; add optional HTTP Message Signatures profile; tighten determinism (RFC 8785 JCS + key ordering); clarify namespaces and error taxonomy; add pointer rotation trailer\nschema(privacy): add ciphertext_digest + extensions; bump to draft 2020-12\nschema(policy): add optional privacy.classes + rules\nexamples: add privacy_min and updated opaque_pointer example\nscripts: validate privacy schema/examples in CI\nmake: fix tab indentation for lint/fix targets\nchore: add gatos-privacy crate with OpaquePointer type and notes

---
 Cargo.lock | 12 +++
 Cargo.toml | 1 +
 Makefile | 10 +-
 crates/gatos-privacy/Cargo.toml | 13 +++
 crates/gatos-privacy/src/lib.rs | 42 ++++++++
 docs/SPEC.md | 50 +++++++---
 docs/TECH-SPEC.md | 27 ++++-
 docs/decisions/ADR-0003/DECISION.md | 2 +-
 docs/decisions/ADR-0004/DECISION.md | 99 +++++++++++++++----
 examples/v1/policy/privacy_min.json | 21 ++++
 examples/v1/privacy/opaque_pointer_min.json | 9 ++
 .../v1/policy/governance_policy.schema.json | 34 +++++++
 schemas/v1/privacy/opaque_pointer.schema.json | 50 +++-------
 scripts/validate_schemas.sh | 4 +
 14 files changed, 293 insertions(+), 81 deletions(-)
 create mode 100644 crates/gatos-privacy/Cargo.toml
 create mode 100644 crates/gatos-privacy/src/lib.rs
 create mode 100644 examples/v1/policy/privacy_min.json
 create mode 100644 examples/v1/privacy/opaque_pointer_min.json

diff --git a/Cargo.lock b/Cargo.lock
index 6c947707..b310fe4f 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -328,6 +328,18 @@ version = "0.1.0"
 name = "gatos-policy"
 version = "0.1.0"

+[[package]]
+name = "gatos-privacy"
+version = "0.1.0"
+dependencies = [
+ "anyhow",
+ "blake3",
+ "gatos-ledger-core",
+ "hex",
+ "serde",
+ "serde_json",
+]
+
 [[package]]
 name = "gatos-wasm-bindings"
 version = "0.1.0"
diff --git a/Cargo.toml b/Cargo.toml
index 67934114..9279a17a 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -7,6
+7,7 @@ members = [
     "crates/gatos-mind",
     "crates/gatos-echo",
     "crates/gatos-policy",
+    "crates/gatos-privacy",
     "crates/gatos-kv",
     "crates/gatosd",
     "bindings/wasm",
diff --git a/Makefile b/Makefile
index 5259946a..39b8f699 100644
--- a/Makefile
+++ b/Makefile
@@ -12,14 +12,14 @@ diagrams:
 	@bash -lc 'scripts/mermaid/generate_all.sh'

 lint-md:
-    @bash -lc 'if command -v node >/dev/null 2>&1; then \
+	@bash -lc 'if command -v node >/dev/null 2>&1; then \
 	npx -y markdownlint-cli "**/*.md" --config .markdownlint.json; \
 	elif command -v docker >/dev/null 2>&1; then \
 	docker run --rm -v "$$PWD:/work" -w /work node:20 bash -lc "npx -y markdownlint-cli \"**/*.md\" --config .markdownlint.json"; \
 	else echo "Need Node.js or Docker" >&2; exit 1; fi'

 fix-md:
-    @bash -lc 'if command -v node >/dev/null 2>&1; then \
+	@bash -lc 'if command -v node >/dev/null 2>&1; then \
 	npx -y markdownlint-cli "**/*.md" --fix --config .markdownlint.json; \
 	elif command -v docker >/dev/null 2>&1; then \
 	docker run --rm -v "$$PWD:/work" -w /work node:20 bash -lc "npx -y markdownlint-cli \"**/*.md\" --fix --config .markdownlint.json"; \
@@ -44,7 +44,8 @@ schema-compile:
 	npx -y ajv-cli@5 ajv compile --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/governance/grant.schema.json -r schemas/v1/common/ids.schema.json && \
 	npx -y ajv-cli@5 ajv compile --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/governance/revocation.schema.json -r schemas/v1/common/ids.schema.json && \
 	npx -y ajv-cli@5 ajv compile --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/governance/proof_of_consensus_envelope.schema.json -r schemas/v1/common/ids.schema.json && \
-	npx -y ajv-cli@5 ajv compile --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/policy/governance_policy.schema.json'
+	npx -y ajv-cli@5 ajv compile --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/policy/governance_policy.schema.json && \
+	npx -y ajv-cli@5 ajv compile --spec=draft2020 --strict=true -c ajv-formats
-s schemas/v1/privacy/opaque_pointer.schema.json -r schemas/v1/common/ids.schema.json'

 schema-validate:
 	@bash -lc 'set -euo pipefail; \
@@ -57,7 +58,8 @@ schema-validate:
 	npx -y ajv-cli@5 ajv validate --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/governance/grant.schema.json -d examples/v1/governance/grant_min.json -r schemas/v1/common/ids.schema.json && \
 	npx -y ajv-cli@5 ajv validate --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/governance/revocation.schema.json -d examples/v1/governance/revocation_min.json -r schemas/v1/common/ids.schema.json && \
 	npx -y ajv-cli@5 ajv validate --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/governance/proof_of_consensus_envelope.schema.json -d examples/v1/governance/poc_envelope_min.json -r schemas/v1/common/ids.schema.json && \
-	npx -y ajv-cli@5 ajv validate --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/policy/governance_policy.schema.json -d examples/v1/policy/governance_min.json'
+	npx -y ajv-cli@5 ajv validate --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/policy/governance_policy.schema.json -d examples/v1/policy/governance_min.json && \
+	npx -y ajv-cli@5 ajv validate --spec=draft2020 --strict=true -c ajv-formats -s schemas/v1/privacy/opaque_pointer.schema.json -d examples/v1/privacy/opaque_pointer_min.json -r schemas/v1/common/ids.schema.json'

 schema-negative:
 	@bash -lc 'set -euo pipefail; \
diff --git a/crates/gatos-privacy/Cargo.toml b/crates/gatos-privacy/Cargo.toml
new file mode 100644
index 00000000..631ea04c
--- /dev/null
+++ b/crates/gatos-privacy/Cargo.toml
@@ -0,0 +1,13 @@
+[package]
+name = "gatos-privacy"
+version = "0.1.0"
+edition = "2021"
+
+[dependencies]
+gatos-ledger-core = { path = "../gatos-ledger-core" }
+serde = { workspace = true, features = ["derive"] }
+serde_json = { workspace = true }
+blake3 = { workspace = true }
+hex = { workspace = true }
+anyhow = { workspace = true }
+
diff --git a/crates/gatos-privacy/src/lib.rs
b/crates/gatos-privacy/src/lib.rs
new file mode 100644
index 00000000..28e54abb
--- /dev/null
+++ b/crates/gatos-privacy/src/lib.rs
@@ -0,0 +1,42 @@
+//! gatos-privacy — Opaque Pointer types and helpers
+//!
+//! This crate defines the JSON-facing pointer envelope used by the
+//! hybrid privacy model (ADR-0004). The struct mirrors the v1 schema
+//! in `schemas/v1/privacy/opaque_pointer.schema.json`.
+//!
+//! Canonicalization: when computing content IDs or digests, callers
+//! MUST serialize JSON using RFC 8785 JCS. This crate intentionally
+//! does not take a dependency on a specific JCS implementation to
+//! keep the workspace lean; higher layers may provide one.
+
+use serde::{Deserialize, Serialize};
+use serde_json::Value;
+
+#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(deny_unknown_fields)]
+pub struct OpaquePointer {
+    pub kind: Kind,
+    pub algo: Algo,
+    pub digest: String,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub ciphertext_digest: Option,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub size: Option,
+    pub location: String,
+    pub capability: String,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub extensions: Option,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "snake_case")]
+pub enum Kind {
+    OpaquePointer,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "lowercase")]
+pub enum Algo {
+    Blake3,
+}
+
diff --git a/docs/SPEC.md b/docs/SPEC.md
index 3eb8523f..c1f19549 100644
--- a/docs/SPEC.md
+++ b/docs/SPEC.md
@@ -314,27 +314,49 @@ classDiagram
     class OpaquePointer {
         +string kind: "opaque_pointer"
         +string algo: "blake3"
-        +string digest: "blake3:"
-        +number size
+        +string digest: "blake3:" // plaintext digest
+        +string ciphertext_digest: "blake3:" // optional
+        +int size // bytes; SHOULD be present
         +string location
-        +string capability
+        +string capability // MUST NOT embed secrets
+        +object
extensions // forward-compatible
     }
 ```

-- `digest`: The **REQUIRED** `blake3` hash of the raw private data. This ensures the integrity of the private blob.
-- `size`: The size of the private blob in bytes.
-- `location`: A **REQUIRED** URI indicating where the blob can be fetched (e.g., `gatos-node://ed25519:`, `s3://...`).
-- `capability`: A **REQUIRED** URI defining the auth/authz and decryption mechanism needed to access the blob (e.g., `gatos-key://...`, `kms://...`).
+- `digest`: The **REQUIRED** `blake3` hash of the plaintext. For low‑entropy privacy classes, the public pointer MUST NOT expose this value.
+- `ciphertext_digest`: The `blake3` hash of the stored ciphertext. For low‑entropy privacy classes, this field MUST be present in the public pointer.
+- `size`: The size of the private blob in bytes (RECOMMENDED).
+- `location`: A **REQUIRED** stable URI indicating where the blob can be fetched (e.g., `gatos-node://ed25519:`, `s3://bucket/key`). Do not embed pre‑signed tokens.
+- `capability`: A **REQUIRED** reference to the authn/z + decryption mechanism (e.g., `gatos-key://...`, `kms://...`). It MUST NOT embed secrets; resolution occurs at the policy layer.

-The pointer itself is canonicalized and its `content_id` can be computed for verification purposes.
+The pointer itself is canonicalized via RFC 8785 JCS and its `content_id` is `blake3(JCS(pointer_json))`.

 ### 7.3 Pointer Resolution

-A client resolving an Opaque Pointer **MUST** perform the following steps:
-1. Fetch the private blob from the `location` URI, authenticating if required by the endpoint protocol.
-2. Acquire the necessary authorization and/or decryption keys by interacting with the `capability` URI's system.
-3. If the blob is encrypted, decrypt it.
-4. Verify that the `blake3` hash of the resulting plaintext exactly matches the `digest` in the pointer.
If the hashes do not match, the resolution **MUST** fail with a `DigestMismatch` error, and the client **SHOULD** log a security warning, as this may indicate data tampering.
+Endpoint and AuthN:
+- Clients MUST resolve via `POST /gatos/private/blobs/resolve` with body `{ "digest": "blake3:", "want": "plaintext"|"ciphertext" }` and `Authorization: Bearer `.
+- Tokens MUST include standard claims (`sub`, `aud`, `method`, `path`, `exp`, `nbf`); skew tolerance ±300s. 401 for authn failures; 403 for policy denials.
+
+Verification Steps:
+1. Fetch the ciphertext blob from `location` via the node’s resolver endpoint.
+2. Acquire the necessary keys via the `capability` reference (policy-driven; no secrets in the pointer).
+3. Decrypt. Compute `blake3(ciphertext)` and compare with `ciphertext_digest` when present; compute `blake3(plaintext)` and compare with `digest` when exposed. Any mismatch MUST yield `DigestMismatch`.
+4. Servers SHOULD return `X-BLAKE3-Digest` and `Digest: sha-256=…` headers for response integrity.
+
+Error Taxonomy:
+- `Unauthorized` (401), `Forbidden` (403), `NotFound` (404), `DigestMismatch` (409), `CapabilityUnavailable` (503), `PolicyDenied` (403).
+
+Optional HTTP Message Signatures profile (RFC 9421):
+- As an alternative to JWT, clients MAY sign `@method`, `@target-uri`, `date`, `host`, `content-digest` and send `Signature-Input`/`Signature` headers. Servers SHOULD still emit `Digest` and `X-BLAKE3-Digest` response headers.
+
+Pointer Rotation (Rekey):
+1) fetch ciphertext; 2) decrypt; 3) re‑encrypt per new capability; 4) store new ciphertext; 5) emit rotation event updating pointer fields (capability/location). `digest` (plaintext) MUST remain stable. Add trailer `Privacy-Pointer-Rotations: `.
+
+Namespacing:
+- `refs/gatos/private//…` holds private overlay indices/metadata only; workspace mirror is `gatos/private//…`. Blobs live in external stores keyed by digest.
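The token time-window rule and the error taxonomy in the Pointer Resolution text above can be sketched in Rust as follows. This is a minimal illustration under stated assumptions, not code from the patch: `ResolveError`, `status`, `SKEW_SECS`, and `check_time_window` are hypothetical names, and treating the `nbf`/`exp` bounds as inclusive is an assumption, since the spec text only states the ±300s tolerance.

```rust
/// Hypothetical error type mirroring the error taxonomy listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResolveError {
    Unauthorized,
    Forbidden,
    NotFound,
    DigestMismatch,
    CapabilityUnavailable,
    PolicyDenied,
}

impl ResolveError {
    /// HTTP status code for each error, as given in the taxonomy.
    pub fn status(self) -> u16 {
        match self {
            ResolveError::Unauthorized => 401,
            ResolveError::Forbidden | ResolveError::PolicyDenied => 403,
            ResolveError::NotFound => 404,
            ResolveError::DigestMismatch => 409,
            ResolveError::CapabilityUnavailable => 503,
        }
    }
}

/// Clock skew tolerance in seconds (the ±300s from the spec text).
pub const SKEW_SECS: i64 = 300;

/// Checks the `nbf`/`exp` claims (Unix seconds) against `now`, allowing skew.
/// Assumption: a token is accepted when nbf - skew <= now <= exp + skew.
pub fn check_time_window(now: i64, nbf: i64, exp: i64) -> Result<(), ResolveError> {
    let earliest = nbf.saturating_sub(SKEW_SECS);
    let latest = exp.saturating_add(SKEW_SECS);
    if now < earliest || now > latest {
        // Treated as an authentication failure (401), per "401 for authn failures".
        return Err(ResolveError::Unauthorized);
    }
    Ok(())
}
```

Mapping both `Forbidden` and `PolicyDenied` to 403 follows the taxonomy as written; surfacing an out-of-window token as `Unauthorized` is likewise an assumption drawn from the "401 for authn failures" rule.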
+
+Canonicalization:
+- All JSON labeled as canonical MUST use RFC 8785 JCS; non‑JSON maps MUST be ordered lexicographically by lowercase UTF‑8 keys.

 This process guarantees that even though the data is stored privately, its integrity is verifiable against the public ledger.

@@ -673,7 +695,7 @@ Proposal → Approvals (N‑of‑M) → Grant. Quorum groups (e.g., `@leads`) MU
 `Proof-Of-Consensus` is the BLAKE3 of a canonical JSON envelope containing:

 - The canonical proposal envelope (by value or `Proposal-Id`).
-- A lexicographically sorted list (by Signer's public key) of all valid approvals used to reach quorum (each by value or `Approval-Id`).
+- A lexicographically sorted list of approvals ordered by the lowercase ASCII of each approval's `Signer` value (the `ed25519:` string). Each approval is included by value or via `Approval-Id`.
 - The governance rule id (`Policy-Rule`) and effective quorum parameters.

 PoC envelope MUST be stored canonically under `refs/gatos/audit/proofs/governance/`; the Grant’s `Proof-Of-Consensus` trailer MUST equal `blake3(envelope_bytes)`.
diff --git a/docs/TECH-SPEC.md b/docs/TECH-SPEC.md
index 74c0da5b..dfe0599e 100644
--- a/docs/TECH-SPEC.md
+++ b/docs/TECH-SPEC.md
@@ -197,10 +197,29 @@ The `PrivateStore` is a pluggable trait, allowing for backends like a local file

 The `gatosd` daemon exposes a secure endpoint for resolving Opaque Pointers.

-- **Endpoint**: `gatosd` will listen for authenticated `GET` requests at `/gatos/private/blobs/{digest}`.
-- **Authentication**: The client SDK **MUST** send a `Authorization` header containing a JSON Web Signature (JWS) with a detached payload. The JWS payload **MUST** be the BLAKE3 hash of the request body. `gatosd` verifies the signature against the actor's public key.
-- **Authorization**: Upon receiving a valid request, `gatosd` queries `gatos-policy` to determine if the requesting actor has the capability to access the blob identified by `{digest}`.
-- **Response**: If authorized, `gatosd` fetches the (likely encrypted) blob from its configured `PrivateStore` and returns it to the client. The client is then responsible for decryption via the `capability` URI.
+- Endpoint: `POST /gatos/private/blobs/resolve`
+- Content-Type: `application/json`
+- Request body (JCS canonical JSON):
+  ```json
+  { "digest": "blake3:", "want": "plaintext" }
+  ```
+  - `want` OPTIONAL: `"plaintext" | "ciphertext"` (default `"plaintext"`).
+- Authentication: `Authorization: Bearer `
+  - Claims (example): `iss`, `sub` (ed25519:), `aud` ("gatos-node:"), `exp`, `nbf`, `jti`, `method` ("POST"), `path` ("/gatos/private/blobs/resolve"), `digest` (MUST match body.digest).
+  - Clock skew tolerance: ±300 seconds.
+- Authorization: Node evaluates policy for `` on ``.
+- Response (200 OK):
+  - Headers: `Digest: sha-256=`, `X-BLAKE3-Digest: blake3:`
+  - Body: requested bytes (ciphertext or plaintext).
+
+Errors: 401 Unauthorized, 403 Forbidden, 404 Not Found, 409 DigestMismatch, 503 CapabilityUnavailable.
+
+Optional profile (HTTP Message Signatures, RFC 9421):
+- Clients MAY authenticate by signing components: `@method`, `@target-uri`, `date`, `host`, `content-digest` (SHA-256 over request body) and sending `Signature-Input: sig1=...` and `Signature: sig1=::`.
+- Servers STILL apply policy and SHOULD return `Digest` and `X-BLAKE3-Digest` headers.
+
+Pointer Rotation (Rekey):
+- Implement a rotation that: (1) fetches; (2) decrypts; (3) re‑encrypts; (4) stores; (5) emits an audit event updating pointer fields while keeping plaintext `digest` stable. Add trailer `Privacy-Pointer-Rotations: ` when a projection commit includes rotations.

 ---

diff --git a/docs/decisions/ADR-0003/DECISION.md b/docs/decisions/ADR-0003/DECISION.md
index 45f29beb..d33c69a5 100644
--- a/docs/decisions/ADR-0003/DECISION.md
+++ b/docs/decisions/ADR-0003/DECISION.md
@@ -86,7 +86,7 @@ Define a system for gating specific GATOS actions (e.g., locking a file, publish
 7.
Proof‑Of‑Consensus (normative) - The `Proof-Of-Consensus` digest MUST be the BLAKE3 of a canonical envelope that includes (see schema: [`schemas/v1/governance/proof_of_consensus_envelope.schema.json`](../../../schemas/v1/governance/proof_of_consensus_envelope.schema.json)): - The canonical proposal envelope (by value or by `Proposal-Id`). - - A sorted list (by `Signer`) of all valid approvals used to reach quorum (each by value or `Approval-Id`). + - A lexicographically sorted list of approvals by the lowercase ASCII of each approval's `Signer` value (the `ed25519:` string). Each approval is included by value or via `Approval-Id`. - The governance rule id (`Policy-Rule`) and effective quorum parameters. - Implementations MUST use canonical JSON (UTF‑8, sorted keys, no insignificant whitespace) to build this envelope before hashing. All hex encodings MUST be lowercase. Ordering by signer is an application‑level MUST; JSON Schema cannot enforce sort order. - Storage: The canonical PoC envelope JSON SHOULD be persisted as a blob referenced under `refs/gatos/audit/proofs/governance/`; the `Proof-Of-Consensus` trailer MUST equal `blake3(envelope_bytes)`. diff --git a/docs/decisions/ADR-0004/DECISION.md b/docs/decisions/ADR-0004/DECISION.md index 3fd6786a..13804d5e 100644 --- a/docs/decisions/ADR-0004/DECISION.md +++ b/docs/decisions/ADR-0004/DECISION.md @@ -92,23 +92,25 @@ classDiagram class OpaquePointer { +string kind: "opaque_pointer" +string algo: "blake3" - +string digest: "blake3:" - +number size + +string digest: "blake3:" // plaintext digest + +string ciphertext_digest "blake3:" // MAY be present + +int size // SHOULD be present (bytes) +string location - +string capability + +string capability // MUST NOT embed secrets + +object extensions // forward-compatible } ``` -- **`digest`**: The content-address of the private blob (`blake3(private_bytes)`). This is the immutable link between the public and private worlds. 
+- **`digest`**: The content-address of the private plaintext (`blake3(plaintext_bytes)`). This is the immutable link between the public and private worlds. +- **`ciphertext_digest`**: The content-address of the stored ciphertext (`blake3(ciphertext_bytes)`). For low‑entropy privacy classes (see Policy Hooks), the public pointer **MUST** include `ciphertext_digest` and policy **MUST NOT** expose the plaintext digest publicly. - **`location`**: A URI indicating where to resolve the blob. Supported schemes include: - `gatos-node://ed25519:`: Resolve via the GATOS trust graph. - `https://...`, `s3://...`, `ipfs://...`: Standard distributed storage. - `file:///...`: For local development and testing. -- **`capability`**: A URI defining the authorization and decryption mechanism required to access the blob. - - `gatos-key://v1/aes-256-gcm/`: A symmetric key managed by a GATOS-aware key service. - - `kms://...`, `age://...`, `sops://...`: Integration with standard secret management tools. +- **`capability`**: A reference identifying the authorization and decryption mechanism required to access the blob. It **MUST NOT** embed secrets or pre‑signed tokens. It SHOULD be a stable identifier (e.g., `gatos-key://v1/aes-256-gcm/` or `kms://...`) that can be resolved privately at the policy layer. + - Pointers MAY publish a non‑sensitive label and keep resolver details private via policy. Implementations MAY also place auxiliary hints inside `extensions`. -The canonical `content_id` of the pointer itself is `blake3(canonical_json_bytes)`. +The canonical `content_id` of the pointer itself is `blake3(JCS(pointer_json))`, where `JCS(…)` denotes RFC 8785 JSON Canonicalization Scheme applied to UTF‑8 bytes. This rule is normative for all canonical JSON in GATOS (pointers, governance envelopes, any JSON state snapshots). **Schema:** `schemas/v1/privacy/opaque_pointer.schema.json` @@ -123,6 +125,10 @@ The State Engine (`gatos-echo`) is responsible for executing the projection. 
- `pointerize`: The field's value is stored as a private blob, and an Opaque Pointer is substituted in the public state. 4. The resulting `PublicState` is committed to the public refs, and the `Private Blobs` are persisted to their specified `location`. +Determinism Requirements: +- All JSON artifacts produced during projection (including Opaque Pointers) MUST be canonicalized with RFC 8785 JCS prior to hashing. +- When non‑JSON maps are materialized (e.g., Git tree entries), keys MUST be ordered lexicographically by their lowercase UTF‑8 bytes. + ```mermaid sequenceDiagram participant E as State Engine (gatos-echo) @@ -140,20 +146,38 @@ sequenceDiagram ### 4. Pointer Resolution Protocol (Normative) +Authentication semantics are aligned with HTTP. We adopt a simple, interoperable model (JWT default; HTTP Message Signatures optional): + +- **Endpoint**: `POST /gatos/private/blobs/resolve` +- **Request Body (application/json; JCS canonical form)**: + `{ "digest": "blake3:", "want": "plaintext"|"ciphertext" }` +- **Authorization**: `Authorization: Bearer ` + - Claims MUST include: `sub` (ed25519:), `aud` (node id or URL), `method` ("POST"), `path` ("/gatos/private/blobs/resolve"), `exp`, and `nbf`. + - Clock skew tolerance: ±300 seconds. + - On missing/invalid token: `401 Unauthorized`. On policy denial: `403 Forbidden`. + A client resolving an Opaque Pointer **MUST** follow this protocol: -1. **Parse Pointer**: Extract `digest`, `location`, and `capability`. +1. **Parse Pointer**: Extract `digest`, optional `ciphertext_digest`, `location`, and `capability`. 2. **Fetch Blob**: - - If `gatos-node://`, resolve the actor's endpoint from the trust graph. - - The client **MUST** send an authenticated request to the node (e.g., with a JWT or a signed challenge). - - The node's endpoint (e.g., `GET /gatos/private/blobs/{digest}`) **MUST** verify the client's authorization against its policy before returning the blob. 
+ - If `gatos-node://`, resolve the actor's endpoint from the trust graph, then `POST /gatos/private/blobs/resolve` with the body above. + - The node **MUST** verify the bearer token and enforce policy before returning the blob. 3. **Acquire Capability**: - - Parse the `capability` URI. - - Interact with the specified system (KMS, key server) to get the decryption key. This step will have its own auth/authz protocol. + - Resolve the `capability` reference via the configured key system (KMS, key server). Secrets MUST NOT be embedded in the pointer. 4. **Decrypt and Verify**: - - Decrypt the fetched blob using the key. - - Compute `blake3(decrypted_bytes)`. - - The operation **MUST FAIL** if the computed hash does not exactly match the `digest` in the pointer. + - Decrypt the fetched blob using the resolved key and AAD parameters (see Security Notes). + - Compute `blake3(plaintext)` and compare to `digest` if published; compute `blake3(ciphertext)` and compare to `ciphertext_digest` if published. A mismatch **MUST** produce `DigestMismatch`. + +Response headers on success: +``` +Content-Type: application/octet-stream +X-BLAKE3-Digest: blake3: +Digest: sha-256= +``` + +Optional HTTP Message Signatures profile (RFC 9421): +- Clients MAY authenticate by signing `@method`, `@target-uri`, `date`, `host`, `content-digest` (SHA‑256 of the JSON body) and sending `Signature-Input` and `Signature` headers. +- Servers SHOULD still return `Digest` and `X-BLAKE3-Digest` headers for response integrity. ```mermaid sequenceDiagram @@ -162,7 +186,7 @@ sequenceDiagram participant KMS as Key Management Service C->>C: 1. Read OpaquePointer - C->>PN: 2. GET /private/{digest} (Authenticated) + C->>PN: 2. POST /gatos/private/blobs/resolve (Authorization: Bearer ) PN->>PN: 3. Check policy (is C allowed?) alt Authorized PN-->>C: 4. Return encrypted blob @@ -171,7 +195,7 @@ sequenceDiagram C->>C: 7. Decrypt blob C->>C: 8. Verify blake3(decrypted) == digest else Unauthorized - PN-->>C: 4. 
Return 403 Forbidden + PN-->>C: 4. Return 401/403 end ``` @@ -181,9 +205,15 @@ The privacy policy is defined in `.gatos/policy.yaml` and extends the policy eng ```yaml privacy: + classes: + pii_low_entropy: + min_entropy_bits: 40 + publish_plaintext_digest: false + require_ciphertext_digest: true rules: - select: "path.to.sensitive.data" action: "pointerize" + class: "pii_low_entropy" capability: "gatos-key://v1/aes-256-gcm/ops-key-01" location: "gatos-node://ed25519:" - select: "path.to.transient.data" @@ -199,6 +229,7 @@ To make privacy operations transparent and auditable, any commit that creates a ``` Privacy-Redactions: 3 Privacy-Pointers: 12 +Privacy-Pointer-Rotations: 1 ``` This provides a simple, top-level indicator that a projection has occurred, prompting auditors to look deeper if necessary. @@ -223,3 +254,33 @@ This provides a simple, top-level indicator that a projection has occurred, prom - **Large Artifact Management**: Handle large binaries (ML models, videos) without bloating the Git repository. - **Compliant Data Sharing**: Share a public, redacted dataset with third parties while retaining private access to the full, unified view. - **Federated Learning**: Different actors can hold private models locally, referenced by pointers in a public "training plan" shape. + +--- + +## Namespacing and Storage (Normative) + +- Private overlays are actor‑anchored: `refs/gatos/private///` index metadata. The local workspace mirror is `gatos/private///`. +- Private blobs themselves are NOT stored under Git refs. They live in pluggable blob stores and are addressed by their `ciphertext_digest`/`digest`. + +## Security & Privacy Notes (Normative) + +- Capability references in pointers MUST NOT contain secrets or pre‑signed tokens. Use stable identifiers and resolve sensitive data via policy. 
+- AES‑256‑GCM (if used) MUST include AAD composed of: actor id, pointer `content_id`, and policy version; nonces MUST be 96‑bit, randomly generated, and never reused per key. +- Right‑to‑be‑forgotten: deleting private blobs breaks pointer resolution but does not remove the public pointer. Implement erasure as a tombstone event plus an audit record. + +### Algorithm variants (experimental; private attestations only) + +- Implementations MAY use a keyed BLAKE3 variant for private attestation envelopes (not for public Opaque Pointers): `algo = "blake3-keyed"` with parameters encoded in an envelope or pointer `extensions` field. +- Recommended KDF: `hkdf-sha256`; context string `"gatos:ptr:priv:"`; derive `key = HKDF(policy_key, salt = actor_pubkey, info = context)`. +- Public pointers MUST continue to use `algo = "blake3"` for third‑party verifiability. + +## Error Taxonomy (Normative) + +Implementations SHOULD use a stable set of error codes with JSON problem details: + +- `Unauthorized` (401) +- `Forbidden` (403) +- `NotFound` (404) +- `DigestMismatch` (422) +- `CapabilityUnavailable` (503) +- `PolicyDenied` (403) diff --git a/examples/v1/policy/privacy_min.json b/examples/v1/policy/privacy_min.json new file mode 100644 index 00000000..c510c3b9 --- /dev/null +++ b/examples/v1/policy/privacy_min.json @@ -0,0 +1,21 @@ +{ + "privacy": { + "classes": { + "pii_low_entropy": { + "min_entropy_bits": 40, + "publish_plaintext_digest": false, + "require_ciphertext_digest": true + } + }, + "rules": [ + { + "select": "user.email", + "action": "pointerize", + "class": "pii_low_entropy", + "capability": "gatos-key://v1/aes-256-gcm/ops-key-01", + "location": "gatos-node://ed25519:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" + } + ] + } +} + diff --git a/examples/v1/privacy/opaque_pointer_min.json b/examples/v1/privacy/opaque_pointer_min.json new file mode 100644 index 00000000..eb3cf057 --- /dev/null +++ b/examples/v1/privacy/opaque_pointer_min.json @@ -0,0 +1,9 @@ +{ + 
"kind": "opaque_pointer", + "algo": "blake3", + "digest": "blake3:0000000000000000000000000000000000000000000000000000000000000000", + "ciphertext_digest": "blake3:1111111111111111111111111111111111111111111111111111111111111111", + "size": 0, + "location": "gatos-node://ed25519:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA", + "capability": "gatos-key://v1/aes-256-gcm/test-key-01" +} diff --git a/schemas/v1/policy/governance_policy.schema.json b/schemas/v1/policy/governance_policy.schema.json index 5c738b7d..42ff6baa 100644 --- a/schemas/v1/policy/governance_policy.schema.json +++ b/schemas/v1/policy/governance_policy.schema.json @@ -46,5 +46,39 @@ } } } + , + "privacy": { + "type": "object", + "additionalProperties": false, + "properties": { + "classes": { + "type": "object", + "additionalProperties": { + "type": "object", + "additionalProperties": false, + "properties": { + "min_entropy_bits": { "type": "integer", "minimum": 0 }, + "publish_plaintext_digest": { "type": "boolean" }, + "require_ciphertext_digest": { "type": "boolean" } + } + } + }, + "rules": { + "type": "array", + "items": { + "type": "object", + "additionalProperties": false, + "properties": { + "select": { "type": "string" }, + "action": { "type": "string", "enum": ["redact", "pointerize"] }, + "class": { "type": "string" }, + "capability": { "type": "string", "format": "uri" }, + "location": { "type": "string", "format": "uri" } + }, + "required": ["select", "action"] + } + } + } + } } } diff --git a/schemas/v1/privacy/opaque_pointer.schema.json b/schemas/v1/privacy/opaque_pointer.schema.json index 78a4d61b..bb8d3475 100644 --- a/schemas/v1/privacy/opaque_pointer.schema.json +++ b/schemas/v1/privacy/opaque_pointer.schema.json @@ -1,46 +1,18 @@ { - "$schema": "http://json-schema.org/draft-07/schema#", + "$schema": "https://json-schema.org/draft/2020-12/schema", "title": "GATOS Opaque Pointer", - "description": "A canonical pointer to a private data blob, used to replace sensitive or large 
data in a public state projection.", + "description": "Canonical pointer to a private blob used in public projections.", "type": "object", "properties": { - "kind": { - "description": "The object kind, MUST be 'opaque_pointer'.", - "type": "string", - "const": "opaque_pointer" - }, - "algo": { - "description": "The hashing algorithm used for the digest, MUST be 'blake3'.", - "type": "string", - "const": "blake3" - }, - "digest": { - "description": "The content-address of the private blob, prefixed with the algorithm.", - "type": "string", - "pattern": "^blake3:[a-f0-9]{64}$" - }, - "size": { - "description": "Optional: The size of the private blob in bytes.", - "type": "integer", - "minimum": 0 - }, - "location": { - "description": "A URI indicating where the private blob can be resolved.", - "type": "string", - "format": "uri" - }, - "capability": { - "description": "A URI defining the authorization and/or decryption mechanism for the blob.", - "type": "string", - "format": "uri" - } + "kind": { "type": "string", "const": "opaque_pointer" }, + "algo": { "type": "string", "const": "blake3" }, + "digest": { "type": "string", "pattern": "^blake3:[a-f0-9]{64}$" }, + "ciphertext_digest": { "type": "string", "pattern": "^blake3:[a-f0-9]{64}$" }, + "size": { "type": "integer", "minimum": 0 }, + "location": { "type": "string", "format": "uri" }, + "capability": { "type": "string", "format": "uri" }, + "extensions": { "type": "object" } }, - "required": [ - "kind", - "algo", - "digest", - "location", - "capability" - ], + "required": ["kind","algo","digest","location","capability"], "additionalProperties": false } diff --git a/scripts/validate_schemas.sh b/scripts/validate_schemas.sh index 6f0233d1..3771b34b 100755 --- a/scripts/validate_schemas.sh +++ b/scripts/validate_schemas.sh @@ -18,6 +18,7 @@ SCHEMAS=( "schemas/v1/governance/revocation.schema.json" "schemas/v1/governance/proof_of_consensus_envelope.schema.json" "schemas/v1/policy/governance_policy.schema.json" + 
"schemas/v1/privacy/opaque_pointer.schema.json" ) for schema in "${SCHEMAS[@]}"; do @@ -38,6 +39,7 @@ declare -A EXAMPLES=( ["schemas/v1/governance/grant.schema.json"]="examples/v1/governance/grant_min.json" ["schemas/v1/governance/revocation.schema.json"]="examples/v1/governance/revocation_min.json" ["schemas/v1/governance/proof_of_consensus_envelope.schema.json"]="examples/v1/governance/poc_envelope_min.json" + ["schemas/v1/privacy/opaque_pointer.schema.json"]="examples/v1/privacy/opaque_pointer_min.json" ) for schema in "${!EXAMPLES[@]}"; do @@ -52,6 +54,8 @@ done echo " - ajv validate: examples/v1/policy/governance_min.json against schemas/v1/policy/governance_policy.schema.json" ajv validate "${AJV_BASE_ARGS[@]}" -s schemas/v1/policy/governance_policy.schema.json -d examples/v1/policy/governance_min.json +echo " - ajv validate: examples/v1/policy/privacy_min.json against schemas/v1/policy/governance_policy.schema.json" +ajv validate "${AJV_BASE_ARGS[@]}" -s schemas/v1/policy/governance_policy.schema.json -d examples/v1/policy/privacy_min.json echo "[schemas] Additional encoding tests (ed25519 base64url forms)…" # Root schemas that reference defs using the canonical $id for proper resolution From 4bbbccb430cb39c4e254d51f3157399192257c03 Mon Sep 17 00:00:00 2001 From: "J. 
Kirby Ross" Date: Mon, 10 Nov 2025 08:30:30 -0800 Subject: [PATCH 25/25] docs: harmonize DigestMismatch to 422 across SPEC/TECH-SPEC; ADR response digest headers reflect response body\nschema(privacy): allow ciphertext-only pointers via anyOf; keep kind/algo/location/capability required\ndocs(ADR-0004): fix colon in class diagram;\ndocs(ADR-0003): PoC storage level to MUST --- docs/SPEC.md | 2 +- docs/TECH-SPEC.md | 2 +- docs/decisions/ADR-0003/DECISION.md | 2 +- docs/decisions/ADR-0004/DECISION.md | 6 +++--- schemas/v1/privacy/opaque_pointer.schema.json | 16 +++++++++++++++- 5 files changed, 21 insertions(+), 7 deletions(-) diff --git a/docs/SPEC.md b/docs/SPEC.md index c1f19549..728c4544 100644 --- a/docs/SPEC.md +++ b/docs/SPEC.md @@ -344,7 +344,7 @@ Verification Steps: 4. Servers SHOULD return `X-BLAKE3-Digest` and `Digest: sha-256=…` headers for response integrity. Error Taxonomy: -- `Unauthorized` (401), `Forbidden` (403), `NotFound` (404), `DigestMismatch` (409), `CapabilityUnavailable` (503), `PolicyDenied` (403). +- `Unauthorized` (401), `Forbidden` (403), `NotFound` (404), `DigestMismatch` (422), `CapabilityUnavailable` (503), `PolicyDenied` (403). Optional HTTP Message Signatures profile (RFC 9421): - As an alternative to JWT, clients MAY sign `@method`, `@target-uri`, `date`, `host`, `content-digest` and send `Signature-Input`/`Signature` headers. Servers SHOULD still emit `Digest` and `X-BLAKE3-Digest` response headers. diff --git a/docs/TECH-SPEC.md b/docs/TECH-SPEC.md index dfe0599e..ecddb6b9 100644 --- a/docs/TECH-SPEC.md +++ b/docs/TECH-SPEC.md @@ -212,7 +212,7 @@ The `gatosd` daemon exposes a secure endpoint for resolving Opaque Pointers. - Headers: `Digest: sha-256=`, `X-BLAKE3-Digest: blake3:` - Body: requested bytes (ciphertext or plaintext). -Errors: 401 Unauthorized, 403 Forbidden, 404 Not Found, 409 DigestMismatch, 503 CapabilityUnavailable. 
+Errors: 401 Unauthorized, 403 Forbidden, 404 Not Found, 422 DigestMismatch, 503 CapabilityUnavailable. Optional profile (HTTP Message Signatures, RFC 9421): - Clients MAY authenticate by signing components: `@method`, `@target-uri`, `date`, `host`, `content-digest` (SHA-256 over request body) and sending `Signature-Input: sig1=...` and `Signature: sig1=::`. diff --git a/docs/decisions/ADR-0003/DECISION.md b/docs/decisions/ADR-0003/DECISION.md index d33c69a5..3e4683ee 100644 --- a/docs/decisions/ADR-0003/DECISION.md +++ b/docs/decisions/ADR-0003/DECISION.md @@ -89,7 +89,7 @@ Define a system for gating specific GATOS actions (e.g., locking a file, publish - A lexicographically sorted list of approvals by the lowercase ASCII of each approval's `Signer` value (the `ed25519:` string). Each approval is included by value or via `Approval-Id`. - The governance rule id (`Policy-Rule`) and effective quorum parameters. - Implementations MUST use canonical JSON (UTF‑8, sorted keys, no insignificant whitespace) to build this envelope before hashing. All hex encodings MUST be lowercase. Ordering by signer is an application‑level MUST; JSON Schema cannot enforce sort order. - - Storage: The canonical PoC envelope JSON SHOULD be persisted as a blob referenced under `refs/gatos/audit/proofs/governance/`; the `Proof-Of-Consensus` trailer MUST equal `blake3(envelope_bytes)`. + - Storage: The canonical PoC envelope JSON MUST be persisted as a blob referenced under `refs/gatos/audit/proofs/governance/`; the `Proof-Of-Consensus` trailer MUST equal `blake3(envelope_bytes)`. 8. 
Governance schema (policy integration) - Extend `.gatos/policy.yaml` to declare governance rules (JSON Schema: [`schemas/v1/policy/governance_policy.schema.json`](../../../schemas/v1/policy/governance_policy.schema.json)): diff --git a/docs/decisions/ADR-0004/DECISION.md b/docs/decisions/ADR-0004/DECISION.md index 13804d5e..151b5275 100644 --- a/docs/decisions/ADR-0004/DECISION.md +++ b/docs/decisions/ADR-0004/DECISION.md @@ -93,7 +93,7 @@ classDiagram +string kind: "opaque_pointer" +string algo: "blake3" +string digest: "blake3:" // plaintext digest - +string ciphertext_digest "blake3:" // MAY be present + +string ciphertext_digest: "blake3:" // MAY be present +int size // SHOULD be present (bytes) +string location +string capability // MUST NOT embed secrets @@ -171,8 +171,8 @@ A client resolving an Opaque Pointer **MUST** follow this protocol: Response headers on success: ``` Content-Type: application/octet-stream -X-BLAKE3-Digest: blake3: -Digest: sha-256= +X-BLAKE3-Digest: blake3: +Digest: sha-256= ``` Optional HTTP Message Signatures profile (RFC 9421): diff --git a/schemas/v1/privacy/opaque_pointer.schema.json b/schemas/v1/privacy/opaque_pointer.schema.json index bb8d3475..9e899fc9 100644 --- a/schemas/v1/privacy/opaque_pointer.schema.json +++ b/schemas/v1/privacy/opaque_pointer.schema.json @@ -13,6 +13,20 @@ "capability": { "type": "string", "format": "uri" }, "extensions": { "type": "object" } }, - "required": ["kind","algo","digest","location","capability"], + "required": ["kind","algo","location","capability"], + "anyOf": [ + { + "required": ["digest"], + "properties": { + "digest": { "type": "string", "pattern": "^blake3:[a-f0-9]{64}$" } + } + }, + { + "required": ["ciphertext_digest"], + "properties": { + "ciphertext_digest": { "type": "string", "pattern": "^blake3:[a-f0-9]{64}$" } + } + } + ], "additionalProperties": false }
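
Review note: the final `opaque_pointer.schema.json` hunk above makes `digest` optional as long as `ciphertext_digest` is present. The following is a minimal sketch of that rule in plain Python, intended only to illustrate what the `required` + `anyOf` combination accepts and rejects. The function name `check_pointer` and the wording of the messages are hypothetical and not part of the patch. The sketch assumes `format: uri` is treated as an annotation and is not enforced.

```python
import re

# Same character class and length as the schema pattern ^blake3:[a-f0-9]{64}$
DIGEST_RE = re.compile(r"blake3:[a-f0-9]{64}")

# Property names taken from the schema; additionalProperties is false.
ALLOWED_KEYS = {
    "kind", "algo", "digest", "ciphertext_digest",
    "size", "location", "capability", "extensions",
}
REQUIRED_KEYS = ("kind", "algo", "location", "capability")


def check_pointer(pointer: dict) -> list:
    """Return a list of problems; an empty list means the pointer passes.

    Hypothetical helper that mirrors the schema's structural rules only.
    """
    problems = []

    for key in REQUIRED_KEYS:
        if key not in pointer:
            problems.append(f"missing required field: {key}")

    for key in sorted(set(pointer) - ALLOWED_KEYS):
        problems.append(f"unexpected field: {key}")

    if "kind" in pointer and pointer["kind"] != "opaque_pointer":
        problems.append("kind must be 'opaque_pointer'")
    if "algo" in pointer and pointer["algo"] != "blake3":
        problems.append("algo must be 'blake3'")

    # anyOf: at least one of the two digests has to be published.
    if "digest" not in pointer and "ciphertext_digest" not in pointer:
        problems.append("need digest or ciphertext_digest")

    # Whichever digest is present must match the pattern.
    for key in ("digest", "ciphertext_digest"):
        if key in pointer:
            value = pointer[key]
            if not (isinstance(value, str) and DIGEST_RE.fullmatch(value)):
                problems.append(f"{key} must look like blake3:<64 lowercase hex>")

    for key in ("location", "capability"):
        if key in pointer and not isinstance(pointer[key], str):
            problems.append(f"{key} must be a string")

    if "size" in pointer:
        size = pointer["size"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(size, bool) or not isinstance(size, int) or size < 0:
            problems.append("size must be a non-negative integer")

    if "extensions" in pointer and not isinstance(pointer["extensions"], dict):
        problems.append("extensions must be an object")

    return problems
```

Under these assumptions, a pointer that carries only `ciphertext_digest` passes, while a pointer with neither digest is rejected. In CI, the `ajv validate` calls in `scripts/validate_schemas.sh` remain the authoritative check.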