| title | KCP Protocol Specification |
|---|---|
| description | Full specification of the Knowledge Context Protocol (KCP): Layer 8 for persistent governance, discovery, and lineage tracking of AI-generated knowledge. |
| tags | |
| version | 0.2 |
| status | draft |
| updated | 2026-03-22 |
Status: Draft
Date: March 2026
Author: Thiago Silva (contato@kcp-protocol.org)
This document specifies the Knowledge Context Protocol (KCP), an application-layer protocol for persistent governance, discovery, and lineage tracking of knowledge outputs generated by AI agents. KCP introduces a new layer in the network stack (Layer 8: Context & Knowledge) that sits above the OSI Application Layer (Layer 7).
The proliferation of AI agents (LLMs, code assistants, analytics tools) has created an explosion of generated knowledge (reports, analyses, insights). However, current infrastructure treats these outputs as ephemeral data:
- Knowledge disappears when sessions end
- No mechanism for discovery ("has this been analyzed before?")
- No lineage tracking (source → analysis → decision)
- No multi-tenant governance (who can see what, based on business context)
KCP addresses these gaps by defining:
- A standard payload format for knowledge artifacts
- A protocol for publishing, discovering, and retrieving knowledge
- A multi-tenant governance model
- A lineage tracking mechanism
- A federated P2P storage architecture
- Knowledge Artifact: Any output generated by an AI agent (report, analysis, visualization, code, etc.)
- Tenant: An organization or isolated context (e.g., company, open-source project)
- Team: A subgroup within a tenant (e.g., engineering team, data science team)
- Lineage: The chain from data sources → queries → insights → decisions
- Visibility Tier: Access control level (public, org, team, private)
┌─────────────────────────────────────────┐
│ Layer 8: Context & Knowledge (KCP) │ ← New Layer
├─────────────────────────────────────────┤
│ Layer 7: Application (HTTP, etc.) │
├─────────────────────────────────────────┤
│ Layers 1-6: Traditional Stack │
└─────────────────────────────────────────┘
| Operation | Method | Endpoint | Description |
|---|---|---|---|
| Publish | POST | /kcp/v1/artifacts | Submit a knowledge artifact |
| Search | GET | /kcp/v1/artifacts?q=... | Search by keywords/tags |
| Retrieve | GET | /kcp/v1/artifacts/{id} | Get artifact metadata |
| Download | GET | /kcp/v1/artifacts/{id}/content | Get artifact content |
| Delete | DELETE | /kcp/v1/artifacts/{id} | Soft-delete artifact |
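As a rough illustration of the endpoint table, the sketch below builds (but does not send) a Publish request and the Retrieve/Download URLs using only the Python standard library. The base URL `https://kcp.example.org` and the `Content-Type: application/json` header are assumptions for the example, not part of the specification.

```python
import json
import urllib.request

# Hypothetical server address, used only for illustration.
BASE_URL = "https://kcp.example.org"


def build_publish_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build, without sending, a POST request for the Publish operation."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/kcp/v1/artifacts",
        data=body,
        method="POST",
        # Assumption: payloads are sent as JSON.
        headers={"Content-Type": "application/json"},
    )


def artifact_url(artifact_id: str, content: bool = False, base_url: str = BASE_URL) -> str:
    """Return the Retrieve URL, or the Download URL when content=True."""
    suffix = "/content" if content else ""
    return f"{base_url}/kcp/v1/artifacts/{artifact_id}{suffix}"


request = build_publish_request({"title": "Weekly latency report"})
print(request.get_method(), request.full_url)
print(artifact_url("abc123", content=True))
```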
{
"id": "uuid-v4",
"version": "1",
"user_id": "string",
"tenant_id": "string",
"team": "string (optional)",
"tags": ["string"],
"source": "string (agent identifier)",
"timestamp": "ISO 8601 datetime",
"format": "html | json | markdown | pdf | png",
"visibility": "public | org | team | private",
"title": "string",
"summary": "string (max 500 chars)",
"lineage": {
"query": "string (human-readable description)",
"data_sources": ["uri"],
"agent": "string",
"parent_reports": ["uuid"]
},
"content_url": "uri (ipfs:// | https:// | file://)",
"content_hash": "sha256 hex",
"embeddings": ["float (optional, for semantic search)"],
"signature": "ed25519 signature",
"acl": {
"allowed_tenants": ["string"],
"allowed_users": ["string"],
"allowed_teams": ["string"]
}
}
- id: Unique identifier (UUID v4)
- version: Protocol version (currently "1")
- user_id: Creator's identifier (email, username, or DID)
- tenant_id: Organization/project identifier
- team: Optional subgroup within tenant
- tags: List of keywords for discovery
- source: Agent that generated the artifact (name + version)
- timestamp: Creation time (UTC, ISO 8601)
- format: Content format (html, json, markdown, pdf, or png)
- visibility: Access control tier (see section 4)
- title: Human-readable title
- summary: Brief description (used in search results)
- lineage: Provenance information
  - query: What question was answered
  - data_sources: Input data URIs
  - agent: Agent that performed analysis
  - parent_reports: Reports this builds upon
- content_url: Where content is stored
- content_hash: SHA-256 of content (for integrity)
- embeddings: Vector representation (optional, for semantic search)
- signature: Ed25519 signature of payload (excluding signature field)
- acl: Fine-grained access control (overrides visibility)
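To make the field list concrete, here is a minimal sketch that assembles an unsigned payload for a piece of content. The helper name `build_payload`, the agent name `example-agent-v0.1`, and the chosen defaults (markdown format, private visibility, empty lineage) are illustrative assumptions; only the field names and constraints come from the list above.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def build_payload(content: bytes, title: str, summary: str,
                  user_id: str, tenant_id: str) -> dict:
    """Assemble an unsigned KCP payload for the given content (sketch)."""
    agent = "example-agent-v0.1"  # hypothetical agent identifier
    return {
        "id": str(uuid.uuid4()),                      # UUID v4
        "version": "1",                               # protocol version
        "user_id": user_id,
        "tenant_id": tenant_id,
        "tags": [],
        "source": agent,
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC, ISO 8601
        "format": "markdown",
        "visibility": "private",
        "title": title,
        "summary": summary[:500],                     # summary is capped at 500 chars
        "lineage": {
            "query": "",
            "data_sources": [],
            "agent": agent,
            "parent_reports": [],
        },
        "content_url": "",                            # filled in once content is stored
        "content_hash": hashlib.sha256(content).hexdigest(),
    }


payload = build_payload(b"# Report\n", "Latency report", "Weekly API latency.",
                        "alice@example.com", "acme-corp")
print(payload["id"], payload["content_hash"])
```

The optional fields (team, embeddings, acl) and the signature are left out here; the signature can only be computed once all other fields are final.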
| Tier | Access Rule | Example Use Case |
|---|---|---|
| public | Anyone can read | Whitepapers, open documentation |
| org | Anyone in tenant_id can read | Internal architecture docs |
| team | Anyone in tenant_id + team can read | Squad metrics, postmortems |
| private | Only user_id + explicit ACL can read | Draft analyses, sensitive data |
For fine-grained control beyond visibility tiers:
"acl": {
"allowed_tenants": ["acme-corp", "partner-org"],
"allowed_users": ["alice@example.com", "bob@example.com"],
"allowed_teams": ["team:engineering", "team:data-science"]
}
Rules:
- If acl is present, it overrides visibility
- Access is granted if the user matches any of: allowed_users, allowed_teams, or allowed_tenants
- Empty ACL = no additional permissions (fall back to visibility)
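One possible reading of the visibility tiers and ACL rules, sketched as a read-access check. The `Reader` type and `can_read` function are hypothetical names, and three points are assumptions rather than statements from the specification: the creator can always read their own artifact, a non-empty ACL fully replaces the visibility tier, and team names are compared as exact strings.

```python
from dataclasses import dataclass


@dataclass
class Reader:
    user_id: str
    tenant_id: str
    teams: tuple = ()


def can_read(artifact: dict, reader: Reader) -> bool:
    """Return True if the reader may read the artifact (illustrative sketch)."""
    # Assumption: the creator always has access to their own artifact.
    if reader.user_id == artifact["user_id"]:
        return True

    acl = artifact.get("acl") or {}
    acl_keys = ("allowed_users", "allowed_teams", "allowed_tenants")
    if any(acl.get(key) for key in acl_keys):
        # A non-empty ACL overrides the visibility tier: match any entry.
        return (
            reader.user_id in acl.get("allowed_users", [])
            or reader.tenant_id in acl.get("allowed_tenants", [])
            or any(team in acl.get("allowed_teams", []) for team in reader.teams)
        )

    # Empty or missing ACL: fall back to the visibility tier.
    visibility = artifact["visibility"]
    same_tenant = reader.tenant_id == artifact["tenant_id"]
    if visibility == "public":
        return True
    if visibility == "org":
        return same_tenant
    if visibility == "team":
        return same_tenant and artifact.get("team") in reader.teams
    return False  # private: creator only


team_artifact = {"user_id": "alice", "tenant_id": "acme-corp",
                 "team": "engineering", "visibility": "team"}
print(can_read(team_artifact, Reader("bob", "acme-corp", ("engineering",))))
```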
Data Sources → Query/Agent → Knowledge Artifact → Decision
↓ ↓ ↓ ↓
[URIs] [source field] [this artifact] [parent_reports]
{
"lineage": {
"query": "Calculate average response time for API endpoints",
"data_sources": [
"prometheus://prod-cluster/metrics",
"grafana://api/dashboards/xyz"
],
"agent": "monitoring-agent-v2.1",
"parent_reports": [
"ab12cd34-...", // Previous week's report
"ef56gh78-..." // Baseline performance report
]
}
}
Traversal: By following parent_reports chains, systems can reconstruct full provenance graphs.
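The traversal can be sketched as a graph walk over parent_reports. The function name `provenance_graph` and the in-memory `store` dictionary (artifact ID to payload) are assumptions for the example; a real node would look artifacts up through its storage backend.

```python
def provenance_graph(artifact_id: str, store: dict) -> dict:
    """Return {artifact_id: [parent ids]} for the artifact and all its ancestors."""
    graph = {}
    stack = [artifact_id]
    while stack:
        current = stack.pop()
        if current in graph:
            continue  # already visited; this also guards against cycles
        artifact = store.get(current)
        # Unknown artifacts are kept as leaf nodes with no parents.
        parents = artifact["lineage"].get("parent_reports", []) if artifact else []
        graph[current] = list(parents)
        stack.extend(parents)
    return graph


# Tiny in-memory store with hypothetical IDs.
store = {
    "weekly": {"lineage": {"parent_reports": ["previous-week", "baseline"]}},
    "previous-week": {"lineage": {"parent_reports": ["baseline"]}},
    "baseline": {"lineage": {"parent_reports": []}},
}
print(provenance_graph("weekly", store))
```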
GET /kcp/v1/artifacts?q=<keywords>&tenant_id=<tenant>&team=<team>&tags=<tag1,tag2>&from=<date>&to=<date>
Parameters:
- q (optional): Full-text search query
- tenant_id (optional): Filter by tenant
- team (optional): Filter by team
- tags (optional): Comma-separated tags
- from, to (optional): Date range (ISO 8601)
- limit, offset (optional): Pagination
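A small sketch of how a client might assemble the search URL from these parameters. The function name `build_search_url` and the base URL are hypothetical; the parameter names follow the list above, with `date_from` and `date_to` standing in for `from` and `to` because `from` is a reserved word in Python.

```python
from urllib.parse import urlencode


def build_search_url(base_url, q=None, tenant_id=None, team=None, tags=None,
                     date_from=None, date_to=None, limit=None, offset=None):
    """Build a search URL, omitting any parameter that was not supplied."""
    params = {
        "q": q,
        "tenant_id": tenant_id,
        "team": team,
        "tags": ",".join(tags) if tags else None,  # comma-separated tags
        "from": date_from,
        "to": date_to,
        "limit": limit,
        "offset": offset,
    }
    query = urlencode({key: value for key, value in params.items() if value is not None})
    url = f"{base_url}/kcp/v1/artifacts"
    return f"{url}?{query}" if query else url


url = build_search_url("https://kcp.example.org", q="api latency",
                       tags=["perf", "api"], limit=10)
print(url)
```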
{
"results": [
{
"id": "uuid",
"title": "string",
"summary": "string",
"created_at": "ISO 8601",
"relevance": 0.94,
"preview": "string (first 200 chars)"
}
],
"total": 42,
"query_time_ms": 12
}
If embeddings are provided in the payload:
- Client generates embedding for search query
- Server performs vector similarity search
- Results ranked by cosine similarity
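The ranking step can be illustrated with a plain-Python cosine similarity. This is a sketch only: the function names are hypothetical, the two-dimensional vectors are toy data, and a production server would more likely use a vector index than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_embedding, artifacts):
    """Return (score, id) pairs, highest similarity first; skips artifacts without embeddings."""
    scored = [
        (cosine_similarity(query_embedding, artifact["embeddings"]), artifact["id"])
        for artifact in artifacts
        if artifact.get("embeddings")
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


artifacts = [
    {"id": "a", "embeddings": [1.0, 0.0]},
    {"id": "b", "embeddings": [0.0, 1.0]},
    {"id": "c", "embeddings": [1.0, 1.0]},
    {"id": "d"},  # no embeddings: not eligible for semantic search
]
ranking = rank_by_similarity([1.0, 0.0], artifacts)
print([artifact_id for _, artifact_id in ranking])
```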
- Distributed: No single point of failure
- Content-Addressed: Content hash = retrieval key
- Encrypted: At-rest encryption (AES-256-GCM)
- Signed: All artifacts must have valid Ed25519 signatures
- Efficient: Support for large files (videos, datasets)
- Content-addressed by default
- P2P discovery via DHT
- Pinning for availability
- SQLite fork with replication
- Encryption via SQLCipher
- Familiar SQL interface
- Append-only log (Git-like)
- Single-file database (.kcp extension)
- Merkle tree for integrity
- Full spec in Appendix A (future)
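The content-addressed property listed above (content hash = retrieval key) can be sketched with an in-memory store. `ContentStore` is a hypothetical class for illustration and does not represent any of the storage options; it omits encryption, replication, and persistence.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the SHA-256 hex digest is the retrieval key."""

    def __init__(self):
        self._blobs = {}

    def put(self, content: bytes) -> str:
        key = hashlib.sha256(content).hexdigest()
        self._blobs[key] = content  # storing identical content twice is a no-op
        return key

    def get(self, key: str) -> bytes:
        content = self._blobs[key]
        # Re-hash on read so tampered or corrupted content is detected.
        if hashlib.sha256(content).hexdigest() != key:
            raise ValueError("content does not match its hash")
        return content


blob_store = ContentStore()
key = blob_store.put(b"quarterly analysis")
print(key, blob_store.get(key))
```

Because the key is derived from the content, the same key can double as the payload's content_hash.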
Threats:
- Unauthorized access to private artifacts
- Data tampering (modifying existing artifacts)
- Impersonation (publishing as another user)
- Replay attacks
- Denial of service
| Threat | Mitigation |
|---|---|
| Unauthorized access | Multi-tenant ACLs + encryption at rest |
| Tampering | Content hashing (SHA-256) + signature verification |
| Impersonation | Ed25519 signatures (user keypair) |
| Replay | Timestamp validation (reject if > 5 min old) |
| DoS | Rate limiting (per tenant_id, per user_id) |
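The replay mitigation in the table can be sketched as a freshness check on the payload timestamp. The function name `is_fresh` is hypothetical, and treating timestamps without an offset as UTC is an assumption; the five-minute window is the value given in the table.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_AGE = timedelta(minutes=5)  # window from the mitigation table


def is_fresh(timestamp: str, now: Optional[datetime] = None) -> bool:
    """Return True if the ISO 8601 timestamp is at most MAX_AGE old."""
    created = datetime.fromisoformat(timestamp)
    if created.tzinfo is None:
        # Assumption: timestamps without an offset are UTC.
        created = created.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    # Only old timestamps are rejected here; future-dated ones pass this check.
    return current - created <= MAX_AGE


reference = datetime(2026, 3, 22, 12, 0, tzinfo=timezone.utc)
print(is_fresh("2026-03-22T11:57:00+00:00", now=reference))
print(is_fresh("2026-03-22T11:54:00+00:00", now=reference))
```

A timestamp window alone does not stop a replay inside the five minutes; rejecting an already-seen id (the 409 Conflict case) covers that gap for Publish.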
User Keypair:
Private Key: ed25519:secret (NEVER transmitted)
Public Key: ed25519:public (stored in user profile)
Node ID: hex(public_key) (32 bytes = 64 hex chars)
Key Generation Methods:
- Mnemonic-based (Recommended) — BIP-39 compatible recovery phrase
- Random generation — Cryptographically secure random 32 bytes
- Import from backup — Restore from encrypted backup file
Users can generate their keypair from a 12-word recovery phrase (BIP-39 standard):
abandon ability able about above absent absorb abstract absurd abuse access accident
Derivation Process:
from mnemonic import Mnemonic
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
# 1. Generate mnemonic (128 bits entropy = 12 words)
m = Mnemonic("english")
mnemonic = m.generate(128) # "word1 word2 ... word12"
# 2. Derive 64-byte seed via PBKDF2-SHA512 (BIP-39 standard)
seed = m.to_seed(mnemonic, passphrase="") # Optional passphrase
# 3. Use first 32 bytes as Ed25519 private key
private_key = Ed25519PrivateKey.from_private_bytes(seed[:32])
public_key = private_key.public_key()
# 4. Node ID = hex(public_key)
node_id = public_key.public_bytes_raw().hex()
Recovery:
- Same mnemonic + passphrase = same keypair = same Node ID
- Users can move between devices by memorizing/storing their 12 words
- Security: Anyone with the mnemonic has full access to the identity
CLI Commands:
kcp identity create # Generate new identity with recovery phrase
kcp identity recover # Restore identity from recovery phrase
kcp identity show # Display current Node ID and fingerprint
kcp identity export # Export to encrypted backup file
kcp identity import # Import from backup file

import json
# Signing and verification use the cryptography package, as in the key
# derivation example: user_private_key is an Ed25519PrivateKey.
# 1. Remove 'signature' field from payload
payload_without_sig = {k: v for k, v in payload.items() if k != 'signature'}
# 2. Canonical JSON (sorted keys, no whitespace)
canonical = json.dumps(payload_without_sig, sort_keys=True, separators=(',', ':'))
# 3. Sign with user's private key
signature = user_private_key.sign(canonical.encode('utf-8'))
# 4. Add signature to payload (hex-encoded)
payload['signature'] = signature.hex()

# Server retrieves user's public key (an Ed25519PublicKey)
user_public_key = get_user_public_key(payload['user_id'])
# Reconstruct canonical payload
canonical = json.dumps({k: v for k, v in payload.items() if k != 'signature'},
                       sort_keys=True, separators=(',', ':'))
# Verify signature (raises cryptography.exceptions.InvalidSignature on failure)
user_public_key.verify(bytes.fromhex(payload['signature']),
                       canonical.encode('utf-8'))

┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Node A │◄────►│ Node B │◄────►│ Node C │
│ (acme-corp) │ │ (beta-inc) │ │ (gamma-llc) │
└──────────────┘ └──────────────┘ └──────────────┘
↕ ↕ ↕
[DHT: Distributed Hash Table for discovery]
- Each node announces itself to DHT: "Node A has reports with tags [X, Y, Z]"
- When searching, client queries DHT: "Who has reports matching tag X?"
- DHT returns list of peers
- Client fetches directly from peers (P2P)
1. Node A publishes report with ID=abc123
2. Node A announces to DHT: "I have abc123"
3. Node B queries DHT: "Who has abc123?"
4. DHT responds: "Node A"
5. Node B fetches from Node A (IPFS or HTTP)
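The five-step flow can be mimicked with an in-memory stand-in for the DHT. `ToyDHT` and `Node` are hypothetical classes for illustration; a real deployment would use a networked DHT (for example via libp2p) and fetch content over IPFS or HTTP rather than reading another object's dictionary.

```python
from collections import defaultdict


class ToyDHT:
    """In-memory stand-in for a DHT: maps artifact IDs to the nodes that hold them."""

    def __init__(self):
        self._providers = defaultdict(set)

    def announce(self, node_id: str, artifact_id: str) -> None:
        self._providers[artifact_id].add(node_id)

    def find_providers(self, artifact_id: str) -> list:
        return sorted(self._providers.get(artifact_id, set()))


class Node:
    def __init__(self, node_id: str, dht: ToyDHT):
        self.node_id = node_id
        self.dht = dht
        self.artifacts = {}

    def publish(self, artifact_id: str, content: bytes) -> None:
        self.artifacts[artifact_id] = content          # step 1: publish locally
        self.dht.announce(self.node_id, artifact_id)   # step 2: announce to DHT

    def fetch(self, artifact_id: str, peers: dict):
        # Steps 3-4: ask the DHT who has the artifact.
        for peer_id in self.dht.find_providers(artifact_id):
            peer = peers[peer_id]
            if artifact_id in peer.artifacts:
                return peer.artifacts[artifact_id]     # step 5: fetch from the peer
        return None


dht = ToyDHT()
node_a, node_b = Node("node-a", dht), Node("node-b", dht)
peers = {"node-a": node_a, "node-b": node_b}
node_a.publish("abc123", b"report body")
print(node_b.fetch("abc123", peers))
```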
- Current version: v0.2
- Version in URL: /kcp/v1/artifacts (major version only)
- Backward compatibility: Servers MUST support all v1.x clients
- Each payload has a "version": "1" field
- Future breaking changes increment version (2, 3, etc.)
- Servers MUST reject unsupported versions with 400 Bad Request
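A server-side version check might look like the sketch below. The function name `check_payload_version` and the error code `UNSUPPORTED_VERSION` are hypothetical; the 400 status and the shape of the error body (code, message, details) follow this document.

```python
from typing import Optional, Tuple

SUPPORTED_PAYLOAD_VERSIONS = {"1"}


def check_payload_version(payload: dict) -> Optional[Tuple[int, dict]]:
    """Return None if the version is supported, else (status, error body)."""
    version = payload.get("version")
    if version in SUPPORTED_PAYLOAD_VERSIONS:
        return None
    return 400, {
        "error": {
            "code": "UNSUPPORTED_VERSION",  # hypothetical error code
            "message": f"Unsupported payload version: {version!r}",
            "details": {"version": version},
        }
    }


print(check_payload_version({"version": "1"}))
print(check_payload_version({"version": "2"}))
```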
| Code | Meaning | Example |
|---|---|---|
| 200 | Success | Report retrieved |
| 201 | Created | Report published |
| 400 | Bad Request | Invalid payload format |
| 401 | Unauthorized | Invalid signature |
| 403 | Forbidden | ACL violation |
| 404 | Not Found | Report doesn't exist |
| 409 | Conflict | Report ID already exists |
| 429 | Rate Limited | Too many requests |
| 500 | Server Error | Internal failure |
{
"error": {
"code": "INVALID_SIGNATURE",
"message": "Ed25519 signature verification failed",
"details": {
"user_id": "alice@example.com",
"report_id": "abc123"
}
}
}
- Collaborative Editing: Multiple users edit same artifact (CRDT-based)
- Notifications: Subscribe to new reports matching tags
- Analytics: Usage metrics (most viewed, most cited)
- AI-to-AI Discovery: Agents autonomously discover relevant artifacts
See /sdk directory for reference implementations in Python, TypeScript, and Go.
| Feature | KCP | HTTP | Git | IPFS | RDF/SPARQL |
|---|---|---|---|---|---|
| Knowledge artifacts | ✅ | ❌ | ❌ | ❌ | ❌ |
| Lineage tracking | ✅ | ❌ | Partial | ❌ | ❌ |
| Multi-tenant ACL | ✅ | ❌ | ❌ | ❌ | ❌ |
| Semantic search | ✅ | ❌ | ❌ | ❌ | ✅ |
| P2P federation | ✅ | ❌ | ❌ | ✅ | ❌ |
| Content-addressed | ✅ | ❌ | ✅ | ✅ | ❌ |
- OSI Model — ISO/IEC 7498-1:1994
- Ed25519 — RFC 8032
- IPFS — https://docs.ipfs.tech
- libp2p — https://libp2p.io
- Semantic Web — https://www.w3.org/standards/semanticweb/
Document Status: Experimental Draft
Next Review: May 2026
Feedback: Open an issue at https://github.com/kcp-protocol/kcp