Autonomy Without Anarchy™
The protocol that turns AI agents from loose cannons into Starfleet officers.
- Heartbeat & health monitoring
- Remote kill-switch & containment
- Ownership transfer (yes, you can sell your agent)
- mTLS identity + zero-trust
- Zero-touch provisioning (QR scan → claimed)
- Lifecycle telemetry (spawn → act → sleep → terminate)
Because agents will have wings.
We're just making sure they fly in formation.
Protocol Version: 1.0 (Paper-Aligned) | Paper Version: Draft v0.3 (for arXiv cs.DC) | Status: Stable Candidate | Last Updated: November 12, 2025
PAP is a comprehensive framework for autonomous agent lifecycle management, establishing Plugged.in as the central authority for creating, configuring, and controlling autonomous agents while enabling distributed operation through open protocols. The protocol addresses critical gaps in agent reliability, governance, and interoperability identified in production deployments and academic research.
"Autonomy without anarchy" - Agents operate independently yet remain under organizational governance through protocol-level controls.
- Why PAP?
- Features at a Glance
- Architecture Overview
- Dual-Profile Architecture
- Key Innovations
- Protocol Comparison
- Quick Examples
- Repository Map
- Getting Started
- Status and Roadmap
- Contributing
Unlike MCP, ACP, and A2A, which focus on tool invocation and orchestration logic, PAP defines the physical and logical substrate: how agents live, breathe, migrate, and die across infrastructure. It merges operational DevOps controls with cognitive AI design.
Recent surveys show significant gaps in autonomous agent reliability:
- Silent failures ("zombies"): Agents that appear alive but are unresponsive
- Uncontrolled loops: Agents stuck in infinite reasoning cycles
- Tool fragility: No standardized error handling or circuit breakers
- Governance gaps: No audit trails, ownership transfer, or kill authority
- Protocol fragmentation: Incompatible approaches to agent communication
PAP provides protocol-level guarantees for:
- Operational Reliability: Zombie prevention via heartbeat/metrics separation
- Governance: Audit trails, policy enforcement, Station authority
- Interoperability: Native support for MCP tools and A2A peer communication
- Production-Ready: Security, observability, and deployment patterns from day one
PAP v1.0 is based on the academic paper "The Plugged.in Agent Protocol (PAP): A Comprehensive Framework for Autonomous Agent Lifecycle Management" (Draft v0.3 for arXiv cs.DC) by Cem Karaca. The protocol addresses failure modes identified in:
- Perception-reasoning-memory-action frameworks [1] - Recent surveys document gaps between human performance and state-of-the-art agents (OSWorld: humans >72% vs. models ~43%)
- Multi-agent coordination [4,7] - Cascading errors that arise systematically when agents lack proper lifecycle management
- Agent security [5,6] - Threat models and security requirements for autonomous agents
- Protocol interoperability [2,3,9] - Integration with MCP, A2A, and existing frameworks
Key Research Contributions:
- First comprehensive framework combining control plane authority with distributed agent autonomy
- Strict heartbeat/telemetry separation preventing self-DoS under load
- Normative lifecycle states with formal transition semantics
- Dual-profile architecture enabling both ops-grade control and open ecosystem integration
See docs/rfc/pap-rfc-001-v1.0.md for complete specification and full academic references.
| Feature | Description |
|---|---|
| Dual Profiles | PAP-CP (gRPC/mTLS) for control, PAP-Hooks (JSON-RPC/OAuth) for I/O |
| Zombie Prevention | Strict heartbeat/metrics separation - the superpower |
| Lifecycle Management | Normative states: NEW → PROVISIONED → ACTIVE → TERMINATED |
| Security | mTLS, Ed25519 signatures, OAuth 2.1, credential rotation |
| DNS-Based Identity | {agent}.{region}.a.plugged.in with DNSSEC |
| Observability | OpenTelemetry traces, Prometheus metrics, structured logs |
| Ownership Transfer | Migrate agents between Stations with state preservation |
| MCP Compatible | Native tool access via PAP-Hooks |
| A2A Compatible | Peer delegation and discovery |
| Cloud-Native | Kubernetes/Traefik deployment reference |
graph TB
subgraph Station["The Station (Plugged.in Core)"]
Registry["Registry & Identity"]
Policy["Policy Engine"]
Memory["Memory Service"]
Control["Control Center"]
end
subgraph Proxy["PAP Proxy (mcp.plugged.in)"]
Auth["mTLS Auth"]
Router["Message Router"]
Logger["Telemetry Logger"]
RateLimit["Rate Limiter"]
Sig["Signature Verify"]
end
subgraph Shuttles["Autonomous Agents (Shuttles)"]
Focus["Focus Agent<br/>focus.cluster.a.plugged.in"]
MemAgent["Memory Agent<br/>memory.cluster.a.plugged.in"]
Edge["Edge Agent<br/>edge.cluster.a.plugged.in"]
Custom["Custom Agent<br/>custom.cluster.a.plugged.in"]
end
Station <-->|Commands<br/>Telemetry| Proxy
Proxy <-->|invoke<br/>response<br/>event| Focus
Proxy <-->|invoke<br/>response<br/>event| MemAgent
Proxy <-->|invoke<br/>response<br/>event| Edge
Proxy <-->|invoke<br/>response<br/>event| Custom
Focus -.->|heartbeat| Proxy
MemAgent -.->|heartbeat| Proxy
Edge -.->|heartbeat| Proxy
Custom -.->|heartbeat| Proxy
Control -.->|KILL| Proxy
style Station fill:#1e3a8a,stroke:#3b82f6,color:#fff
style Proxy fill:#064e3b,stroke:#10b981,color:#fff
style Shuttles fill:#581c87,stroke:#a855f7,color:#fff
style Control fill:#7f1d1d,stroke:#ef4444,color:#fff
PAP v1.0 introduces two complementary profiles for control and data operations:
- Transport: gRPC/HTTP/2 with TLS 1.3 + mTLS
- Wire Format: Protocol Buffers v3
- Security: Ed25519 signatures, nonce-based replay protection
- Use Cases: Provisioning, lifecycle control, heartbeats, metrics, termination
- Transport: JSON-RPC 2.0 over WebSocket/SSE
- Wire Format: UTF-8 JSON
- Security: OAuth 2.1 with JWT
- Use Cases: Tool invocations, MCP access, A2A delegation, external APIs
Gateway Translation: Gateways MAY translate between profiles for ecosystem interoperability.
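A minimal sketch of how a PAP-Hooks client might frame a tool invocation as a JSON-RPC 2.0 request. It assumes the `params` layout (`tool`, `arguments`, `context`) used by the `tool.invoke` example in this README. The helper names `build_tool_invoke` and `parse_result`, and the request id format, are hypothetical and not part of any published SDK.

```python
import json
import uuid


def build_tool_invoke(tool, arguments, agent_uuid, trace_id, bearer_token):
    """Frame a tool invocation as a JSON-RPC 2.0 request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": f"req-{uuid.uuid4().hex[:8]}",  # id format is an assumption
        "method": "tool.invoke",
        "params": {
            "tool": tool,
            "arguments": arguments,
            "context": {
                "agent_uuid": agent_uuid,
                "trace_id": trace_id,
                "authorization": f"Bearer {bearer_token}",
            },
        },
    }


def parse_result(request, raw_response):
    """Return the result object; raise if the reply is an error or mismatched."""
    response = json.loads(raw_response)
    if response.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 response")
    if response.get("id") != request["id"]:
        raise ValueError("response id does not match request id")
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"tool call failed: {err.get('code')} {err.get('message')}")
    return response["result"]
```

Matching the response `id` against the request `id` is standard JSON-RPC 2.0 behaviour; how PAP-Hooks carries the OAuth token on a real connection is defined by the PAP-Hooks specification, not by this sketch.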
Strict heartbeat/metrics separation prevents control plane saturation:
- Heartbeats: Liveness-only (mode, uptime). NO resource data.
- Metrics: Separate channel for CPU, memory, custom gauges.
- Detection: One missed interval → AGENT_UNHEALTHY (480)
NEW → PROVISIONED → ACTIVE → DRAINING → TERMINATED
↓ (error)
KILLED
Station holds exclusive kill authority.
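The lifecycle can be sketched as a small state machine. The forward path below comes from the diagram above. The diagram does not say which states the error branch leaves from, so this sketch assumes the Station may move an agent to KILLED from any non-terminal state. The RFC remains the authority on the actual transition rules.

```python
from enum import Enum


class State(Enum):
    NEW = "NEW"
    PROVISIONED = "PROVISIONED"
    ACTIVE = "ACTIVE"
    DRAINING = "DRAINING"
    TERMINATED = "TERMINATED"
    KILLED = "KILLED"


# Forward path taken from the lifecycle diagram.
_FORWARD = {
    State.NEW: State.PROVISIONED,
    State.PROVISIONED: State.ACTIVE,
    State.ACTIVE: State.DRAINING,
    State.DRAINING: State.TERMINATED,
}
_TERMINAL = {State.TERMINATED, State.KILLED}


def allowed_transitions(state):
    """Next states reachable from `state` under this sketch's assumptions."""
    if state in _TERMINAL:
        return set()
    # Assumption: a Station kill is possible from any non-terminal state.
    return {_FORWARD[state], State.KILLED}


def transition(state, target):
    """Return `target` if the move is legal, otherwise raise ValueError."""
    if target not in allowed_transitions(state):
        raise ValueError(f"illegal transition {state.name} -> {target.name}")
    return target
```

Keeping the table of forward transitions separate from the kill rule makes it easy to tighten the kill rule later without touching the normal path.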
- MCP: Native tool access via PAP-Hooks
- A2A: Peer delegation and discovery
- Frameworks: LangChain, CrewAI can adopt PAP for lifecycle management
- Mutual TLS for PAP-CP
- Ed25519 signatures on all control messages
- OAuth 2.1 for PAP-Hooks
- Automatic credential rotation (90 days)
- Immutable audit trails
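Nonce-based replay protection can be illustrated with a small in-memory guard. This is a sketch under stated assumptions: the 60-second acceptance window and the `ReplayGuard` class are hypothetical, and a real receiver would verify the Ed25519 signature before consulting the nonce cache.

```python
import time


class ReplayGuard:
    """Reject messages whose nonce was already seen or whose timestamp is stale."""

    def __init__(self, max_skew_seconds=60, clock=time.time):
        self.max_skew_seconds = max_skew_seconds  # assumed window, not from the spec
        self._clock = clock
        self._seen = {}  # nonce -> message timestamp

    def accept(self, nonce, timestamp):
        now = self._clock()
        # Forget nonces that have aged out of the window so the cache stays bounded.
        self._seen = {
            n: ts for n, ts in self._seen.items()
            if now - ts <= self.max_skew_seconds
        }
        if abs(now - timestamp) > self.max_skew_seconds:
            return False  # too old, or too far in the future
        if nonce in self._seen:
            return False  # replayed nonce
        self._seen[nonce] = timestamp
        return True
```

Evicting old nonces is safe here because any message old enough to have been evicted is also rejected by the timestamp check.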
| Aspect | PAP | MCP | A2A | Traditional Orchestration |
|---|---|---|---|---|
| Focus | Agent lifecycle & substrate | Tool invocation | Peer delegation | Workflow coordination |
| Zombie Prevention | Built-in (heartbeat separation) | Not addressed | Not addressed | |
| Lifecycle States | Normative (6 states) | Not specified | | |
| Kill Authority | Station-controlled | Not specified | Not specified | |
| Ownership Transfer | Protocol-level | Not supported | Not supported | Not supported |
| Audit Trail | Immutable logs | | | |
| Tool Access | Via PAP-Hooks (MCP-compatible) | Native | | |
| Peer Communication | Via PAP-Hooks (A2A-compatible) | | Native | |
| Security Model | mTLS + Ed25519 + OAuth 2.1 | | | |
| DNS-Based Identity | Native with DNSSEC | Not specified | Not specified | Not specified |
| Multi-Region | Planned (v1.1) | Not specified | | |
Key Insight: PAP is complementary to MCP and A2A, not competitive. Agents use PAP for lifecycle management while leveraging MCP for tools and A2A for peer communication.
Liveness-Only - No Resource Data!
message HeartbeatEvent {
Header header = 1; // trace_id, nonce, timestamp
HeartbeatMode mode = 2; // EMERGENCY, IDLE, or SLEEP
uint64 uptime_seconds = 3; // 259200 (3 days)
// FORBIDDEN: cpu_percent, memory_mb
}
Example (sent as Protocol Buffers binary over gRPC, shown here in text form):
header {
version: "pap-cp/1.0"
agent_uuid: "pluggedin/research@1.2.0"
station_id: "plugged.in"
instance_id: "inst-abc123"
timestamp: 1699000000
nonce: [32 random bytes]
trace_id: "trace-xyz"
}
mode: IDLE
uptime_seconds: 259200
signature: [Ed25519 signature]
checksum: [SHA-256 hash]
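The liveness-only rule can be enforced with an allow-list check on the decoded heartbeat. The sketch below works on a plain dictionary rather than generated protobuf classes, and it assumes the signature and checksum travel on the envelope rather than inside the `HeartbeatEvent` body. `validate_heartbeat` is a hypothetical helper name.

```python
ALLOWED_MODES = {"EMERGENCY", "IDLE", "SLEEP"}
ALLOWED_FIELDS = {"header", "mode", "uptime_seconds"}


def validate_heartbeat(payload):
    """Return a list of problems; an empty list means the payload is liveness-only."""
    problems = []
    # Anything outside the allow-list (cpu_percent, memory_mb, ...) is rejected.
    extra = set(payload) - ALLOWED_FIELDS
    if extra:
        problems.append(f"forbidden fields: {sorted(extra)}")
    if payload.get("mode") not in ALLOWED_MODES:
        problems.append(f"unknown mode: {payload.get('mode')!r}")
    uptime = payload.get("uptime_seconds")
    if not isinstance(uptime, int) or isinstance(uptime, bool) or uptime < 0:
        problems.append("uptime_seconds must be a non-negative integer")
    return problems
```

An allow-list is used instead of a deny-list so that a new resource field added later is rejected by default rather than slipping into the control path.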
message MetricsReport {
Header header = 1;
float cpu_percent = 2; // 45.2
uint64 memory_mb = 3; // 512
uint64 requests_handled = 4; // 1543
map<string, double> custom_metrics = 5;
}
JSON-RPC 2.0 over WebSocket:
{
"jsonrpc": "2.0",
"id": "req-12345",
"method": "tool.invoke",
"params": {
"tool": "web-search",
"arguments": {
"query": "quantum computing",
"limit": 10
},
"context": {
"agent_uuid": "pluggedin/research@1.2.0",
"trace_id": "trace-xyz",
"authorization": "Bearer eyJhbGc..."
}
}
}
Response:
{
"jsonrpc": "2.0",
"id": "req-12345",
"result": {
"status": "success",
"data": {
"results": [
{
"title": "Quantum Computing Basics",
"url": "https://example.com/quantum"
}
]
},
"metadata": {
"duration_ms": 120,
"cost": 0.001
}
}
}
// Step 1: Initiate transfer
TransferInit {
agent_uuid: "pluggedin/research@1.2.0"
target_station: "station-b.example.com"
preserve_state: true
initiated_at: 2025-11-04T12:34:56Z
}
// Step 2: Accept with new credentials
TransferAccept {
agent_uuid: "pluggedin/research@1.2.0"
new_credentials: {
tls_cert: "-----BEGIN CERTIFICATE-----..."
signing_key_ref: "vault://keys/agent-new"
}
transfer_token: "token-xyz"
}
// Step 3: Complete transfer
TransferComplete {
agent_uuid: "pluggedin/research@1.2.0"
old_station: "plugged.in"
new_station: "station-b.example.com"
keys_rotated: true
completed_at: 2025-11-04T12:36:56Z
}
sequenceDiagram
participant Core as Station Core
participant Proxy as PAP Proxy
participant Agent as Agent (Shuttle)
Note over Core,Agent: Provisioning Phase
Core->>Agent: Invite Token (JWT)
Agent->>Proxy: Authenticate + Register
Proxy->>Core: Validate Identity
Core->>Agent: Certificate + DNS ID
Note over Core,Agent: Operation Phase
loop Heartbeat (every 30s)
Agent-->>Proxy: HeartbeatEvent (mode, uptime)
Proxy-->>Core: Log Telemetry
end
Core->>Proxy: invoke Command
Proxy->>Agent: invoke (signed)
Agent->>Agent: Execute Task
Agent->>Proxy: response (result)
Proxy->>Core: response
Agent->>Proxy: event (metrics, logs)
Proxy->>Core: Store in Memory Service
Note over Core,Agent: Zombie Detection
Agent--xProxy: Heartbeat Missed
Proxy->>Core: AGENT_UNHEALTHY
Core->>Proxy: terminate Command
Proxy->>Agent: Graceful Shutdown
Note over Core,Agent: Emergency Kill
Core->>Proxy: force_kill
Proxy->>Agent: -9 Immediate Termination
- overview.md: Mission, vision, dual-profile architecture, and protocol innovations
- rfc/pap-rfc-001-v1.0.md: Complete PAP v1.0 specification (paper-aligned)
- pap-hooks-spec.md: JSON-RPC 2.0 open I/O profile specification
- service-registry.md: DNS-based agent discovery and capability advertisement
- ownership-transfer.md: Agent migration protocol between Stations
- deployment-guide.md: Kubernetes/Traefik reference deployment
- evaluation-methodology.md: Performance targets, benchmarking, and chaos engineering
- references.md: Consolidated academic and technical bibliography
- pap/v1/pap.proto: Protocol Buffers v3 schema with lifecycle messages
  - PAP-CP messages: Provision, Invoke, Heartbeat, Metrics, Terminate, Transfer
  - Strict heartbeat/metrics separation
  - Lifecycle state definitions
- TypeScript: (Planned) PAP-CP and PAP-Hooks client libraries
- Python: (Planned) PAP-CP and PAP-Hooks client libraries
- Rust: (Planned) High-performance client libraries
- Go: (Planned) Cloud-native client libraries
- proxy/: Gateway with PAP-CP ↔ PAP-Hooks translation
- registry/: Service Registry for agent discovery
- ops/: Operational runbooks and monitoring
CRITICAL: PAP v1.0 enforces strict separation between heartbeats and metrics.
- Purpose: Zombie detection
- Payload: Mode (EMERGENCY/IDLE/SLEEP), uptime_seconds
- Forbidden: CPU, memory, or any resource data
- Intervals:
- EMERGENCY: 5 seconds
- IDLE: 30 seconds (default)
- SLEEP: 15 minutes
- Detection: One missed interval → AGENT_UNHEALTHY (480)
- Purpose: Monitoring and observability
- Payload: cpu_percent, memory_mb, requests_handled, custom_metrics
- Channel: Separate from heartbeats
- Frequency: Independent (typically 60s)
Why This Matters: Large telemetry payloads cannot starve the control path. This separation is PAP's "zombie-prevention superpower."
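The detection rule above can be sketched as a watchdog check that depends only on the heartbeat mode and the time of the last heartbeat. The interval values and the 480 status come from this section. The `grace_factor` parameter is a hypothetical knob added for illustration and is not taken from the specification.

```python
INTERVAL_SECONDS = {"EMERGENCY": 5, "IDLE": 30, "SLEEP": 15 * 60}
AGENT_UNHEALTHY = 480


def check_liveness(last_heartbeat_at, mode, now, grace_factor=1.0):
    """Return AGENT_UNHEALTHY if a heartbeat is overdue, else None.

    With grace_factor=1.0 the agent is flagged as soon as one full interval
    passes without a heartbeat; a larger value tolerates network jitter.
    """
    deadline = last_heartbeat_at + INTERVAL_SECONDS[mode] * grace_factor
    return AGENT_UNHEALTHY if now > deadline else None
```

Because the check reads no metrics, a slow or oversized metrics report cannot delay it, which is the point of keeping the two channels separate.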
- Base zone: plugged.in
- Proxy edge: mcp.plugged.in (TLS termination, routing, rate limits)
- Agent namespace: {agent}.{cluster}.a.plugged.in (delegated a.plugged.in subzone)
- Delegation model:
  - mcp.plugged.in → Station-owned LB/frontdoor
  - a.plugged.in → Cluster-level DNS, aligned with certificate SANs and DNSSEC
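A short sketch of building and parsing names in the agent namespace. It assumes exactly two labels (agent, then cluster) in front of `a.plugged.in` and ordinary lowercase DNS label rules. The function names are hypothetical.

```python
import re

AGENT_ZONE = "a.plugged.in"
# Lowercase letters, digits, and inner hyphens; 1 to 63 characters per label.
_LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def agent_fqdn(agent, cluster):
    """Build {agent}.{cluster}.a.plugged.in after validating both labels."""
    for label in (agent, cluster):
        if not _LABEL.fullmatch(label):
            raise ValueError(f"invalid DNS label: {label!r}")
    return f"{agent}.{cluster}.{AGENT_ZONE}"


def parse_agent_fqdn(name):
    """Split an agent FQDN into (agent, cluster); raise ValueError otherwise."""
    suffix = "." + AGENT_ZONE
    name = name.rstrip(".").lower()
    if not name.endswith(suffix):
        raise ValueError(f"not in the {AGENT_ZONE} zone: {name!r}")
    labels = name[: -len(suffix)].split(".")
    if len(labels) != 2 or not all(_LABEL.fullmatch(part) for part in labels):
        raise ValueError(f"expected two labels before {AGENT_ZONE}: {name!r}")
    return labels[0], labels[1]
```

For example, `agent_fqdn("focus", "cluster")` yields the `focus.cluster.a.plugged.in` name used in the architecture diagram.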
| Enum | HTTP | Description |
|---|---|---|
| OK | 200 | Request completed successfully. |
| ACCEPTED | 202 | Task accepted; processing async. |
| BAD_REQUEST | 400 | Invalid message or arguments. |
| UNAUTHORIZED | 401 | Invalid or missing credentials. |
| FORBIDDEN | 403 | Action not permitted. |
| NOT_FOUND | 404 | Target agent/action not found. |
| TIMEOUT | 408 | Job or agent timeout. |
| CONFLICT | 409 | Version or concurrency conflict. |
| RATE_LIMITED | 429 | Too many requests. |
| AGENT_UNHEALTHY | 480 | Heartbeat anomaly detected. |
| AGENT_BUSY | 481 | Agent overloaded; retry later. |
| DEPENDENCY_FAILED | 482 | Downstream call failed. |
| INTERNAL_ERROR | 500 | Agent internal fault. |
| PROXY_ERROR | 502 | Routing/connection issue. |
| VERSION_UNSUPPORTED | 505 | Protocol version mismatch. |
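The table maps directly onto an integer enum. The sketch below mirrors the enum names and codes listed above; the class name `PapStatus` and the helper functions are hypothetical conveniences, not part of the protocol.

```python
from enum import IntEnum


class PapStatus(IntEnum):
    OK = 200
    ACCEPTED = 202
    BAD_REQUEST = 400
    UNAUTHORIZED = 401
    FORBIDDEN = 403
    NOT_FOUND = 404
    TIMEOUT = 408
    CONFLICT = 409
    RATE_LIMITED = 429
    AGENT_UNHEALTHY = 480
    AGENT_BUSY = 481
    DEPENDENCY_FAILED = 482
    INTERNAL_ERROR = 500
    PROXY_ERROR = 502
    VERSION_UNSUPPORTED = 505


def is_success(code):
    """True for the 2xx codes in the table (OK and ACCEPTED)."""
    return 200 <= int(code) < 300


def is_pap_specific(code):
    """True for the agent-level codes 480-482, which are not standard HTTP statuses."""
    return 480 <= int(code) <= 482


def describe(code):
    """Return the enum name for a known code, or UNKNOWN_<code> otherwise."""
    try:
        return PapStatus(code).name
    except ValueError:
        return f"UNKNOWN_{code}"
```

Using `IntEnum` lets the values be compared with plain integers arriving off the wire while still giving readable names in logs.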
PAP defines four canonical message families for all communication:
graph TB
subgraph MsgTypes["PAP Message Types"]
Invoke["<b>invoke</b><br/>Command from Station<br/>or peer agent"]
Response["<b>response</b><br/>Acknowledgment<br/>or result"]
Event["<b>event</b><br/>Telemetry:<br/>heartbeat, logs, alerts"]
Error["<b>error</b><br/>Structured failure<br/>with retry policy"]
end
Station[Station] -->|Command| Invoke
Invoke -->|via Proxy| Agent1[Agent]
Agent1 -->|Result| Response
Response -->|via Proxy| Station
Agent1 -->|Heartbeat| Event
Event -->|via Proxy| Watchdog[Watchdog]
Agent1 -->|Failure| Error
Error -->|via Proxy| ErrorHandler[Error Handler]
ErrorHandler -->|Retry?| Agent1
ErrorHandler -->|Give up| Station
Agent1 -.->|Invoke peer| Agent2[Another Agent]
Agent2 -.->|Response| Agent1
style Invoke fill:#e3f2fd
style Response fill:#e8f5e9
style Event fill:#fff3e0
style Error fill:#ffebee
- Read docs/overview.md for mission, vision, and key innovations
- Study docs/rfc/pap-rfc-001-v1.0.md for complete v1.0 specification
- Review docs/pap-hooks-spec.md for open I/O profile
- Examine proto/pap/v1/pap.proto for Protocol Buffers definitions
- Understand dual-profile message structures
- Review lifecycle state transitions
- Follow docs/deployment-guide.md for Kubernetes deployment
- Configure DNS delegation and wildcard certificates
- Set up observability (Prometheus, Grafana)
- Generate protobuf stubs: protoc --proto_path=. --go_out=sdk/go proto/pap/v1/pap.proto
- Implement PAP-CP client with mTLS and Ed25519 signing
- Implement PAP-Hooks client with OAuth 2.1 and WebSocket
- OpenTelemetry: All messages carry trace_id and span_id for distributed tracing
- Metrics: Prometheus-format metrics for heartbeats, requests, errors, and circuit breakers
- Logging: Structured JSON logs with trace context
- Audit Trail: Immutable, append-only logs for all lifecycle events
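Structured JSON logging with trace context can be sketched using only the Python standard library. The field names `trace_id` and `span_id` follow the list above. The formatter class, the `make_logger` helper, and the exact set of log fields are assumptions for illustration, not a prescribed log schema.

```python
import json
import logging


class JsonTraceFormatter(logging.Formatter):
    """Render each record as one JSON object carrying trace context."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Populated through the `extra` argument of the logging call.
            "trace_id": getattr(record, "trace_id", None),
            "span_id": getattr(record, "span_id", None),
        }
        return json.dumps(entry)


def make_logger(name, stream):
    """Create a logger that writes JSON lines to `stream` (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonTraceFormatter())
    logger.handlers = [handler]
    return logger
```

A call such as `log.info("heartbeat received", extra={"trace_id": "trace-xyz", "span_id": "span-1"})` then emits a single JSON line that a log pipeline can join with traces by `trace_id`.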
- Dual-profile architecture (PAP-CP + PAP-Hooks)
- Protocol Buffer schema with lifecycle messages
- Strict heartbeat/metrics separation
- Comprehensive specifications and documentation
- Deployment reference (Kubernetes/Traefik)
- Academic paper (Draft v0.3 for arXiv cs.DC)
- SDK implementations (TypeScript, Python, Rust, Go)
- Gateway with protocol translation
- Station with provisioning and lifecycle management
- Conformance test suite
- Multi-region active-active deployment
- Federated identity with DIDs
- Formal verification (TLA+)
- Advanced policy DSL
- Performance evaluation and benchmarking
Based on the academic paper, PAP v1.0 is designed to achieve:
E1: Control Plane Latency
- Heartbeat round-trip: P50 <5ms, P99 <20ms
- Control message processing: P50 <10ms, P99 <50ms
- Measured under 1000 agent load
E2: Liveness Detection
- False positive rate: <0.1%
- Detection latency: Within 1.5× configured interval
- Recovery time from UNHEALTHY: <10 seconds
E3: Throughput
- Single gateway: 10,000+ requests/second
- Horizontal scaling: Linear to 100,000+ requests/second
- Circuit breaker activation: <100ms after threshold
E4: Ownership Transfer
- Transfer duration: <30 seconds including state snapshot
- Credential rotation: Atomic with zero downtime
- Post-transfer error rate: <0.01%
See CHANGELOG.md for detailed version history.
We welcome contributions! Here's how you can help:
- Bug Reports: Use GitHub Issues with the bug label
- Feature Requests: Use GitHub Issues with the enhancement label
- Security Issues: Follow SECURITY.md guidelines (private disclosure)
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes following the coding standards
- Write tests and documentation
- Submit a pull request
For protocol changes or new features:
- Create an RFC document in docs/rfc/
- Follow the template from existing RFCs
- Open a discussion issue
- Iterate based on community feedback
- Submit PR when consensus is reached
See CODE_OF_CONDUCT.md for community guidelines.
- Complete Specification - PAP v1.0 full spec
- PAP-Hooks Spec - JSON-RPC 2.0 profile
- Service Registry - DNS-based discovery
- Ownership Transfer - Agent migration
- Deployment Guide - Kubernetes reference
Complete citations are available in docs/references.md with BibTeX entries and detailed summaries.
Key papers:
- [1] de Lamo Castrillo et al., "Fundamentals of Building Autonomous LLM Agents" (arXiv:2510.09244)
- [2] Anthropic, "Model Context Protocol Specification"
- [3] Linux Foundation, "Agent-to-Agent Protocol (A2A) Specification v0.3"
- [4-11] Multi-agent coordination, security, governance, and interoperability research
See docs/references.md for complete bibliography.
- MCP (modelcontextprotocol.io) - Tool protocol for LLMs
- A2A (a2a-protocol.org) - Agent-to-agent delegation
- OpenTelemetry (opentelemetry.io) - Observability standard
- GitHub Discussions for questions
- GitHub Issues for bug reports
- Email: [Contact maintainers]
PAP is released under the Apache 2.0 License. See LICENSE for the full text and patent grant.
- Commercial use allowed
- Modification allowed
- Distribution allowed
- Patent use (with grant)
- Must include license and copyright notice
- Must state changes made
If you use PAP in academic research, please cite:
@misc{karaca2025pap,
title={The Plugged.in Agent Protocol (PAP): A Comprehensive Framework for Autonomous Agent Lifecycle Management},
author={Karaca, Cem},
year={2025},
note={Draft v0.3 for arXiv cs.DC},
publisher={VeriTeknik \& Plugged.in},
url={https://github.com/VeriTeknik/PAP},
keywords={autonomous agents, control plane, lifecycle management, mTLS, OAuth 2.1, JSON-RPC, gRPC, Ed25519, audit, DNS, ownership transfer, heartbeat, telemetry, MCP, A2A, interoperability}
}
Paper Reference: Cem Karaca, "The Plugged.in Agent Protocol (PAP): A Comprehensive Framework for Autonomous Agent Lifecycle Management," Draft v0.3 for arXiv cs.DC, VeriTeknik & Plugged.in, November 2025.
PAP development was informed by:
- Model Context Protocol (Anthropic)
- Agent-to-Agent Protocol (Linux Foundation)
- Research on autonomous agent systems and failure modes
- Production deployments of LLM-based agents
Special thanks to the agent framework communities (LangChain, CrewAI, AutoGPT) for demonstrating practical orchestration patterns.
Built with ❤️ for the autonomous agent community
Documentation • Specification • Contributing • Code of Conduct