The PolyAgent Agent Core is a high-performance Rust implementation of the agent execution layer, providing secure sandboxing, efficient memory management, and intelligent tool orchestration. This document describes the modernized architecture following 2025 best practices.
- Separation of Concerns: Intelligence (Python) vs Execution (Rust)
- Zero-Copy Operations: Minimize string cloning and memory allocations
- Modern Concurrency: Use OnceLock and std::sync::Once instead of lazy_static
- Comprehensive Error Handling: Result<T> types with structured errors via thiserror
- Observable Systems: OpenTelemetry tracing and Prometheus metrics
- Security First: WASI sandboxing for untrusted code execution
┌─────────────────────────────────────────────────────────────┐
│                  gRPC Server (port 50051)                   │
├─────────────────────────────────────────────────────────────┤
│                     Enforcement Gateway                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐   │
│  │   Timeouts   │  │ Rate Limits  │  │ Circuit Breakers │   │
│  └──────────────┘  └──────────────┘  └──────────────────┘   │
├─────────────────────────────────────────────────────────────┤
│                    Tool Execution Layer                     │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐  │
│  │   Tool   │   │   Tool   │   │   Tool   │   │   WASI   │  │
│  │ Registry │   │  Cache   │   │ Executor │   │ Sandbox  │  │
│  └──────────┘   └──────────┘   └──────────┘   └──────────┘  │
├─────────────────────────────────────────────────────────────┤
│                    Infrastructure Layer                     │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐  │
│  │  Memory  │   │  Config  │   │ Tracing  │   │ Metrics  │  │
│  │   Pool   │   │ Manager  │   │  (OTEL)  │   │  (Prom)  │  │
│  └──────────┘   └──────────┘   └──────────┘   └──────────┘  │
└─────────────────────────────────────────────────────────────┘
Uniform per-request policy enforcement for every code path:
- Timeouts: hard wall clock limit per request
- Token ceiling: reject requests with excessive estimated tokens
- Rate limiting: simple per-key token bucket
- Circuit breaker: rolling error window per key
- Optional distributed limiter: set ENFORCE_RATE_REDIS_URL to enable a Redis-backed token bucket shared across instances.
Configuration lives under enforcement in config/agent.yaml with environment variable overrides (ENFORCE_*).
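The per-key token bucket listed above can be sketched with the standard library alone. This is a minimal illustration, not the gateway's actual code: `TokenBucket`, `RateLimiter`, and their fields are hypothetical names, and the clock is passed in explicitly so the behaviour is easy to test.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Hypothetical token bucket; names and fields are illustrative only.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// A new bucket starts full.
    pub fn new(capacity: f64, refill_per_sec: f64, now: Instant) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec, last_refill: now }
    }

    /// Refill according to elapsed time, then try to take one token.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

/// One bucket per key (for example, per tenant or per tool).
pub struct RateLimiter {
    capacity: f64,
    refill_per_sec: f64,
    buckets: HashMap<String, TokenBucket>,
}

impl RateLimiter {
    pub fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, refill_per_sec, buckets: HashMap::new() }
    }

    /// Returns true if the request for `key` is allowed at time `now`.
    pub fn allow(&mut self, key: &str, now: Instant) -> bool {
        let (capacity, refill) = (self.capacity, self.refill_per_sec);
        self.buckets
            .entry(key.to_string())
            .or_insert_with(|| TokenBucket::new(capacity, refill, now))
            .try_acquire(now)
    }
}
```

A caller would typically invoke `allow(key, Instant::now())` before dispatching a request and drop the request when it returns `false`. A Redis-backed variant would keep the same interface but store the bucket state remotely.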
- Centralized tool capability management
- Discovery API with filtering and relevance scoring
- Metadata including schemas, permissions, and TTL
- LRU caching with configurable TTL
- Deterministic cache key generation
- Automatic expiration and sweeping
- Comprehensive statistics tracking
- Unified interface for tool execution
- Integration with Python LLM service
- WASI sandbox routing for code execution
- Automatic result caching
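Deterministic cache keys and TTL-based expiry, as described for the tool cache above, might look roughly like the following. This is a sketch under assumptions: `cache_key` and `TtlCache` are hypothetical names, parameters are simplified to string pairs, and LRU eviction is left out to keep the example short.

```rust
use std::collections::{BTreeMap, HashMap};
use std::time::{Duration, Instant};

/// Build a cache key that does not depend on parameter insertion order.
/// BTreeMap iterates in sorted key order, so equal inputs give equal keys.
pub fn cache_key(tool: &str, params: &BTreeMap<String, String>) -> String {
    let mut key = format!("{}:{}", tool.len(), tool);
    for (name, value) in params {
        // Length prefixes keep ("ab", "c") distinct from ("a", "bc").
        key.push_str(&format!("|{}:{}={}:{}", name.len(), name, value.len(), value));
    }
    key
}

/// Hypothetical TTL cache; LRU eviction is omitted in this sketch.
pub struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>,
}

impl TtlCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    pub fn put(&mut self, key: String, value: String, now: Instant) {
        self.entries.insert(key, (now, value));
    }

    /// Return the value only if it is younger than the TTL.
    pub fn get(&self, key: &str, now: Instant) -> Option<&str> {
        self.entries.get(key).and_then(|(stored_at, value)| {
            if now.saturating_duration_since(*stored_at) < self.ttl {
                Some(value.as_str())
            } else {
                None
            }
        })
    }

    /// Drop expired entries and return how many were removed.
    pub fn sweep(&mut self, now: Instant) -> usize {
        let before = self.entries.len();
        let ttl = self.ttl;
        self.entries
            .retain(|_, (stored_at, _)| now.saturating_duration_since(*stored_at) < ttl);
        before - self.entries.len()
    }
}
```

Building the key from a sorted map means two requests with the same tool and parameters hit the same entry regardless of the order in which the caller supplied the parameters.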
Secure WebAssembly execution environment with:
- Filesystem isolation (read-only /tmp access)
- Memory limits (configurable, default 256MB)
- Execution timeouts (default 30s)
- Fuel metering for CPU usage control
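The sandbox limits above could be grouped into a single settings type. The sketch below only models the limits themselves, not the WebAssembly runtime; `SandboxLimits` and its field names are assumptions, and the fuel budget is left unset because the document does not state a default.

```rust
use std::time::Duration;

/// Hypothetical container for the sandbox limits listed above.
#[derive(Debug, Clone, PartialEq)]
pub struct SandboxLimits {
    pub memory_limit_bytes: usize,
    pub execution_timeout: Duration,
    /// Fuel budget for CPU metering; None because no default is documented.
    pub fuel: Option<u64>,
}

impl Default for SandboxLimits {
    fn default() -> Self {
        Self {
            memory_limit_bytes: 256 * 1024 * 1024,      // documented default: 256MB
            execution_timeout: Duration::from_secs(30), // documented default: 30s
            fuel: None,
        }
    }
}

impl SandboxLimits {
    /// Reject a memory growth request that would exceed the limit.
    pub fn check_memory(&self, current: usize, requested: usize) -> Result<(), String> {
        match current.checked_add(requested) {
            Some(total) if total <= self.memory_limit_bytes => Ok(()),
            _ => Err(format!(
                "memory limit of {} bytes exceeded",
                self.memory_limit_bytes
            )),
        }
    }
}
```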
Efficient memory pool with:
- Pre-allocated memory blocks
- Automatic garbage collection
- Pressure-based rejection
- Thread-safe allocation/deallocation
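Pressure-based rejection and thread-safe accounting can be illustrated with a small bookkeeping type. This is a sketch, assuming the pool tracks bytes in use and refuses allocations that would push usage past a configurable fraction of capacity; `MemoryPool`, `PoolError`, and the threshold parameter are hypothetical, and real block pre-allocation and garbage collection are not modelled.

```rust
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
pub enum PoolError {
    /// Not enough free bytes to satisfy the request.
    Exhausted { requested: usize, available: usize },
    /// The request fits, but would push usage past the pressure threshold.
    UnderPressure { used: usize, capacity: usize },
}

/// Hypothetical accounting-only memory pool.
pub struct MemoryPool {
    capacity: usize,
    pressure_threshold: f64, // fraction of capacity, for example 0.8
    used: Mutex<usize>,
}

impl MemoryPool {
    pub fn new(capacity: usize, pressure_threshold: f64) -> Self {
        Self { capacity, pressure_threshold, used: Mutex::new(0) }
    }

    /// Reserve `bytes`, or explain why the reservation was refused.
    pub fn allocate(&self, bytes: usize) -> Result<(), PoolError> {
        let mut used = self.used.lock().unwrap();
        let available = self.capacity.saturating_sub(*used);
        if bytes > available {
            return Err(PoolError::Exhausted { requested: bytes, available });
        }
        let projected = *used + bytes;
        if projected as f64 > self.capacity as f64 * self.pressure_threshold {
            return Err(PoolError::UnderPressure { used: *used, capacity: self.capacity });
        }
        *used = projected;
        Ok(())
    }

    /// Return `bytes` to the pool.
    pub fn release(&self, bytes: usize) {
        let mut used = self.used.lock().unwrap();
        *used = (*used).saturating_sub(bytes);
    }

    pub fn used(&self) -> usize {
        *self.used.lock().unwrap()
    }
}
```

Because all state sits behind a `Mutex`, a shared reference (for example inside an `Arc`) is enough for concurrent callers.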
Centralized configuration management:
- YAML-based configuration files
- Environment variable overrides (including enforcement: ENFORCE_*)
- Hot-reload support (future)
- Structured configuration types
- OpenTelemetry integration
- W3C trace context propagation
- Active span context injection
- Cross-service tracing support
- Prometheus metrics export
- Tool execution metrics
- Memory usage tracking
- Cache performance stats
- Enforcement metrics: drops by reason, allowed outcomes
The agent exposes the following gRPC services:
service AgentService {
  rpc ExecuteTask(ExecuteTaskRequest) returns (ExecuteTaskResponse);
  rpc StreamExecuteTask(ExecuteTaskRequest) returns (stream TaskUpdate);
  rpc GetCapabilities(GetCapabilitiesRequest) returns (GetCapabilitiesResponse);
  rpc HealthCheck(HealthCheckRequest) returns (HealthCheckResponse);
  rpc DiscoverTools(DiscoverToolsRequest) returns (DiscoverToolsResponse);
  rpc GetToolCapability(GetToolCapabilityRequest) returns (GetToolCapabilityResponse);
}
The Rust agent communicates with the Python LLM service via HTTP:
POST /tools/select
{
  "task": "string",
  "context": {},
  "exclude_dangerous": boolean,
  "max_tools": number
}
POST /tools/execute
{
  "tool_name": "string",
  "parameters": {}
}
POST /analyze_task
{
  "query": "string",
  "context": {}
}
Comprehensive error taxonomy using thiserror:
pub enum AgentError {
    ToolNotFound { name: String },
    ToolExecutionFailed { tool: String, reason: String },
    MemoryExhausted { requested: usize, available: usize },
    SandboxViolation { operation: String },
    ConfigurationError(String),
    NetworkError(String),
    // ... 20+ error variants
}
Using Cow<str> for string handling to avoid unnecessary allocations:
pub fn process_text<'a>(input: &'a str) -> Cow<'a, str>
Modern OnceLock pattern for metrics:
static METRICS: OnceLock<HashMap<String, Counter>> = OnceLock::new();
Concurrent tool execution with tokio:
let futures = tools.iter().map(|tool| executor.execute_tool(tool));
let results = futures::future::join_all(futures).await;
- Tool result caching with configurable TTL
- LLM response caching for simple queries
- Discovery result caching
- No network access
- Limited filesystem access (read-only /tmp)
- Memory limits enforced
- CPU usage controlled via fuel metering
pub struct ToolCapability {
    pub required_permissions: Vec<String>,
    pub is_dangerous: bool,
    pub requires_confirmation: bool,
}
- Parameter schema validation
- Size limits on inputs
- Timeout protection
- Component-level testing
- Mock dependencies
- Property-based testing for complex logic
- Python-Rust contract validation
- End-to-end tool execution
- Cache behavior verification
- Error handling scenarios
- Benchmark critical paths
- Memory usage profiling
- Concurrent execution stress tests
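As one example of a small, component-level unit that is easy to test in isolation, a permission check over the ToolCapability fields shown earlier might look like this. The sketch is hedged: `Decision` and `authorize` are hypothetical names, and the rule that confirmation is checked only after all permissions are present is an assumption rather than documented behaviour.

```rust
/// Same fields as the ToolCapability struct shown earlier.
pub struct ToolCapability {
    pub required_permissions: Vec<String>,
    pub is_dangerous: bool,
    pub requires_confirmation: bool,
}

/// Hypothetical outcome of a permission check.
#[derive(Debug, PartialEq)]
pub enum Decision {
    Allow,
    NeedsConfirmation,
    Deny { missing: Vec<String> },
}

/// Compare the permissions a caller holds against what the tool requires.
pub fn authorize(cap: &ToolCapability, granted: &[&str], confirmed: bool) -> Decision {
    // Collect every required permission the caller does not hold.
    let missing: Vec<String> = cap
        .required_permissions
        .iter()
        .filter(|p| !granted.contains(&p.as_str()))
        .cloned()
        .collect();
    if !missing.is_empty() {
        return Decision::Deny { missing };
    }
    // Assumption: confirmation is only considered once permissions pass.
    if cap.requires_confirmation && !confirmed {
        return Decision::NeedsConfirmation;
    }
    Decision::Allow
}
```

Because `authorize` is a pure function, it needs no mocks: each test constructs a capability, calls the function, and compares the returned `Decision`.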
FROM rust:1.75 as builder
WORKDIR /app
COPY . .
RUN cargo build --release
FROM debian:bookworm-slim
COPY --from=builder /app/target/release/shannon-agent-core /usr/local/bin/
EXPOSE 50051 2113
CMD ["shannon-agent-core"]
Environment variables:
- RUST_LOG: Logging level
- OTEL_EXPORTER_OTLP_ENDPOINT: Tracing endpoint
- MEMORY_POOL_SIZE_MB: Memory pool size
- WASI_MEMORY_LIMIT_MB: WASI sandbox memory limit
- TOOL_CACHE_TTL_SECONDS: Default cache TTL
- gRPC health endpoint: :50051/health
- Metrics endpoint: :2113/metrics
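Reading the numeric environment variables listed above with fallbacks could be sketched as follows. `RuntimeConfig`, `load_config`, and `parse_u64` are hypothetical names. The 256 MB WASI default comes from this document, while the pool-size and cache-TTL defaults (512 MB, 300 s) are placeholders chosen only for illustration. The lookup is passed in as a closure so the parsing logic can be exercised without touching the real process environment.

```rust
use std::time::Duration;

/// Hypothetical typed view of the numeric environment variables.
#[derive(Debug, PartialEq)]
pub struct RuntimeConfig {
    pub memory_pool_size_mb: u64,
    pub wasi_memory_limit_mb: u64,
    pub tool_cache_ttl: Duration,
}

/// Parse an optional raw value, falling back to `default` when it is unset.
fn parse_u64(raw: Option<String>, key: &str, default: u64) -> Result<u64, String> {
    match raw {
        None => Ok(default),
        Some(text) => text
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("{key}: invalid value {text:?}: {e}")),
    }
}

/// `lookup` abstracts the environment; in production it could be
/// `|k| std::env::var(k).ok()`.
pub fn load_config(lookup: impl Fn(&str) -> Option<String>) -> Result<RuntimeConfig, String> {
    // 512 and 300 are placeholder defaults, not documented values.
    let pool = parse_u64(lookup("MEMORY_POOL_SIZE_MB"), "MEMORY_POOL_SIZE_MB", 512)?;
    // 256 matches the sandbox memory default stated in this document.
    let wasi = parse_u64(lookup("WASI_MEMORY_LIMIT_MB"), "WASI_MEMORY_LIMIT_MB", 256)?;
    let ttl = parse_u64(lookup("TOOL_CACHE_TTL_SECONDS"), "TOOL_CACHE_TTL_SECONDS", 300)?;
    Ok(RuntimeConfig {
        memory_pool_size_mb: pool,
        wasi_memory_limit_mb: wasi,
        tool_cache_ttl: Duration::from_secs(ttl),
    })
}
```

Failing fast on a malformed value, rather than silently using the default, makes a misconfigured deployment visible at startup.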
// Old
lazy_static! {
    static ref METRICS: Mutex<HashMap<String, Counter>> = Mutex::new(HashMap::new());
}
// New
static METRICS: OnceLock<Mutex<HashMap<String, Counter>>> = OnceLock::new();
// Old
let result = operation().unwrap();
// New
let result = operation().context("Failed to perform operation")?;
// Old
fn process(input: String) -> String
// New
fn process(input: &str) -> Cow<str>
- WebAssembly Component Model: Support for WASI Preview 2
- Distributed Caching: Redis integration for cache sharing
- GPU Acceleration: CUDA/ROCm support for ML operations
- Multi-Region Support: Geo-distributed agent deployment
- Advanced Monitoring: Custom metrics and tracing spans
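Two of the migration patterns shown earlier, OnceLock in place of lazy_static and Cow<str> in place of owned strings, can be fleshed out into complete functions. This is a minimal sketch using only the standard library: a plain `u64` stands in for a real metrics `Counter` type, `increment` is a hypothetical helper, and the tab-replacement body of `process_text` is an invented example of work that only sometimes needs to allocate.

```rust
use std::borrow::Cow;
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Lazily initialised global; u64 is a stand-in for a metrics Counter type.
static METRICS: OnceLock<Mutex<HashMap<String, u64>>> = OnceLock::new();

/// Increment a named counter and return its new value.
pub fn increment(name: &str) -> u64 {
    // get_or_init runs the closure at most once, even under contention.
    let map = METRICS.get_or_init(|| Mutex::new(HashMap::new()));
    let mut guard = map.lock().unwrap();
    let count = guard.entry(name.to_string()).or_insert(0);
    *count += 1;
    *count
}

/// Replace tabs with spaces, allocating only when the input contains a tab.
pub fn process_text<'a>(input: &'a str) -> Cow<'a, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', " "))
    } else {
        // No change needed: hand back the original slice without copying.
        Cow::Borrowed(input)
    }
}
```

Callers that only read the result can treat the `Cow` as a `&str`, and callers that need ownership can call `into_owned()`, which copies only in the borrowed case.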
Please follow these guidelines:
- Use cargo fmt and cargo clippy before commits
- Add tests for new functionality
- Update documentation for API changes
- Follow error handling best practices
- Minimize unnecessary allocations
Copyright 2025 PolyAgent Project. All rights reserved.