PolyAgent Testing Guide

Comprehensive guide for testing the PolyAgent platform, from unit tests to end-to-end scenarios.

Test Types

Unit Tests

Go: Co-located with code as _test.go files
Rust: Inline #[cfg(test)] modules inside crates
Python: Located in python/llm-service/tests/

Integration Tests

Temporal Workflows: In-memory execution using Temporal testsuite with stubbed activities
Smoke Tests: Basic cross-service connectivity, persistence, metrics validation
End-to-End (E2E): Multi-service scenarios under tests/ with Docker Compose

Quick Start

# 1. Run all unit tests
make test

# 2. Start the full stack
make dev

# 3. Run smoke test (health, gRPC, persistence, metrics)
make smoke

# 4. Run E2E scenarios
tests/e2e/run.sh

Detailed Test Commands

Unit Testing by Language

# Go tests with race detection
cd go/orchestrator && go test -race ./...

# Rust tests with output
cd rust/agent-core && cargo test -- --nocapture

# Python tests with coverage
cd python/llm-service && python3 -m pytest --cov

# WASI sandbox test
wat2wasm docs/assets/hello-wasi.wat -o /tmp/hello-wasi.wasm
cd rust/agent-core && cargo run --example wasi_hello -- /tmp/hello-wasi.wasm

Temporal Workflow Testing

# Export workflow history
make replay-export WORKFLOW_ID=task-dev-XXX OUT=history.json

# Test determinism
make replay HISTORY=history.json

# Run all replay tests
make ci-replay

Smoke Test Details

The smoke test (make smoke) validates:

Temporal UI reachability (http://localhost:8088)
Agent-Core gRPC health and ExecuteTask
Orchestrator SubmitTask + GetTaskStatus progression
LLM service health/live/ready endpoints
Qdrant readiness and required collections
PostgreSQL connectivity and migrations
Prometheus metrics endpoints (:2112, :2113)

Manual Service Verification

# Agent-Core health check
grpcurl -plaintext localhost:50051 shannon.agent.AgentService/HealthCheck

# Orchestrator submit task
grpcurl -plaintext \
  -d '{"metadata":{"user_id":"dev","session_id":"s1"},"query":"Say hello"}' \
  localhost:50052 shannon.orchestrator.OrchestratorService/SubmitTask

# LLM service health
curl -fsS http://localhost:8000/health
curl -fsS http://localhost:8000/health/ready

# Metrics endpoints
curl http://localhost:2112/metrics | head  # Orchestrator
curl http://localhost:2113/metrics | head  # Agent Core

Test File Locations

Go Tests

go/orchestrator/internal/workflows/*_test.go - Workflow tests
go/orchestrator/internal/activities/*_test.go - Activity tests
go/orchestrator/internal/session/manager_test.go - Session management
go/orchestrator/tests/replay/workflow_replay_test.go - Replay validation

Rust Tests

rust/agent-core/src/memory.rs - TTL, limits, LRU eviction
rust/agent-core/src/wasi_sandbox.rs - Path validation, sandboxed FS
rust/agent-core/src/tools.rs - Tool executor tests

Python Tests

python/llm-service/tests/test_manager.py - Provider routing, cache, fallback
python/llm-service/tests/test_ratelimiter.py - Rate limiting
python/llm-service/tests/test_tools.py - Tool execution

CI Pipeline

GitHub Actions runs:

Rust tests with clippy linting
Go build and tests with race detection
Python linting (ruff) and pytest
Temporal workflow replay tests
Coverage reporting (informational)

Coverage Requirements & Setup

Coverage Thresholds

Go: Minimum 50% coverage (current: ~57%)
Python: Baseline 20% coverage (current: ~15%, target: 70%)
Rust: Informational only (no minimum yet)

Running Coverage Tests

# Run all coverage gates
make coverage-gate

# Individual language coverage
make coverage-go       # Go coverage with threshold check
make coverage-python   # Python coverage with venv setup

# For Rust coverage, use cargo tarpaulin directly:
cd rust/agent-core && cargo tarpaulin --out Html

# Generate detailed reports
cd go/orchestrator && go test -coverprofile=coverage.out -covermode=atomic ./...
cd python/llm-service && pytest --cov=. --cov-report=html

Well-Covered Modules

Go Budget Manager: 81%+ coverage
Go Circuit Breaker: 61%+ coverage
Python Rate Limiter: Good timing test coverage
Rust Memory Manager: LRU eviction and TTL tests

Coverage Goals

Current focus areas:

Gateway enforcement: timeouts, rate limits, circuit breakers
Budget manager: edge cases, DB operations, token tracking
LLM routing: provider fallback, rate limiting, caching
WASI sandbox: path traversal protection, resource limits
Pattern workflows: CoT, ToT, ReAct, Debate, Reflection

Troubleshooting

Common Issues

Orchestrator cannot connect to Temporal

Verify TEMPORAL_HOST=temporal:7233 in environment
Check Temporal worker is started: docker compose logs temporal

PostgreSQL migration failures

docker compose down -v  # Clear volumes
make dev                # Start fresh

LLM service not ready

Provider API keys are optional in dev mode
Service may show ready=false without keys but still functions

Port conflicts Ensure these ports are free:

50051 (Agent-Core gRPC)
50052 (Orchestrator gRPC)
8000 (LLM Service HTTP)
8088 (Temporal UI)
5432 (PostgreSQL)
6379 (Redis)

Debugging Commands

# View all logs
make logs

# View specific service logs
docker compose logs -f orchestrator
docker compose logs -f agent-core
docker compose logs -f llm-service

# Check database state
docker compose exec postgres psql -U shannon -d shannon \
  -c "SELECT workflow_id, status FROM tasks ORDER BY created_at DESC LIMIT 5;"

# Check Redis sessions
docker compose exec redis redis-cli KEYS "session:*"

# List Temporal workflows
docker compose exec temporal temporal workflow list --address temporal:7233

Performance Testing

# Load test with concurrent requests
for i in {1..10}; do
  ./scripts/submit_task.sh "Test query $i" &
done
wait

# Monitor metrics during load
watch -n 1 'curl -s http://localhost:2112/metrics | grep polyagent_'

Cleanup

# Stop services
make down

# Clean all data (including volumes)
make clean

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PolyAgent Testing Guide

Test Types

Unit Tests

Integration Tests

Quick Start

Detailed Test Commands

Unit Testing by Language

Temporal Workflow Testing

Smoke Test Details

Manual Service Verification

Test File Locations

Go Tests

Rust Tests

Python Tests

CI Pipeline

Coverage Requirements & Setup

Coverage Thresholds

Running Coverage Tests

Well-Covered Modules

Coverage Goals

Troubleshooting

Common Issues

Debugging Commands

Performance Testing

Cleanup

FilesExpand file tree

testing.md

Latest commit

History

testing.md

File metadata and controls

PolyAgent Testing Guide

Test Types

Unit Tests

Integration Tests

Quick Start

Detailed Test Commands

Unit Testing by Language

Temporal Workflow Testing

Smoke Test Details

Manual Service Verification

Test File Locations

Go Tests

Rust Tests

Python Tests

CI Pipeline

Coverage Requirements & Setup

Coverage Thresholds

Running Coverage Tests

Well-Covered Modules

Coverage Goals

Troubleshooting

Common Issues

Debugging Commands

Performance Testing

Cleanup