# Performance Testing

This directory contains performance testing tools and scripts for IronSys.
## Benchmark Tests

Python unit-level performance tests.

Location: `python/tests/benchmarks/`

Run:

```bash
cd python
pytest tests/benchmarks/ -v --benchmark-only
```

Tests:

- `test_cache_performance.py` - Cache operations (get, set, SWR, parallel)
- `test_rate_limiter_performance.py` - Rate limiter throughput

Performance Targets:

- Cache operations: > 500 ops/sec
- Parallel cache reads: > 2,000 ops/sec
- Rate limiter checks: > 20,000 ops/sec
- Parallel rate limiting: > 50,000 ops/sec
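Throughput targets like these can be sanity-checked outside pytest-benchmark with a plain timing loop. A minimal sketch, using an in-memory dict as a stand-in for the real cache client (`FakeCache` and `ops_per_sec` are illustrative helpers, not part of the codebase):

```python
import time

class FakeCache:
    """Illustrative in-memory stand-in for the real cache client."""
    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)

def ops_per_sec(fn, iterations=10_000):
    """Time `fn` over `iterations` calls and return throughput in ops/sec."""
    start = time.perf_counter()
    for i in range(iterations):
        fn(i)
    elapsed = time.perf_counter() - start
    return iterations / elapsed

cache = FakeCache()
set_rate = ops_per_sec(lambda i: cache.set(f"slot:{i}", i))
get_rate = ops_per_sec(lambda i: cache.get(f"slot:{i}"))

# The in-memory stand-in clears the 500 ops/sec target trivially; the
# real benchmarks exercise the actual cache client over the network.
assert set_rate > 500
assert get_rate > 500
```

The real benchmarks measure the same thing but through pytest-benchmark fixtures, which add warmup rounds and statistical aggregation.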
## Load Testing (k6)

Full system load testing with realistic traffic patterns.

Location: `scripts/performance/load-test.js`

Install k6:

```bash
# macOS
brew install k6

# Linux
sudo apt-get install k6

# Or use Docker
docker pull grafana/k6
```

Run:

```bash
# Local testing
API_BASE_URL=http://localhost:8000 k6 run scripts/performance/load-test.js

# Production-like load
API_BASE_URL=https://api.ironsys.example.com \
  TEST_DURATION=10m \
  TARGET_RPS=5000 \
  k6 run scripts/performance/load-test.js

# Using Docker
docker run --rm -i grafana/k6 run - < scripts/performance/load-test.js
```

Test Stages:

- Ramp up (2 min): 0 → 50% target RPS
- Steady state (5 min): Maintain target RPS
- Peak load (5 min): 150% target RPS
- Spike test (1.5 min): 300% target RPS for 30s
- Ramp down (2 min): Back to 0
Scenarios:
- 80% GET /slots (read operations)
- 20% POST /reserve (write operations)
Performance Thresholds:
- P95 latency < 500ms
- P99 latency < 1000ms
- Error rate < 1%
- Cache hit rate > 70%
Metrics:
- HTTP request duration (P50, P95, P99, max)
- Error rate
- Throughput (req/s)
- Cache hit rate
- Reservation latency
- Slot read latency
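The latency percentiles in the metrics above can be computed from raw samples with a simple nearest-rank lookup. A minimal sketch (the sample latencies are made up for illustration):

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank method: rank pct/100 * n, clamped to valid indices.
    rank = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[rank]

latencies_ms = [120, 130, 150, 180, 210, 250, 340, 400, 520, 990]
print("P50:", percentile(latencies_ms, 50))
print("P95:", percentile(latencies_ms, 95))
print("P99:", percentile(latencies_ms, 99))
print("max:", max(latencies_ms))
```

k6 computes these same summary statistics automatically; a helper like this is mainly useful when post-processing raw per-request logs.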
## Stress Testing

System behavior under extreme conditions.

Location: `scripts/performance/stress-test.sh`

Prerequisites:

```bash
# Install Apache Bench (comes with Apache)
sudo apt-get install apache2-utils

# Or install wrk (recommended)
sudo apt-get install wrk

# Or on macOS
brew install wrk
```

Run:

```bash
# Basic stress test
./scripts/performance/stress-test.sh

# Custom configuration
API_BASE_URL=http://localhost:8000 \
  CONCURRENT_USERS=2000 \
  DURATION=600 \
  ./scripts/performance/stress-test.sh
```

Tests:

- Basic stress: High concurrency reads and writes
- Memory stress: Many unique requests to fill cache
- Connection pool stress: Concurrent long-running requests
- Rate limiter stress: Rapid requests to trigger rate limiting

Environment Variables:

- `API_BASE_URL`: API endpoint (default: `http://localhost:8000`)
- `DURATION`: Test duration in seconds (default: 300)
- `CONCURRENT_USERS`: Concurrent connections (default: 1000)
- `REQUESTS_PER_USER`: Requests per user for ab (default: 100)
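The rate-limiter stress case above — rapid requests until the limiter starts rejecting — can be reproduced in miniature with a token bucket. A minimal sketch; the bucket parameters are illustrative, not IronSys's actual limiter configuration:

```python
import time

class TokenBucket:
    """Simple token-bucket limiter: refills `rate` tokens/sec, burst `capacity`."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

limiter = TokenBucket(rate=100, capacity=10)
# Fire a rapid burst: roughly the first `capacity` requests pass,
# the rest are rejected until the bucket refills.
results = [limiter.allow() for _ in range(50)]
print(f"allowed={sum(results)} rejected={len(results) - sum(results)}")
```

The stress script drives the real limiter the same way, just over HTTP and at much higher concurrency.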
## Performance Targets

### API

| Metric | Target | Acceptable | Critical |
|---|---|---|---|
| P95 Latency | < 200ms | < 500ms | < 1000ms |
| P99 Latency | < 500ms | < 1000ms | < 2000ms |
| Error Rate | < 0.01% | < 0.1% | < 1% |
| Throughput | > 5000 rps | > 2000 rps | > 500 rps |

### Cache

| Metric | Target | Acceptable | Critical |
|---|---|---|---|
| Hit Rate | > 80% | > 70% | > 50% |
| Get Latency | < 2ms | < 5ms | < 10ms |
| Set Latency | < 2ms | < 5ms | < 10ms |

### Database

| Metric | Target | Acceptable | Critical |
|---|---|---|---|
| Query Time | < 10ms | < 50ms | < 100ms |
| Connection Pool | < 50% | < 80% | < 95% |
| Active Connections | < 10 | < 15 | < 20 |

### Kafka

| Metric | Target | Acceptable | Critical |
|---|---|---|---|
| Consumer Lag | < 100 | < 1000 | < 10000 |
| Publish Latency | < 10ms | < 50ms | < 100ms |
| Error Rate | < 0.01% | < 0.1% | < 1% |
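The tables above can be turned into an automated check. A minimal sketch of a threshold classifier; the function name is illustrative, and `higher_is_better` handles throughput- and hit-rate-style metrics where larger values are good:

```python
def classify(value, target, acceptable, critical, higher_is_better=False):
    """Grade a measured value against target/acceptable/critical thresholds."""
    if higher_is_better:
        if value > target:
            return "target"
        if value > acceptable:
            return "acceptable"
        if value > critical:
            return "critical"
        return "failing"
    if value < target:
        return "target"
    if value < acceptable:
        return "acceptable"
    if value < critical:
        return "critical"
    return "failing"

# Thresholds taken from the API table (latency in ms, throughput in rps).
print(classify(342, 200, 500, 1000))                           # measured P95 latency
print(classify(5234, 5000, 2000, 500, higher_is_better=True))  # measured throughput
```

Wiring this into CI lets a load-test run fail the build when any metric lands in the "failing" band.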
## CI Integration

See `.github/workflows/ci.yml` for automated performance testing:

```yaml
performance-test:
  runs-on: ubuntu-latest
  steps:
    - name: Run benchmark tests
      run: pytest tests/benchmarks/ --benchmark-only
    - name: Run load tests
      run: |
        docker-compose up -d
        sleep 10
        k6 run --vus 100 --duration 60s scripts/performance/load-test.js
```

## Monitoring

Monitor these metrics during load tests:
```promql
# Request rate
rate(ironsys_requests_total[1m])

# P95 latency
histogram_quantile(0.95, rate(ironsys_request_duration_seconds_bucket[1m]))

# Error rate
rate(ironsys_requests_total{status=~"5.."}[1m]) /
rate(ironsys_requests_total[1m])

# Cache hit rate
rate(ironsys_cache_hits_total{type="fresh"}[1m]) /
rate(ironsys_cache_hits_total[1m])

# Consumer lag
kafka_consumer_lag

# Circuit breaker state
ironsys_circuit_breaker_state
```
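`histogram_quantile` estimates a quantile by linear interpolation inside the bucket where the cumulative count crosses the target rank. A minimal Python sketch of that estimation (the bucket data is made up for illustration):

```python
def histogram_quantile(q, buckets):
    """Estimate quantile q from Prometheus-style cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted
    by upper bound, like the `le` series of a histogram metric.
    """
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            # Linear interpolation within the bucket, as Prometheus does.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# Cumulative request-duration bucket counts (seconds); made-up data.
buckets = [(0.1, 400), (0.25, 700), (0.5, 900), (1.0, 980), (2.5, 1000)]
print("P95 ≈", histogram_quantile(0.95, buckets))
```

This is a simplification: real Prometheus additionally handles the `+Inf` bucket and empty buckets specially, but the interpolation logic is the same, which is why bucket boundaries (not sample counts) bound the accuracy of P95/P99 readings.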
Import the performance testing dashboard:

```bash
kubectl apply -f infra/grafana/dashboards/ironsys-overview.json
```

## Interpreting Results

Healthy system:

```
Requests/sec: 5234.56
Transfer/sec: 2.34MB
Latency Distribution:
  50%: 123ms
  75%: 187ms
  90%: 256ms
  95%: 342ms
  99%: 567ms
Error Rate: 0.02%
```
Degraded performance:

```
Requests/sec: 1234.56
Transfer/sec: 567KB
Latency Distribution:
  50%: 456ms
  75%: 789ms
  90%: 1.2s
  95%: 2.3s
  99%: 4.5s
Error Rate: 2.5%
```
Actions:
- Check resource utilization (CPU, memory)
- Check database query performance
- Check cache hit rate
- Check circuit breaker states
- Review application logs
System failure:

```
Requests/sec: 234.56
Transfer/sec: 123KB
Failed requests: 1234 (12.3%)
Error Rate: 12.3%
```
Actions:
- Immediate: Scale up resources
- Check for resource exhaustion (memory, connections)
- Check for cascading failures
- Review circuit breaker states
- Check Kafka consumer lag
## Optimization Tips

Cache:

```python
# Use SWR for better availability
cached_data, is_stale = await cache.get_with_swr(key)

# Batch cache operations
async with cache.client.pipeline() as pipe:
    pipe.get(key1)
    pipe.get(key2)
    results = await pipe.execute()
```

Database:

```python
# Use connection pooling
DB_POOL_SIZE = 20
DB_MAX_OVERFLOW = 10

# Use prepared statements
await conn.fetchrow("SELECT * FROM slots WHERE id = $1", slot_id)

# Batch operations
async with conn.transaction():
    await conn.executemany("INSERT INTO ...", data)
```

Kafka:

```python
# Batch messages
producer.send_batch(messages, partition_key=slot_id)

# Tune consumer settings
KAFKA_MAX_POLL_RECORDS = 100
KAFKA_MAX_POLL_INTERVAL_MS = 300000
```

API:

```python
# Use async/await
async def get_slot(slot_id: UUID):
    return await db.fetchrow("SELECT * FROM slots WHERE id = $1", slot_id)

# Enable compression
app.add_middleware(GZipMiddleware, minimum_size=1000)

# Run multiple workers
uvicorn.run(app, workers=4)
```

## Troubleshooting

High latency:

- Check database query performance
- Check cache hit rate
- Check network latency
- Check resource utilization
High error rate:

- Check circuit breaker states
- Check rate limiter configuration
- Check database connections
- Check Kafka connectivity

Low throughput:

- Check worker concurrency
- Check connection pool size
- Check resource limits (CPU, memory)
- Check network bandwidth