# Performance Testing Guide

## Overview

This directory contains performance testing tools and scripts for IronSys.

## Test Types

### 1. Unit Benchmarks (pytest-benchmark)

Python unit-level performance tests.

**Location:** `python/tests/benchmarks/`

**Run:**

```bash
cd python
pytest tests/benchmarks/ -v --benchmark-only
```

**Tests:**

- `test_cache_performance.py` - Cache operations (get, set, SWR, parallel)
- `test_rate_limiter_performance.py` - Rate limiter throughput

**Performance Targets:**

- Cache operations: > 500 ops/sec
- Parallel cache reads: > 2,000 ops/sec
- Rate limiter checks: > 20,000 ops/sec
- Parallel rate limiting: > 50,000 ops/sec
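Throughput numbers like these can be sanity-checked with a plain stdlib timing loop. The sketch below is not part of the real suite: it uses an in-memory dict as a stand-in for the cache client, and `measure_ops_per_sec` is a helper invented for this example.

```python
import time

def measure_ops_per_sec(fn, iterations=10_000):
    """Time `fn` over `iterations` calls and return throughput in ops/sec."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    elapsed = time.perf_counter() - start
    return iterations / elapsed

# In-memory stand-in for the cache under test; the real suite exercises
# the IronSys cache client, this only illustrates the measurement.
store = {}

set_rate = measure_ops_per_sec(lambda: store.__setitem__("slot:1", b"payload"))
get_rate = measure_ops_per_sec(lambda: store.get("slot:1"))

print(f"set: {set_rate:,.0f} ops/sec, get: {get_rate:,.0f} ops/sec")
```

pytest-benchmark does the same thing with calibrated rounds and statistics; a raw loop like this is only good for a quick order-of-magnitude check.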

### 2. Load Testing (k6)

Full-system load testing with realistic traffic patterns.

**Location:** `scripts/performance/load-test.js`

**Install k6:**

```bash
# macOS
brew install k6

# Debian/Ubuntu (k6 is not in the default repositories; add Grafana's
# k6 apt repository first -- see the k6 installation docs)
sudo apt-get install k6

# Or use Docker
docker pull grafana/k6
```

**Run:**

```bash
# Local testing
API_BASE_URL=http://localhost:8000 k6 run scripts/performance/load-test.js

# Production-like load
API_BASE_URL=https://api.ironsys.example.com \
  TEST_DURATION=10m \
  TARGET_RPS=5000 \
  k6 run scripts/performance/load-test.js

# Using Docker
docker run --rm -i grafana/k6 run - < scripts/performance/load-test.js
```

**Test Stages:**

1. Ramp up (2 min): 0 → 50% target RPS
2. Steady state (5 min): maintain target RPS
3. Peak load (5 min): 150% target RPS
4. Spike test (1.5 min): ramp to 300% target RPS, hold for 30s, recover
5. Ramp down (2 min): back to 0

**Scenarios:**

- 80% GET /slots (read operations)
- 20% POST /reserve (write operations)

**Performance Thresholds:**

- P95 latency < 500ms
- P99 latency < 1000ms
- Error rate < 1%
- Cache hit rate > 70%

**Metrics:**

- HTTP request duration (P50, P95, P99, max)
- Error rate
- Throughput (req/s)
- Cache hit rate
- Reservation latency
- Slot read latency
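The thresholds above can also be checked after a run by exporting a machine-readable summary with `k6 run --summary-export=summary.json`. A minimal sketch, assuming k6's summary-export layout (`http_req_duration` percentiles in milliseconds, `http_req_failed` ratio under `value`); `check_summary` is a helper defined here, not part of k6:

```python
import json

# Thresholds from this guide: P95 < 500 ms, P99 < 1000 ms, error rate < 1%.
THRESHOLDS = {"p(95)": 500.0, "p(99)": 1000.0}
MAX_ERROR_RATE = 0.01

def check_summary(summary: dict) -> list[str]:
    """Return a list of threshold violations from a k6 summary-export dict."""
    failures = []
    duration = summary["metrics"]["http_req_duration"]
    for pct, limit in THRESHOLDS.items():
        if duration[pct] >= limit:
            failures.append(f"http_req_duration {pct} = {duration[pct]}ms (limit {limit}ms)")
    error_rate = summary["metrics"]["http_req_failed"]["value"]
    if error_rate >= MAX_ERROR_RATE:
        failures.append(f"error rate = {error_rate:.2%} (limit {MAX_ERROR_RATE:.0%})")
    return failures

# Example: a run that meets every threshold.
sample = {
    "metrics": {
        "http_req_duration": {"p(95)": 342.0, "p(99)": 567.0},
        "http_req_failed": {"value": 0.0002},
    }
}
print(check_summary(sample))  # []
```

Note that `p(99)` only appears in the export if it is included in the script's `summaryTrendStats`.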

### 3. Stress Testing (bash + ab/wrk)

System behavior under extreme conditions.

**Location:** `scripts/performance/stress-test.sh`

**Prerequisites:**

```bash
# Install Apache Bench (part of apache2-utils)
sudo apt-get install apache2-utils

# Or install wrk (recommended)
sudo apt-get install wrk

# Or on macOS
brew install wrk
```

**Run:**

```bash
# Basic stress test
./scripts/performance/stress-test.sh

# Custom configuration
API_BASE_URL=http://localhost:8000 \
  CONCURRENT_USERS=2000 \
  DURATION=600 \
  ./scripts/performance/stress-test.sh
```

**Tests:**

1. Basic stress: high-concurrency reads and writes
2. Memory stress: many unique requests to fill the cache
3. Connection pool stress: concurrent long-running requests
4. Rate limiter stress: rapid requests that trigger rate limiting

**Environment Variables:**

- `API_BASE_URL`: API endpoint (default: `http://localhost:8000`)
- `DURATION`: Test duration in seconds (default: 300)
- `CONCURRENT_USERS`: Concurrent connections (default: 1000)
- `REQUESTS_PER_USER`: Requests per user for ab (default: 100)

## Performance Baselines

### API Performance

| Metric | Target | Acceptable | Critical |
|--------|--------|------------|----------|
| P95 Latency | < 200ms | < 500ms | < 1000ms |
| P99 Latency | < 500ms | < 1000ms | < 2000ms |
| Error Rate | < 0.01% | < 0.1% | < 1% |
| Throughput | > 5000 rps | > 2000 rps | > 500 rps |

### Cache Performance

| Metric | Target | Acceptable | Critical |
|--------|--------|------------|----------|
| Hit Rate | > 80% | > 70% | > 50% |
| Get Latency | < 2ms | < 5ms | < 10ms |
| Set Latency | < 2ms | < 5ms | < 10ms |

### Database Performance

| Metric | Target | Acceptable | Critical |
|--------|--------|------------|----------|
| Query Time | < 10ms | < 50ms | < 100ms |
| Connection Pool | < 50% | < 80% | < 95% |
| Active Connections | < 10 | < 15 | < 20 |

### Kafka Performance

| Metric | Target | Acceptable | Critical |
|--------|--------|------------|----------|
| Consumer Lag | < 100 | < 1000 | < 10000 |
| Publish Latency | < 10ms | < 50ms | < 100ms |
| Error Rate | < 0.01% | < 0.1% | < 1% |

## Running Performance Tests in CI/CD

### GitHub Actions

See `.github/workflows/ci.yml` for automated performance testing:

```yaml
performance-test:
  runs-on: ubuntu-latest
  steps:
    - name: Run benchmark tests
      run: pytest tests/benchmarks/ --benchmark-only

    - name: Run load tests
      run: |
        docker-compose up -d
        sleep 10
        k6 run --vus 100 --duration 60s scripts/performance/load-test.js
```

## Monitoring During Tests

### Prometheus Metrics

Monitor these metrics during load tests:

```promql
# Request rate
rate(ironsys_requests_total[1m])

# P95 latency
histogram_quantile(0.95, rate(ironsys_request_duration_seconds_bucket[1m]))

# Error rate
rate(ironsys_requests_total{status=~"5.."}[1m]) /
rate(ironsys_requests_total[1m])

# Cache hit rate
rate(ironsys_cache_hits_total{type="fresh"}[1m]) /
rate(ironsys_cache_hits_total[1m])

# Consumer lag
kafka_consumer_lag

# Circuit breaker state
ironsys_circuit_breaker_state
```
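These expressions can also be pulled during a run via Prometheus's HTTP API (`/api/v1/query`). A small stdlib sketch; the `localhost:9090` address is an assumption about where Prometheus is exposed:

```python
import json
import urllib.parse
import urllib.request

PROM_URL = "http://localhost:9090"  # assumed local Prometheus; adjust as needed

def instant_query_url(promql: str, base: str = PROM_URL) -> str:
    """Build a Prometheus HTTP API instant-query URL for a PromQL expression."""
    return f"{base}/api/v1/query?" + urllib.parse.urlencode({"query": promql})

def instant_query(promql: str) -> dict:
    """Run an instant query and return the decoded JSON body."""
    with urllib.request.urlopen(instant_query_url(promql)) as resp:
        return json.load(resp)

# Example: the request-rate expression from above.
url = instant_query_url("rate(ironsys_requests_total[1m])")
```

Polling a few of these in a loop while k6 runs gives a crude live view when Grafana is not available.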

### Grafana Dashboard

Import the performance testing dashboard:

```bash
kubectl apply -f infra/grafana/dashboards/ironsys-overview.json
```

## Interpreting Results

### Good Performance

```text
Requests/sec:    5234.56
Transfer/sec:    2.34MB
Latency Distribution:
  50%: 123ms
  75%: 187ms
  90%: 256ms
  95%: 342ms
  99%: 567ms
Error Rate: 0.02%
```

### Degraded Performance

```text
Requests/sec:    1234.56
Transfer/sec:    567KB
Latency Distribution:
  50%: 456ms
  75%: 789ms
  90%: 1.2s
  95%: 2.3s
  99%: 4.5s
Error Rate: 2.5%
```

**Actions:**

1. Check resource utilization (CPU, memory)
2. Check database query performance
3. Check cache hit rate
4. Check circuit breaker states
5. Review application logs

### System Under Stress

```text
Requests/sec:    234.56
Transfer/sec:    123KB
Failed requests: 1234 (12.3%)
Error Rate: 12.3%
```

**Actions:**

1. Immediate: scale up resources
2. Check for resource exhaustion (memory, connections)
3. Check for cascading failures
4. Review circuit breaker states
5. Check Kafka consumer lag

## Performance Optimization Tips

### 1. Cache Optimization

```python
# Use SWR for better availability
cached_data, is_stale = await cache.get_with_swr(key)

# Batch cache operations with a pipeline
async with cache.client.pipeline() as pipe:
    pipe.get(key1)
    pipe.get(key2)
    results = await pipe.execute()
```
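For context on what `get_with_swr` returns, here is a minimal in-process sketch of stale-while-revalidate semantics. This is an illustration only: the real client is Redis-backed, and the `SWRCache` class and its TTL values are invented for this example.

```python
import time

class SWRCache:
    """Minimal in-process sketch of stale-while-revalidate semantics.

    Entries are fresh for `ttl` seconds, then served stale for another
    `stale_ttl` seconds (with is_stale=True) so callers can refresh in
    the background instead of blocking on a miss.
    """

    def __init__(self, ttl=30.0, stale_ttl=300.0):
        self.ttl = ttl
        self.stale_ttl = stale_ttl
        self._store = {}  # key -> (value, stored_at)

    def set(self, key, value):
        self._store[key] = (value, time.monotonic())

    def get_with_swr(self, key):
        """Return (value, is_stale); (None, False) on a hard miss."""
        entry = self._store.get(key)
        if entry is None:
            return None, False
        value, stored_at = entry
        age = time.monotonic() - stored_at
        if age < self.ttl:
            return value, False          # fresh hit
        if age < self.ttl + self.stale_ttl:
            return value, True           # stale hit: serve now, refresh async
        del self._store[key]             # fully expired
        return None, False

cache = SWRCache(ttl=30.0)
cache.set("slots:today", [{"id": 1}])
value, is_stale = cache.get_with_swr("slots:today")
```

The availability win is the stale window: during a backend outage shorter than `stale_ttl`, reads keep succeeding with slightly old data instead of erroring.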

### 2. Database Optimization

```python
# Use connection pooling
DB_POOL_SIZE = 20
DB_MAX_OVERFLOW = 10

# Use prepared statements
await conn.fetchrow("SELECT * FROM slots WHERE id = $1", slot_id)

# Batch operations inside a single transaction
async with conn.transaction():
    await conn.executemany("INSERT INTO ...", data)
```
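The win from batching inside one transaction is easy to demonstrate. The sketch below uses stdlib `sqlite3` so it runs anywhere (the real path is asyncpg against Postgres), and the `slots` schema here is hypothetical:

```python
import sqlite3

# sqlite3 stands in for Postgres so this sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE slots (id INTEGER PRIMARY KEY, capacity INTEGER)")

rows = [(i, 10) for i in range(1, 1001)]

# One transaction + executemany: a single commit for the whole batch,
# instead of per-row commit overhead.
with conn:
    conn.executemany("INSERT INTO slots (id, capacity) VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM slots").fetchone()[0]
print(count)  # 1000
```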

### 3. Kafka Optimization

```python
# Batch messages (illustrative producer API)
producer.send_batch(messages, partition_key=slot_id)

# Tune consumer settings
KAFKA_MAX_POLL_RECORDS = 100
KAFKA_MAX_POLL_INTERVAL_MS = 300000
```

### 4. API Optimization

```python
# Use async/await
async def get_slot(slot_id: UUID):
    return await db.fetchrow("SELECT * FROM slots WHERE id = $1", slot_id)

# Enable compression
app.add_middleware(GZipMiddleware, minimum_size=1000)

# Use multiple worker processes (uvicorn requires an import string when workers > 1)
uvicorn.run("main:app", workers=4)
```
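The `minimum_size=1000` cutoff above exists because gzip carries fixed header overhead: very small bodies come out larger after compression. A quick stdlib check, with an invented payload shape:

```python
import gzip
import json

# A large, repetitive JSON body compresses well...
large = json.dumps(
    [{"slot_id": i, "status": "available"} for i in range(500)]
).encode()
small = b'{"ok": true}'

large_gz = gzip.compress(large)
small_gz = gzip.compress(small)

print(len(large), len(large_gz))  # compression wins on the big payload
print(len(small), len(small_gz))  # the tiny one gets larger
```

So compressing everything would waste CPU and bytes on small responses; the size floor skips them.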

## Troubleshooting

### High Latency

1. Check database query performance
2. Check cache hit rate
3. Check network latency
4. Check resource utilization

### High Error Rate

1. Check circuit breaker states
2. Check rate limiter configuration
3. Check database connections
4. Check Kafka connectivity

### Low Throughput

1. Check worker concurrency
2. Check connection pool size
3. Check resource limits (CPU, memory)
4. Check network bandwidth

## References