Testing Guide

This document describes the testing infrastructure, test organization, and how to run tests for the Code Interpreter API.

Test Organization

Tests are organized into two main categories:

tests/
├── conftest.py              # Shared fixtures for all tests
├── unit/                    # Unit tests (no external dependencies)
│   ├── test_id_generator.py
│   ├── test_minio_config.py
│   ├── test_output_processor.py
│   ├── test_session_service.py
│   └── test_state_service.py
├── integration/             # Integration tests (require Docker, Redis, MinIO)
│   ├── test_api_contracts.py
│   ├── test_auth_integration.py
│   ├── test_container_behavior.py
│   ├── test_exec_api.py
│   ├── test_file_api.py
│   ├── test_file_handling.py
│   ├── test_librechat_compat.py
│   ├── test_security_integration.py
│   ├── test_session_behavior.py
│   └── test_state_api.py
└── snapshots/               # Snapshot data for tests

Unit Tests (tests/unit/)

Unit tests validate individual components in isolation:

  • Mock external dependencies (Kubernetes, Redis, MinIO)
  • Fast execution (~seconds)
  • No infrastructure required
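
A minimal unit test imports the component under test directly and makes assertions without touching any infrastructure. The sketch below is only illustrative; the import path and function name are assumptions, not the actual service API:

from src.services.id_generator import generate_session_id  # hypothetical path and name

def test_session_ids_are_unique():
    """Generated IDs should be non-empty and unique across calls."""
    ids = {generate_session_id() for _ in range(100)}
    assert len(ids) == 100
    assert all(ids)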

Integration Tests (tests/integration/)

Integration tests validate end-to-end behavior:

  • Require running Kubernetes (or kind/k3s), Redis, MinIO
  • Test actual API endpoints
  • Validate LibreChat compatibility
  • Test pod behavior and cleanup

Running Tests

Prerequisites

Before running tests, ensure:

  1. Dependencies installed:

    just install
  2. For integration tests, infrastructure running:

    just docker-up
    # Or: docker-compose up -d

Running All Tests

# Run all tests
just test

# With coverage report
just test-cov

Running Unit Tests Only

# Run all unit tests
just test-unit

# Run a specific test file
just test-file tests/unit/test_execution_service.py

# Run a specific test function
just test-file tests/unit/test_execution_service.py::test_execute_python_code

Running Integration Tests Only

# Run all integration tests
just test-integration

# Run specific integration test files
just test-file tests/integration/test_api_contracts.py

Key Test Files

API Contract Tests

File: tests/integration/test_api_contracts.py

Validates API request/response formats match expectations:

  • ExecRequest validation
  • ExecResponse structure
  • Error response formats
  • HTTP status codes
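
A typical contract test sends a deliberately invalid request and asserts on the status code and error shape. The sketch below assumes a FastAPI-style 422 validation error; adjust the expected status and body to the actual error contract:

import pytest

@pytest.mark.asyncio
async def test_exec_rejects_missing_code(api_client):
    # "code" is omitted on purpose to trigger request validation.
    response = await api_client.post("/exec", json={
        "lang": "py",
        "entity_id": "test",
        "user_id": "test"
    })
    # 422 and "detail" are assumptions based on FastAPI defaults.
    assert response.status_code == 422
    assert "detail" in response.json()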

LibreChat Compatibility Tests

File: tests/integration/test_librechat_compat.py

Ensures compatibility with LibreChat's Code Interpreter API:

  • File upload format (multipart/form-data)
  • Session ID handling
  • File reference format
  • Response structure matching LibreChat expectations
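
As a rough illustration of the multipart format, a compatibility test can upload a file the way LibreChat does. The endpoint path and form field names below are assumptions; match them to the real routes:

import pytest

@pytest.mark.asyncio
async def test_file_upload_multipart(api_client):
    # Hypothetical endpoint and field names; adapt to the actual upload API.
    files = {"files": ("data.csv", b"a,b\n1,2\n", "text/csv")}
    response = await api_client.post(
        "/upload",
        data={"entity_id": "test"},
        files=files,
    )
    assert response.status_code == 200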

Pod Behavior Tests

File: tests/integration/test_container_behavior.py

Tests pod lifecycle and execution:

  • Pod creation and cleanup
  • Resource limit enforcement
  • Timeout handling
  • Output capture
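
For example, a timeout test can submit code that sleeps past the execution limit and assert that the API reports a timeout rather than hanging. The sleep duration and expected status codes below are assumptions; align them with the configured limits:

import pytest

@pytest.mark.asyncio
async def test_execution_timeout(api_client):
    # Sleep longer than the assumed execution timeout.
    response = await api_client.post("/exec", json={
        "lang": "py",
        "code": "import time; time.sleep(120)",
        "entity_id": "test",
        "user_id": "test"
    })
    # Whether the API returns an error status or a 200 with a timeout message
    # is implementation-specific; assert on whichever it actually does.
    assert response.status_code in (200, 408, 504)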

Session State Tests

File: tests/integration/test_session_state.py

Tests Python state persistence:

  • Variable persistence across executions
  • Function persistence
  • NumPy/Pandas object persistence
  • State size limits
  • Session isolation

File Handling Tests

File: tests/integration/test_file_handling.py

Tests file operations:

  • File upload
  • File download
  • File listing
  • File deletion
  • File naming edge cases

Writing Tests

Using Fixtures

Common fixtures are defined in tests/conftest.py:

import pytest

@pytest.fixture
def api_client():
    """HTTP client configured for API testing."""
    import httpx
    return httpx.AsyncClient(
        base_url="https://localhost",
        headers={"x-api-key": "test-api-key-for-development-only"},
        verify=False
    )

@pytest.fixture
def sample_python_code():
    """Sample Python code for testing."""
    return "print('Hello, World!')"
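
Because httpx.AsyncClient keeps a connection pool open, an async fixture that closes the client after each test is a useful variant. This is a sketch, not the current conftest.py; it assumes pytest-asyncio is installed (in strict mode, async fixtures need the pytest_asyncio.fixture decorator):

import httpx
import pytest_asyncio

@pytest_asyncio.fixture
async def api_client():
    """HTTP client that is closed cleanly after each test."""
    async with httpx.AsyncClient(
        base_url="https://localhost",
        headers={"x-api-key": "test-api-key-for-development-only"},
        verify=False,
    ) as client:
        yield client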

Async Tests

Use pytest.mark.asyncio for async tests:

import pytest

@pytest.mark.asyncio
async def test_execute_python(api_client):
    response = await api_client.post("/exec", json={
        "lang": "py",
        "code": "print(1+1)",
        "entity_id": "test",
        "user_id": "test"
    })
    assert response.status_code == 200
    data = response.json()
    assert data["stdout"] == "2\n"
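
If every test in a module is async, a module-level pytestmark avoids repeating the decorator:

import pytest

# Applies the asyncio marker to all tests in this module,
# so individual @pytest.mark.asyncio decorators can be omitted.
pytestmark = pytest.mark.asyncio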

Mocking External Services

For unit tests, mock external dependencies:

import pytest
from unittest.mock import AsyncMock, patch

@pytest.mark.asyncio
async def test_execution_with_mocked_kubernetes():
    with patch("src.services.kubernetes.client.get_k8s_client") as mock_get_client:
        mock_pod = AsyncMock()
        # Configure the client returned by the patched getter, not the getter itself.
        mock_get_client.return_value.create_namespaced_pod.return_value = mock_pod

        # Test code here

Testing State Persistence

@pytest.mark.asyncio
async def test_state_persistence(api_client):
    # First execution - create variable
    response1 = await api_client.post("/exec", json={
        "lang": "py",
        "code": "x = 42",
        "entity_id": "test",
        "user_id": "test"
    })
    session_id = response1.json()["session_id"]

    # Second execution - use variable
    response2 = await api_client.post("/exec", json={
        "lang": "py",
        "code": "print(x)",
        "entity_id": "test",
        "user_id": "test",
        "session_id": session_id
    })
    assert response2.json()["stdout"] == "42\n"

Performance Testing

A dedicated performance testing script is available:

# Run performance tests
just perf-test

What Performance Tests Measure

  1. Simple execution latency - Basic print statement
  2. Complex execution latency - NumPy/Pandas operations
  3. Concurrent request handling - Multiple simultaneous requests
  4. State persistence overhead - Serialization/deserialization time
  5. File operation latency - Upload/download speeds
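
As a rough idea of how such numbers can be collected independently of the bundled script, the sketch below times repeated simple executions against /exec. It reuses the endpoint and API key from the examples above and may differ from what just perf-test actually runs:

import statistics
import time

import httpx

def measure_exec_latency(samples: int = 20) -> None:
    """Time repeated simple executions and print latency statistics in ms."""
    client = httpx.Client(
        base_url="https://localhost",
        headers={"x-api-key": "test-api-key-for-development-only"},
        verify=False,
    )
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        client.post("/exec", json={
            "lang": "py",
            "code": "print('ping')",
            "entity_id": "perf",
            "user_id": "perf",
        })
        timings.append((time.perf_counter() - start) * 1000)
    client.close()
    print(f"Mean: {statistics.mean(timings):.1f}ms")
    print(f"P50:  {statistics.median(timings):.1f}ms")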

Sample Output

=== Performance Test Results ===

Simple Python Execution:
  Mean: 32.5ms
  P50:  28.0ms
  P99:  85.0ms

Complex Python Execution:
  Mean: 125.0ms
  P50:  110.0ms
  P99:  250.0ms

Concurrent Requests (10x):
  Mean: 45.0ms
  Max:  180.0ms

Coverage Reports

Generate coverage reports:

# Generate HTML coverage report
just test-cov

# View report
open htmlcov/index.html

Coverage Targets

Component         Target   Current
src/api/          90%+     -
src/services/     85%+     -
src/middleware/   80%+     -
Overall           80%+     -

CI/CD Integration

For CI/CD pipelines, see .github/workflows/lint.yml for examples of using just commands in GitHub Actions.

GitHub Actions Example

- name: Install uv
  uses: astral-sh/setup-uv@v5

- name: Install just
  uses: taiki-e/install-action@just

- name: Run Tests
  run: just test-unit

Troubleshooting Tests

Integration Tests Failing

  1. Check infrastructure:

    kubectl get pods -n kubecoderun  # All pods should be "Running"
  2. Check API health:

    curl -sk https://localhost/health
  3. Check logs:

    kubectl logs -n kubecoderun deployment/kubecoderun

Async Test Issues

If async tests hang:

  • Ensure pytest-asyncio is installed
  • Check for unclosed async resources
  • Use @pytest.mark.asyncio decorator

Flaky Tests

For tests that occasionally fail:

  • Check for race conditions in pod cleanup
  • Ensure proper test isolation
  • Use explicit waits for async operations
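
One way to add an explicit wait is a small polling helper that retries a condition until a deadline instead of sleeping for a fixed time; the helper below is illustrative:

import asyncio
import time

async def wait_for(condition, timeout: float = 10.0, interval: float = 0.25) -> None:
    """Poll an async condition until it returns truthy or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if await condition():
            return
        await asyncio.sleep(interval)
    raise TimeoutError("condition was not met within the timeout")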

Related Documentation