Testing Guide

This document describes the testing infrastructure, test organization, and how to run tests for the Code Interpreter API.

Test Organization

Tests are organized into two main categories:

tests/
├── conftest.py              # Shared fixtures for all tests
├── unit/                    # Unit tests (no external dependencies)
│   ├── test_id_generator.py
│   ├── test_minio_config.py
│   ├── test_output_processor.py
│   ├── test_session_service.py
│   └── test_state_service.py
├── integration/             # Integration tests (require Docker, Redis, MinIO)
│   ├── test_api_contracts.py
│   ├── test_auth_integration.py
│   ├── test_container_behavior.py
│   ├── test_exec_api.py
│   ├── test_file_api.py
│   ├── test_file_handling.py
│   ├── test_librechat_compat.py
│   ├── test_security_integration.py
│   ├── test_session_behavior.py
│   └── test_state_api.py
└── snapshots/               # Snapshot data for tests

Unit Tests (tests/unit/)

Unit tests validate individual components in isolation:

  • Mock external dependencies (Kubernetes, Redis, MinIO)
  • Fast execution (~seconds)
  • No infrastructure required
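
A minimal unit test imports the component under test directly and makes assertions without touching any infrastructure. The sketch below is only illustrative; the import path and function name are assumptions, not the actual service API:

from src.services.id_generator import generate_session_id  # hypothetical path and name

def test_session_ids_are_unique():
    """Generated IDs should be non-empty and unique across calls."""
    ids = {generate_session_id() for _ in range(100)}
    assert len(ids) == 100
    assert all(ids)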

Integration Tests (tests/integration/)

Integration tests validate end-to-end behavior:

  • Require running Kubernetes (or kind/k3s), Redis, MinIO
  • Test actual API endpoints
  • Validate LibreChat compatibility
  • Test pod behavior and cleanup

Running Tests

Prerequisites

Before running tests, ensure:

  1. Dependencies installed:

    just install
  2. For integration tests, infrastructure running:

    just docker-up
    # Or: docker-compose up -d

Running All Tests

# Run all tests
just test

# With coverage report
just test-cov

Running Unit Tests Only

# Run all unit tests
just test-unit

# Run a specific test file
just test-file tests/unit/test_execution_service.py

# Run a specific test function
just test-file tests/unit/test_execution_service.py::test_execute_python_code

Running Integration Tests Only

# Run all integration tests
just test-integration

# Run specific integration test files
just test-file tests/integration/test_api_contracts.py

Key Test Files

API Contract Tests

File: tests/integration/test_api_contracts.py

Validates API request/response formats match expectations:

  • ExecRequest validation
  • ExecResponse structure
  • Error response formats
  • HTTP status codes
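
A typical contract test sends a deliberately invalid request and asserts on the status code and error shape. The sketch below assumes a FastAPI-style 422 validation error; adjust the expected status and body to the actual error contract:

import pytest

@pytest.mark.asyncio
async def test_exec_rejects_missing_code(api_client):
    # "code" is omitted on purpose to trigger request validation.
    response = await api_client.post("/exec", json={
        "lang": "py",
        "entity_id": "test",
        "user_id": "test"
    })
    # 422 and "detail" are assumptions based on FastAPI defaults.
    assert response.status_code == 422
    assert "detail" in response.json()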

LibreChat Compatibility Tests

File: tests/integration/test_librechat_compat.py

Ensures compatibility with LibreChat's Code Interpreter API:

  • File upload format (multipart/form-data)
  • Session ID handling
  • File reference format
  • Response structure matching LibreChat expectations
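
As a rough illustration of the multipart format, a compatibility test can upload a file the way LibreChat does. The endpoint path and form field names below are assumptions; match them to the real routes:

import pytest

@pytest.mark.asyncio
async def test_file_upload_multipart(api_client):
    # Hypothetical endpoint and field names; adapt to the actual upload API.
    files = {"files": ("data.csv", b"a,b\n1,2\n", "text/csv")}
    response = await api_client.post(
        "/upload",
        data={"entity_id": "test"},
        files=files,
    )
    assert response.status_code == 200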

Pod Behavior Tests

File: tests/integration/test_container_behavior.py

Tests pod lifecycle and execution:

  • Pod creation and cleanup
  • Resource limit enforcement
  • Timeout handling
  • Output capture
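
For example, a timeout test can submit code that sleeps past the execution limit and assert that the API reports a timeout rather than hanging. The sleep duration and expected status codes below are assumptions; align them with the configured limits:

import pytest

@pytest.mark.asyncio
async def test_execution_timeout(api_client):
    # Sleep longer than the assumed execution timeout.
    response = await api_client.post("/exec", json={
        "lang": "py",
        "code": "import time; time.sleep(120)",
        "entity_id": "test",
        "user_id": "test"
    })
    # Whether the API returns an error status or a 200 with a timeout message
    # is implementation-specific; assert on whichever it actually does.
    assert response.status_code in (200, 408, 504)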

Session State Tests

File: tests/integration/test_session_state.py

Tests Python state persistence:

  • Variable persistence across executions
  • Function persistence
  • NumPy/Pandas object persistence
  • State size limits
  • Session isolation

File Handling Tests

File: tests/integration/test_file_handling.py

Tests file operations:

  • File upload
  • File download
  • File listing
  • File deletion
  • File naming edge cases

Writing Tests

Using Fixtures

Common fixtures are defined in tests/conftest.py:

import pytest

@pytest.fixture
def api_client():
    """HTTP client configured for API testing."""
    import httpx
    return httpx.AsyncClient(
        base_url="https://localhost",
        headers={"x-api-key": "test-api-key-for-development-only"},
        verify=False
    )

@pytest.fixture
def sample_python_code():
    """Sample Python code for testing."""
    return "print('Hello, World!')"
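
Because httpx.AsyncClient keeps a connection pool open, an async fixture that closes the client after each test is a useful variant. This is a sketch, not the current conftest.py; it assumes pytest-asyncio is installed (in strict mode, async fixtures need the pytest_asyncio.fixture decorator):

import httpx
import pytest_asyncio

@pytest_asyncio.fixture
async def api_client():
    """HTTP client that is closed cleanly after each test."""
    async with httpx.AsyncClient(
        base_url="https://localhost",
        headers={"x-api-key": "test-api-key-for-development-only"},
        verify=False,
    ) as client:
        yield client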

Async Tests

Use pytest.mark.asyncio for async tests:

import pytest

@pytest.mark.asyncio
async def test_execute_python(api_client):
    response = await api_client.post("/exec", json={
        "lang": "py",
        "code": "print(1+1)",
        "entity_id": "test",
        "user_id": "test"
    })
    assert response.status_code == 200
    data = response.json()
    assert data["stdout"] == "2\n"
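
If every test in a module is async, a module-level pytestmark avoids repeating the decorator:

import pytest

# Applies the asyncio marker to all tests in this module,
# so individual @pytest.mark.asyncio decorators can be omitted.
pytestmark = pytest.mark.asyncio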

Mocking External Services

For unit tests, mock external dependencies:

import pytest
from unittest.mock import AsyncMock, patch

@pytest.mark.asyncio
async def test_execution_with_mocked_kubernetes():
    with patch("src.services.kubernetes.client.get_k8s_client") as mock_get_client:
        mock_pod = AsyncMock()
        # Configure the client returned by the patched getter, not the getter itself.
        mock_get_client.return_value.create_namespaced_pod.return_value = mock_pod

        # Test code here

Testing State Persistence

@pytest.mark.asyncio
async def test_state_persistence(api_client):
    # First execution - create variable
    response1 = await api_client.post("/exec", json={
        "lang": "py",
        "code": "x = 42",
        "entity_id": "test",
        "user_id": "test"
    })
    session_id = response1.json()["session_id"]

    # Second execution - use variable
    response2 = await api_client.post("/exec", json={
        "lang": "py",
        "code": "print(x)",
        "entity_id": "test",
        "user_id": "test",
        "session_id": session_id
    })
    assert response2.json()["stdout"] == "42\n"

Performance Testing

A dedicated performance testing script is available:

# Run performance tests
just perf-test

What Performance Tests Measure

  1. Simple execution latency - Basic print statement
  2. Complex execution latency - NumPy/Pandas operations
  3. Concurrent request handling - Multiple simultaneous requests
  4. State persistence overhead - Serialization/deserialization time
  5. File operation latency - Upload/download speeds
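
As a rough idea of how such numbers can be collected independently of the bundled script, the sketch below times repeated simple executions against /exec. It reuses the endpoint and API key from the examples above and may differ from what just perf-test actually runs:

import statistics
import time

import httpx

def measure_exec_latency(samples: int = 20) -> None:
    """Time repeated simple executions and print latency statistics in ms."""
    client = httpx.Client(
        base_url="https://localhost",
        headers={"x-api-key": "test-api-key-for-development-only"},
        verify=False,
    )
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        client.post("/exec", json={
            "lang": "py",
            "code": "print('ping')",
            "entity_id": "perf",
            "user_id": "perf",
        })
        timings.append((time.perf_counter() - start) * 1000)
    client.close()
    print(f"Mean: {statistics.mean(timings):.1f}ms")
    print(f"P50:  {statistics.median(timings):.1f}ms")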

Sample Output

=== Performance Test Results ===

Simple Python Execution:
  Mean: 32.5ms
  P50:  28.0ms
  P99:  85.0ms

Complex Python Execution:
  Mean: 125.0ms
  P50:  110.0ms
  P99:  250.0ms

Concurrent Requests (10x):
  Mean: 45.0ms
  Max:  180.0ms

Coverage Reports

Generate coverage reports:

# Generate HTML coverage report
just test-cov

# View report
open htmlcov/index.html

Coverage Targets

Component         Target   Current
src/api/          90%+     -
src/services/     85%+     -
src/middleware/   80%+     -
Overall           80%+     -

CI/CD Integration

For CI/CD pipelines, see .github/workflows/lint.yml for examples of using just commands in GitHub Actions.

GitHub Actions Example

- name: Install uv
  uses: astral-sh/setup-uv@v5

- name: Install just
  uses: taiki-e/install-action@just

- name: Run Tests
  run: just test-unit

Troubleshooting Tests

Integration Tests Failing

  1. Check infrastructure:

    kubectl get pods -n kubecoderun  # All pods should be "Running"
  2. Check API health:

    curl -sk https://localhost/health
  3. Check logs:

    kubectl logs -n kubecoderun deployment/kubecoderun

Async Test Issues

If async tests hang:

  • Ensure pytest-asyncio is installed
  • Check for unclosed async resources
  • Use @pytest.mark.asyncio decorator

Flaky Tests

For tests that occasionally fail:

  • Check for race conditions in pod cleanup
  • Ensure proper test isolation
  • Use explicit waits for async operations
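
One way to add an explicit wait is a small polling helper that retries a condition until a deadline instead of sleeping for a fixed time; the helper below is illustrative:

import asyncio
import time

async def wait_for(condition, timeout: float = 10.0, interval: float = 0.25) -> None:
    """Poll an async condition until it returns truthy or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if await condition():
            return
        await asyncio.sleep(interval)
    raise TimeoutError("condition was not met within the timeout")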

Related Documentation