KubeCodeRun is a secure API for executing code in isolated Kubernetes pods. It uses a Kubernetes-native architecture with warm pod pools for low-latency execution and Jobs for cold-path languages.
┌─────────────────────────────────────────────────────────────────────────────┐
│ KubeCodeRun API │
│ (FastAPI Application) │
└─────────────────────────────────────────────────────────────────────────────┘
│
┌───────────────┼───────────────┐
▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Redis │ │ MinIO │ │ Kubernetes │
│ (Sessions) │ │ (Files) │ │ (Pods) │
└─────────────┘ └─────────────┘ └─────────────┘
│
┌───────────────┴───────────────┐
▼ ▼
┌─────────────────────┐ ┌─────────────────────┐
│ Pod Pool │ │ Job Executor │
│ (poolSize > 0) │ │ (poolSize = 0) │
│ Python, JS, etc. │ │ Go, Rust, etc. │
└─────────────────────┘ └─────────────────────┘
│ │
▼ ▼
┌─────────────────────────────────────────────┐
│ Execution Pods │
│ ┌─────────────────────────────────────┐ │
│ │ Single Container │ │
│ │ (Language Runtime + Runner Binary) │ │
│ └─────────────────────────────────────┘ │
└─────────────────────────────────────────────┘
| Strategy | Cold Start | Use Case |
|---|---|---|
| Warm Pod Pool | 50-100ms | Languages with pod_pool_<lang> > 0 |
| Kubernetes Jobs | 3-10s | Languages with pod_pool_<lang> = 0 |
The warm pool approach achieves ~85% reduction in P99 latency compared to cold-start execution.
Each execution pod runs a single container that includes both the language runtime and an embedded Go binary called "runner". The runner serves HTTP on port 8080 and executes code via subprocess. No sidecar, no shared process namespace, no nsenter, and no elevated privileges are required.
- Runs the language runtime (Python, Node.js, Go, etc.)
- Provides the execution environment (compilers, interpreters, libraries)
- Includes the runner binary at
/usr/local/bin/runner(copied via multi-stage build) - Uses
/mnt/dataas the working directory
- Lightweight Go HTTP server (~10MB static binary)
- Exposes REST API for code execution on port 8080
- Executes code via subprocess in the same container
- Handles file transfers and state management
- Single source of truth for language execution commands (
docker/runner/executor.go)
Runner API Endpoints:
POST /execute - Execute code with optional state
POST /files - Upload files to working directory
GET /files - List files in working directory
GET /files/{name} - Download file content
GET /health - Health check
The runner binary is embedded into each language container image via a multi-stage Docker build:
COPY --from=runner /runner /usr/local/bin/runner┌─────────────────────────────────────────────────────────────┐
│ Execution Pod │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Single Container │ │
│ │ │ │
│ │ • Language runtime (Python/Node/Go/...) │ │
│ │ • Runner binary serves HTTP on :8080 │ │
│ │ • Receives request → writes code to /mnt/data │ │
│ │ • Executes via subprocess → captures stdout/stderr │ │
│ │ • Returns results via HTTP response │ │
│ │ │ │
│ │ Working directory: /mnt/data │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Execution flow:
- Runner receives an HTTP request with code to execute
- Writes code to a temporary file in
/mnt/data - Executes the appropriate language command via subprocess (e.g.,
python3 file.py) - Captures stdout/stderr and returns via HTTP
Advantages over the previous sidecar pattern:
- Zero elevated privileges: No
SYS_PTRACE,SYS_ADMIN,SYS_CHROOTcapabilities, noallowPrivilegeEscalation, noshareProcessNamespace - Correct resource limits: Container resource limits (CPU, memory) apply directly to user code since everything runs in a single cgroup
- Simpler pod spec: Single container, no shared volumes between containers
- Compatible with hardened runtimes: Works with any Kubernetes runtime (gVisor, Kata Containers, etc.)
TypeScript Note: TypeScript uses a two-step compilation (tsc file.ts && node file.js) instead of ts-node.
| Module | Purpose |
|---|---|
exec.py |
Code execution endpoints (POST /exec) |
files.py |
File upload/download endpoints |
health.py |
Health and readiness checks |
state.py |
Session state management |
admin.py |
Admin dashboard API |
dashboard_metrics.py |
Dashboard metrics endpoints |
| Service | Module | Responsibility |
|---|---|---|
| SessionService | session.py |
Session lifecycle (create, get, delete) |
| FileService | file.py |
File storage in MinIO |
| CodeExecutionRunner | execution/runner.py |
Primary code execution service |
| ExecutionOrchestrator | orchestrator.py |
Coordinates execution, state, and files |
| KubernetesManager | kubernetes/ |
Pod lifecycle and execution |
| StateService | state.py |
Python state persistence in Redis |
| HealthService | health.py |
Service health monitoring |
| Component | Module | Responsibility |
|---|---|---|
| KubernetesManager | manager.py |
Main entry point, coordinates pools and jobs |
| PodPoolManager | pool.py |
Warm pod pool management per language |
| JobExecutor | job_executor.py |
Job-based execution for cold languages |
| Client | client.py |
Kubernetes client factory |
1. Client Request
│
▼
2. API Endpoint (/exec)
│
▼
3. ExecutionOrchestrator
├── Validate request
├── Get/create session
├── Load state (Python only)
├── Mount files
│
▼
4. KubernetesManager.execute_code()
├── Hot path: Acquire pod from pool
│ └── PodPoolManager.acquire()
│
└── Cold path: Create Job
└── JobExecutor.execute()
│
▼
5. Runner Binary
├── POST /execute
├── Run code via subprocess
└── Return stdout/stderr/files
│
▼
6. Response Processing
├── Save state (Python only)
├── Store generated files
└── Destroy pod (pool replenishes)
│
▼
7. Client Response
Python sessions support state persistence across executions:
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Execute │ ──► │ Capture │ ──► │ Save │
│ Code │ │ State │ │ to Redis │
└─────────────┘ └─────────────┘ └─────────────┘
│
▼
cloudpickle + lz4
(compressed state)
State Flow:
- Before execution: Load state from Redis (or MinIO archive)
- Execute code with state restoration
- After execution: Capture and save new state
- Archive to MinIO after TTL expires
# Enable/disable pod pools
POD_POOL_ENABLED=true
POD_POOL_WARMUP_ON_STARTUP=true
# Per-language pool sizes (0 = use Jobs)
POD_POOL_PY=5 # Python: 5 warm pods
POD_POOL_JS=2 # JavaScript: 2 warm pods
POD_POOL_TS=0 # TypeScript: use Jobs
POD_POOL_GO=0 # Go: use Jobs
POD_POOL_JAVA=0 # Java: use Jobs
POD_POOL_RS=0 # Rust: use Jobs
POD_POOL_C=0 # C: use Jobs
POD_POOL_CPP=0 # C++: use Jobs
POD_POOL_PHP=0 # PHP: use Jobs
POD_POOL_R=0 # R: use Jobs
POD_POOL_F90=0 # Fortran: use Jobs
POD_POOL_D=0 # D: use Jobs
# Pool optimization settings
POD_POOL_PARALLEL_BATCH=5 # Pods to start in parallel during warmup
POD_POOL_REPLENISH_INTERVAL=2 # Seconds between pool replenishment checks
POD_POOL_EXHAUSTION_TRIGGER=true # Trigger immediate replenishment when exhaustedK8S_NAMESPACE=kubecoderun
K8S_IMAGE_REGISTRY=aronmuon/kubecoderun
K8S_IMAGE_TAG=latest
K8S_IMAGE_PULL_SECRETS=docker-privaterepo # Comma-separated secret names for private registries
K8S_CPU_LIMIT=1
K8S_MEMORY_LIMIT=512Mi
K8S_CPU_REQUEST=100m
K8S_MEMORY_REQUEST=128MiEach execution pod is isolated via:
- Network Policy: Deny all egress by default
- Security Context:
runAsNonRoot: truerunAsUser: 65532- Resource limits enforced
- Zero elevated privileges (no capabilities, no
allowPrivilegeEscalation)
- Ephemeral Storage: Pods destroyed after execution
- Non-root Execution: Container runs as UID 65532
- Hardened Runtime Compatible: No special capabilities required, works with gVisor, Kata Containers, etc.
The API deployment needs these Kubernetes permissions:
pods: create, delete, get, list, watchjobs: create, delete, get, list, watch
src/
├── api/ # FastAPI route handlers
│ ├── exec.py # Code execution endpoint
│ ├── files.py # File management
│ ├── health.py # Health checks
│ └── state.py # State management
│
├── config/ # Configuration
│ ├── __init__.py # Settings (Pydantic)
│ ├── languages.py # Language definitions
│ └── security.py # Security settings
│
├── models/ # Pydantic models
│ ├── execution.py # Execution models
│ ├── session.py # Session models
│ └── pool.py # Pool models
│
├── services/ # Business logic
│ ├── execution/ # Execution service
│ │ ├── runner.py # CodeExecutionRunner
│ │ └── output.py # Output processing
│ │
│ ├── kubernetes/ # Kubernetes integration
│ │ ├── manager.py # KubernetesManager
│ │ ├── pool.py # PodPoolManager
│ │ ├── job_executor.py
│ │ ├── client.py # K8s client factory
│ │ └── models.py # PodHandle, etc.
│ │
│ ├── session.py # SessionService
│ ├── file.py # FileService
│ ├── state.py # StateService
│ ├── health.py # HealthService
│ └── orchestrator.py # ExecutionOrchestrator
│
├── middleware/ # FastAPI middleware
│ ├── security.py # Security middleware
│ ├── auth.py # Authentication
│ ├── headers.py # Security headers
│ └── metrics.py # Request metrics
│
├── core/ # Core utilities
│ ├── events.py # Event handling
│ └── pool.py # Pool abstractions
│
└── main.py # Application entry point
docker/
├── api/ # API server container
│ └── Dockerfile
├── runner/ # Embedded runner binary (Go)
│ ├── main.go # HTTP server entry point
│ ├── executor.go # Language execution commands
│ ├── files.go # File handling
│ └── Dockerfile # Builds the runner binary
└── *.Dockerfile # Language execution environments
helm-deployments/
└── kubecoderun/ # Helm chart
├── templates/
│ ├── deployment.yaml
│ ├── service.yaml
│ ├── serviceaccount.yaml
│ ├── role.yaml
│ └── networkpolicy.yaml
└── values.yaml
| Endpoint | Purpose |
|---|---|
GET /health |
Basic liveness check |
GET /ready |
Readiness check |
GET /health/detailed |
Detailed health of all services |
GET /health/redis |
Redis connectivity |
GET /health/minio |
MinIO connectivity |
GET /health/kubernetes |
Kubernetes connectivity |
GET /metrics/pool |
Pod pool statistics |
The API exposes metrics for:
- Execution count by language and status
- Execution latency (P50, P95, P99)
- Pool hit/miss ratio
- Active sessions count
helm install kubecoderun ./helm-deployments/kubecoderun \
--namespace kubecoderun \
--create-namespace \
--set replicaCount=2 \
--set execution.languages.python.poolSize=5- Kubernetes 1.24+: Pod and Job execution
- Redis 6+: Session and state storage
- MinIO/S3: File storage and state archives