Merged
178 changes: 178 additions & 0 deletions deploy/helm/codex-lb/README.md
Original file line number Diff line number Diff line change
@@ -245,6 +245,184 @@ total_connections = (databasePoolSize + databaseMaxOverflow) × replicas

Keep this within your PostgreSQL `max_connections` budget or place PgBouncer in front of the database.

## Production Deployment

Multi-replica production deployments require careful coordination of database connectivity, session routing, and graceful shutdown. This section covers the key patterns and tuning parameters.

### Prerequisites for Multi-Replica

Single-replica deployments can use SQLite, but **multi-replica requires PostgreSQL**:

- **Database**: PostgreSQL is mandatory for multi-replica because:
- SQLite does not support concurrent writes from multiple pods
- Leader election requires a shared database backend
- Session bridge ring membership is stored in the database

- **Leader Election**: Enabled by default (`config.leaderElectionEnabled=true`)
- Ensures only one pod performs background tasks (e.g., session cleanup, metrics aggregation)
- Uses database-backed locking with a TTL (`config.leaderElectionTtlSeconds=30`)
- If the leader crashes, another pod acquires the lock within 30 seconds

- **Circuit Breaker**: Enabled by default (`config.circuitBreakerEnabled=true`)
- Protects upstream API endpoints from cascading failures
- Opens after `config.circuitBreakerFailureThreshold=5` consecutive failures
- Enters half-open state after `config.circuitBreakerRecoveryTimeoutSeconds=60` seconds
- Prevents thundering herd when upstream is degraded
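Taken together, the defaults above correspond to the following values; a minimal sketch using the keys named in this section (values shown are the documented defaults — adjust only if your failure patterns differ):

```yaml
config:
  leaderElectionEnabled: true
  leaderElectionTtlSeconds: 30              # leader lock TTL; failover happens within this window
  circuitBreakerEnabled: true
  circuitBreakerFailureThreshold: 5         # consecutive failures before the breaker opens
  circuitBreakerRecoveryTimeoutSeconds: 60  # wait before probing upstream again (half-open)
```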

### Session Bridge Ring

The session bridge is an in-memory cache of upstream WebSocket connections, shared across the pod ring.

**Automatic Ring Membership (PostgreSQL)**

When using PostgreSQL, ring membership is **automatic and database-backed**:

- Each pod registers itself in the database on startup
- The `sessionBridgeInstanceRing` field is **optional** and only needed for manual pod list override
- Pods discover each other via database queries; no manual configuration required
- Ring membership is cleaned up automatically when pods terminate

**Manual Ring Override (Advanced)**

If you need to manually specify the pod ring (e.g., for testing or debugging):

```yaml
config:
sessionBridgeInstanceRing: "codex-lb-0.codex-lb.default.svc.cluster.local,codex-lb-1.codex-lb.default.svc.cluster.local"
```

This is rarely needed in production; the database-backed discovery is preferred.

### Connection Pool Budget

Each pod maintains its own SQLAlchemy connection pool. The total connections across all replicas must fit within PostgreSQL's `max_connections`:

```
(databasePoolSize + databaseMaxOverflow) × maxReplicas ≤ PostgreSQL max_connections
```

**Example for `values-prod.yaml`:**

```yaml
config:
databasePoolSize: 3
databaseMaxOverflow: 2
autoscaling:
maxReplicas: 20
```

Calculation: `(3 + 2) × 20 = 100` connections, which exactly fills PostgreSQL's default `max_connections=100`. Leave headroom if other clients (e.g. the migration job or monitoring tools) share the same database.
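The budget check is easy to script; a small sketch, with pool sizes mirroring the example above:

```shell
#!/bin/sh
# Sanity-check the connection budget before raising maxReplicas.
POOL_SIZE=3          # config.databasePoolSize
MAX_OVERFLOW=2       # config.databaseMaxOverflow
MAX_REPLICAS=20      # autoscaling.maxReplicas
PG_MAX_CONNECTIONS=100

TOTAL=$(( (POOL_SIZE + MAX_OVERFLOW) * MAX_REPLICAS ))
echo "total_connections=$TOTAL"
if [ "$TOTAL" -le "$PG_MAX_CONNECTIONS" ]; then
  echo "budget OK"
else
  echo "budget EXCEEDED" >&2
  exit 1
fi
```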

**Tuning:**

- Increase `databasePoolSize` if pods frequently wait for connections
- Increase `databaseMaxOverflow` for temporary spikes, but keep it small (overflow is slower)
- Reduce `maxReplicas` if you cannot increase PostgreSQL's `max_connections`
- Use PgBouncer or pgcat as a connection pooler in front of PostgreSQL if needed

### values-prod.yaml Reference

The `values-prod.yaml` overlay is pre-configured for production multi-replica deployments:

```yaml
replicaCount: 3 # Start with 3 replicas
postgresql:
enabled: false # Use external PostgreSQL
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 20
behavior:
scaleDown:
stabilizationWindowSeconds: 600 # 10 min cooldown (see below)
affinity:
podAntiAffinity: hard # Spread pods across nodes
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone # Spread across zones
networkPolicy:
enabled: true # Restrict ingress/egress
metrics:
serviceMonitor:
enabled: true # Prometheus scraping
prometheusRule:
enabled: true # Alerting rules
grafanaDashboard:
enabled: true # Pre-built dashboards
externalSecrets:
enabled: true # Use External Secrets Operator
```

Install with:

```bash
helm install codex-lb oci://ghcr.io/soju06/charts/codex-lb \
-f deploy/helm/codex-lb/values-prod.yaml \
--set externalDatabase.url='postgresql+asyncpg://user:pass@db.example.com:5432/codexlb'
```

### Graceful Shutdown Tuning

Graceful shutdown coordinates three timeout parameters to drain in-flight requests and session bridge connections:

```
preStopSleepSeconds (15s) → shutdownDrainTimeoutSeconds (30s) → terminationGracePeriodSeconds (60s)
```

**Timeline:**

1. **preStopSleepSeconds (15s)**: preStop hook delays shutdown
   - Kubernetes runs the preStop hook (a short sleep) before sending SIGTERM
   - Gives the load balancer time to remove the pod from rotation, so no new requests arrive during shutdown

2. **shutdownDrainTimeoutSeconds (30s)**: Drain in-flight requests
- HTTP server stops accepting new connections
- Existing requests are allowed to complete (up to 30 seconds)
- Session bridge connections are gracefully closed

3. **terminationGracePeriodSeconds (60s)**: Hard deadline
- Total time from SIGTERM to SIGKILL
- Must be ≥ `preStopSleepSeconds + shutdownDrainTimeoutSeconds`
- Default 60s allows 15s + 30s + 15s buffer
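The invariant in step 3 can be checked mechanically; a sketch using the chart's default values:

```shell
#!/bin/sh
# terminationGracePeriodSeconds must cover preStop sleep + drain, plus a buffer.
PRE_STOP=15    # preStopSleepSeconds
DRAIN=30       # shutdownDrainTimeoutSeconds
GRACE=60       # terminationGracePeriodSeconds

BUFFER=$(( GRACE - PRE_STOP - DRAIN ))
echo "buffer=${BUFFER}s"
[ "$BUFFER" -ge 0 ] || { echo "grace period too short" >&2; exit 1; }
```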

**Tuning:**

- Increase `preStopSleepSeconds` if your load balancer takes longer to deregister
- Increase `shutdownDrainTimeoutSeconds` if requests typically take >30s to complete
- Increase `terminationGracePeriodSeconds` proportionally (must be larger than the sum)
- Keep the buffer small; long shutdown times delay pod replacement

Example for long-running requests:

```yaml
preStopSleepSeconds: 20
shutdownDrainTimeoutSeconds: 60
terminationGracePeriodSeconds: 90
```

### Scale-Down Caution

The `stabilizationWindowSeconds: 600` (10 minutes) in `values-prod.yaml` is intentionally high.

**Why?**

- Session bridge connections have idle TTLs (`sessionBridgeIdleTtlSeconds=120` for API, `sessionBridgeCodexIdleTtlSeconds=900` for Codex)
- When a pod scales down, its in-memory sessions are lost
- Clients reconnecting to a different pod must re-establish upstream connections
- A 10-minute cooldown prevents rapid scale-down/up cycles that would thrash session state

**Behavior:**

- HPA will scale down at most 1 pod every 2 minutes (when cooldown is active)
- If load drops suddenly, scale-down is delayed by up to 10 minutes
- This trades faster scale-down for session stability

**Tuning:**

- Reduce `stabilizationWindowSeconds` if you prioritize cost over session stability
- Increase it if you see frequent session reconnections during scale events
- Monitor `sessionBridgeInstanceRing` size changes in logs to detect scale-down impact
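For example, to prioritize cost and allow faster scale-down, the production default could be overridden like this (a hypothetical value; pick a window that matches your session idle TTLs):

```yaml
autoscaling:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # 5 min instead of the 10 min production default
```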

## Security

The chart targets the Kubernetes Restricted Pod Security Standard.
19 changes: 19 additions & 0 deletions deploy/helm/codex-lb/templates/_helpers.tpl
@@ -139,3 +139,22 @@ Image string — resolves registry/repository:tag with optional digest override
{{- printf "%s/%s:%s" $registry $repository $tag }}
{{- end }}
{{- end }}

{{/*
Merged nodeSelector: global.nodeSelector + local nodeSelector (local wins).
*/}}
{{- define "codex-lb.nodeSelector" -}}
{{- $merged := mustMergeOverwrite (deepCopy (.Values.global.nodeSelector | default dict)) (.Values.nodeSelector | default dict) -}}
{{- if $merged }}
{{- toYaml $merged }}
{{- end }}
{{- end -}}

{{/*
Global-only nodeSelector for hooks/tests so app-specific placement does not block installs.
*/}}
{{- define "codex-lb.globalNodeSelector" -}}
{{- with (.Values.global.nodeSelector | default dict) }}
{{- toYaml . }}
{{- end }}
{{- end -}}
4 changes: 2 additions & 2 deletions deploy/helm/codex-lb/templates/deployment.yaml
@@ -89,9 +89,9 @@ spec:
topologySpreadConstraints:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.nodeSelector }}
{{- with (include "codex-lb.nodeSelector" .) }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
91 changes: 91 additions & 0 deletions deploy/helm/codex-lb/templates/hooks/db-init-job.yaml
@@ -0,0 +1,91 @@
{{- if and .Values.dbInit.enabled (not .Values.postgresql.enabled) }}
apiVersion: batch/v1
kind: Job
metadata:
name: {{ printf "%s-db-init" (include "codex-lb.fullname" . | trunc 52 | trimSuffix "-") }}
namespace: {{ .Release.Namespace | quote }}
labels:
{{- include "codex-lb.labels" . | nindent 4 }}
annotations:
"helm.sh/hook": pre-install
"helm.sh/hook-weight": "-10"
"helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
template:
spec:
restartPolicy: OnFailure
automountServiceAccountToken: false
{{- with .Values.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- $pullSecrets := concat (.Values.global.imagePullSecrets | default list) (.Values.image.pullSecrets | default list) }}
{{- if $pullSecrets }}
imagePullSecrets:
{{- range $pullSecrets }}
- name: {{ . }}
{{- end }}
{{- end }}
{{- with (include "codex-lb.globalNodeSelector" .) }}
nodeSelector:
{{- . | nindent 8 }}
{{- end }}
containers:
- name: db-init
image: {{ printf "%s/bitnami/postgresql:16" (.Values.global.imageRegistry | default "docker.io") }}
imagePullPolicy: IfNotPresent
{{- with .Values.containerSecurityContext }}
securityContext:
{{- toYaml . | nindent 12 }}
{{- end }}
command: ["sh", "-ec"]
args:
- |
PGPASSWORD="$ADMIN_PASSWORD" psql \
-h "$DB_HOST" -p "$DB_PORT" -U "$ADMIN_USER" -d postgres <<'SQL'
{{- range .Values.dbInit.databases }}
{{- $dbTag := printf "db_%s" ((printf "%s" .name) | sha256sum | trunc 12) }}
{{- $userTag := printf "user_%s" ((printf "%s" .user) | sha256sum | trunc 12) }}
{{- $passTag := printf "pass_%s" ((printf "%s" .password) | sha256sum | trunc 12) }}
DO $$ BEGIN
IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = ${{ $userTag }}${{ .user }}${{ $userTag }}$) THEN
EXECUTE format(
'CREATE ROLE %I WITH LOGIN PASSWORD %L',
${{ $userTag }}${{ .user }}${{ $userTag }}$,
${{ $passTag }}${{ .password }}${{ $passTag }}$
);
END IF;
END $$;
SELECT format(
'CREATE DATABASE %I OWNER %I',
${{ $dbTag }}${{ .name }}${{ $dbTag }}$,
${{ $userTag }}${{ .user }}${{ $userTag }}$
)
WHERE NOT EXISTS (
SELECT FROM pg_database WHERE datname = ${{ $dbTag }}${{ .name }}${{ $dbTag }}$
)\gexec
SELECT format(
'GRANT ALL PRIVILEGES ON DATABASE %I TO %I',
${{ $dbTag }}${{ .name }}${{ $dbTag }}$,
${{ $userTag }}${{ .user }}${{ $userTag }}$
)\gexec
{{- end }}
SQL
env:
- name: DB_HOST
value: {{ .Values.dbInit.host | quote }}
- name: DB_PORT
value: {{ .Values.dbInit.port | default "5432" | quote }}
- name: ADMIN_USER
value: {{ .Values.dbInit.adminUser | quote }}
- name: ADMIN_PASSWORD
{{- if .Values.dbInit.adminPasswordSecret }}
valueFrom:
secretKeyRef:
name: {{ .Values.dbInit.adminPasswordSecret.name }}
key: {{ .Values.dbInit.adminPasswordSecret.key }}
{{- else }}
value: {{ .Values.dbInit.adminPassword | quote }}
{{- end }}
backoffLimit: 3
{{- end }}
4 changes: 4 additions & 0 deletions deploy/helm/codex-lb/templates/hooks/migration-job.yaml
@@ -36,6 +36,10 @@ spec:
- name: {{ . }}
{{- end }}
{{- end }}
{{- with (include "codex-lb.globalNodeSelector" .) }}
nodeSelector:
{{- . | nindent 8 }}
{{- end }}
{{- if .Values.postgresql.enabled }}
initContainers:
- name: wait-for-db
2 changes: 1 addition & 1 deletion deploy/helm/codex-lb/templates/httproute.yaml
@@ -20,5 +20,5 @@ spec:
rules:
- backendRefs:
- name: {{ include "codex-lb.fullname" . }}
port: 2455
port: {{ .Values.service.port }}
{{- end }}
16 changes: 13 additions & 3 deletions deploy/helm/codex-lb/templates/service.yaml
@@ -5,16 +5,26 @@ metadata:
namespace: {{ .Release.Namespace | quote }}
labels:
{{- include "codex-lb.labels" . | nindent 4 }}
{{- with .Values.commonAnnotations }}
{{- with mustMerge (.Values.service.annotations | default dict) (.Values.commonAnnotations | default dict) }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
type: ClusterIP
type: {{ .Values.service.type }}
{{- if and (eq .Values.service.type "LoadBalancer") .Values.service.loadBalancerIP }}
loadBalancerIP: {{ .Values.service.loadBalancerIP }}
{{- end }}
{{- if and (eq .Values.service.type "LoadBalancer") .Values.service.loadBalancerSourceRanges }}
loadBalancerSourceRanges:
{{- toYaml .Values.service.loadBalancerSourceRanges | nindent 4 }}
{{- end }}
selector:
{{- include "codex-lb.selectorLabels" . | nindent 4 }}
ports:
- name: http
port: 2455
port: {{ .Values.service.port }}
targetPort: http
protocol: TCP
{{- if and (eq .Values.service.type "NodePort") .Values.service.nodePort }}
nodePort: {{ .Values.service.nodePort }}
{{- end }}
15 changes: 12 additions & 3 deletions deploy/helm/codex-lb/templates/tests/test-connection.yaml
@@ -15,8 +15,12 @@ spec:
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
{{- with (include "codex-lb.globalNodeSelector" .) }}
nodeSelector:
{{- . | nindent 4 }}
{{- end }}
containers:
- name: test-connection
- name: test-health
image: busybox:1.37
imagePullPolicy: IfNotPresent
securityContext:
@@ -29,5 +33,10 @@
- sh
- -c
- |
wget --spider --timeout=10 http://{{ include "codex-lb.fullname" . }}:2455/health || exit 1
echo "Connection test passed!"
echo "=== Health endpoint ==="
wget --spider --timeout=10 http://{{ include "codex-lb.fullname" . }}:{{ .Values.service.port | default 2455 }}/health || exit 1
echo "Health check passed!"

echo "=== Startup probe ==="
wget -qO- --timeout=10 http://{{ include "codex-lb.fullname" . }}:{{ .Values.service.port | default 2455 }}/health/ready || exit 1
echo "Readiness check passed!"