Codex CLI Proxy

A lightweight HTTP proxy for the OpenAI Codex CLI, deployed on AWS, that gives your organization full observability, per-developer usage tracking, cost attribution, and access control over Codex usage.

Background

We ran a pilot of Codex CLI with a small group of developers. The results were very good -- faster iteration, higher output, measurable productivity gains. So we decided to roll it out to the entire engineering team.

In the initial setup, each developer received their own individual OpenAI API key. This gave us basic per-person usage metrics through OpenAI's dashboard -- we could see spend and model usage broken down by key. However, this approach had significant gaps:

  • No content visibility -- we could see how much each person used, but not what they were sending. There was no way to audit prompts or responses.
  • No centralized logging -- request/response data lived entirely on OpenAI's side, inaccessible for internal analysis or compliance.
  • No secret detection -- developers could accidentally include API keys, credentials, or connection strings in their prompts, and we had no way to know or prevent it.
  • Fragmented key management -- each developer holding their own key meant N keys to track, rotate, and revoke individually. Offboarding required remembering to deactivate the right key in OpenAI's dashboard.

We needed to centralize control while keeping the developer experience identical:

  1. Full request/response logging -- every prompt and response stored for auditing, cost analysis, and compliance
  2. Per-developer cost attribution -- which models are being used, how many tokens per person, estimated cost per developer
  3. Centralized access control -- enable/disable access per person instantly from a single place (onboarding/offboarding)
  4. A foundation for secret detection -- with all traffic flowing through our infrastructure, we can inspect and sanitize prompts before they reach OpenAI
  5. Transparency -- developers should notice zero difference in their Codex CLI experience (skills, MCPs, images, streaming all work)

This proxy solves all of that. It replaces the per-developer OpenAI keys with a single organization key stored securely in AWS, and gives each developer a proxy token instead. All traffic flows through the proxy, giving us full observability without affecting the developer experience.

Next step: Secret detection

The proxy already logs full request bodies to S3. The next planned step is a pre-flight sanitization agent that inspects prompts before they reach OpenAI and redacts or blocks requests containing secrets (API keys, credentials, connection strings, private tokens). This is critical for enterprise adoption -- as a company, we need to ensure developers aren't accidentally sending sensitive material to external APIs. The S3 log archive provides the foundation: a Lambda or Glue job can scan stored prompts for secret patterns (regex + entropy analysis), flag violations, and feed into the sanitization layer. The proxy architecture already supports inserting this step before the forwarding call -- it's a matter of implementing the detection logic and wiring it in.
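
A minimal sketch of what that detection pass could look like, assuming the regex-plus-entropy approach described above. The patterns, threshold, and function names here are illustrative assumptions, not the planned implementation:

```python
import math
import re

# Hypothetical secret patterns -- a real deployment would maintain a
# much larger, curated list.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),          # OpenAI-style API keys
    re.compile(r"AKIA[0-9A-Z]{16}"),               # AWS access key IDs
    re.compile(r"postgres(?:ql)?://\S+:\S+@\S+"),  # connection strings with credentials
]

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of s."""
    if not s:
        return 0.0
    counts = {c: s.count(c) for c in set(s)}
    return -sum((n / len(s)) * math.log2(n / len(s)) for n in counts.values())

def find_secrets(prompt: str, entropy_threshold: float = 4.0, min_len: int = 24) -> list[str]:
    """Regex pass for known key shapes, then an entropy pass to catch
    long opaque strings that match no known pattern."""
    hits = [m.group(0) for p in SECRET_PATTERNS for m in p.finditer(prompt)]
    for word in re.findall(r"\S+", prompt):
        if len(word) >= min_len and shannon_entropy(word) > entropy_threshold:
            if word not in hits:
                hits.append(word)
    return hits
```

The sanitization layer could then redact each hit before forwarding, or reject the request outright, depending on policy.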


How It Works

Developer's machine              AWS                        OpenAI
┌──────────────┐      ┌─────────────────────┐      ┌──────────────┐
│  Codex CLI   │─────>│  Lambda Function URL │─────>│  OpenAI API  │
│              │      │  ┌───────────────┐   │      │              │
│ OPENAI_BASE  │      │  │  Lambda (Go)  │   │      │ api.openai   │
│ _URL = proxy │      │  │               │   │      │ .com         │
│              │      │  │ 1. Validate   │   │      │              │
│ OPENAI_API   │      │  │    token      │   │      │              │
│ _KEY = token │      │  │ 2. Log to S3  │   │      │              │
│              │<─────│  │ 3. Forward    │   │      │              │
│              │      │  │ 4. Return     │   │      │              │
└──────────────┘      │  └───────────────┘   │      └──────────────┘
                      │         │            │
                      │  ┌──────┴──────┐     │
                      │  │  DynamoDB   │     │
                      │  │  (tokens)   │     │
                      │  ├─────────────┤     │
                      │  │  S3 (logs)  │     │
                      │  ├─────────────┤     │
                      │  │  Secrets    │     │
                      │  │  Manager    │     │
                      │  └─────────────┘     │
                      └─────────────────────┘
sequenceDiagram
    participant CLI as Codex CLI
    participant FnURL as Function URL
    participant Lambda as Lambda (Go)
    participant DDB as DynamoDB
    participant S3 as S3 Logs
    participant OpenAI as OpenAI API

    CLI->>FnURL: Request + Bearer <proxy_token>
    FnURL->>Lambda: Invoke (up to 120s timeout)
    Lambda->>DDB: Validate token → user_id, enabled
    DDB-->>Lambda: OK
    Lambda->>OpenAI: Forward request + real OpenAI API key
    OpenAI-->>Lambda: Response
    Lambda->>S3: Async write log (request + response)
    Lambda-->>FnURL: Response (unchanged)
    FnURL-->>CLI: Response

The proxy is fully transparent. The response from OpenAI reaches the developer unchanged. Skills, MCPs, image uploads, and streaming all work exactly as if the developer were calling OpenAI directly.
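
In outline, the per-request logic is small. A sketch of the same validate → forward → log flow in Python (the real proxy is Go; the dict and callables below are stand-ins for DynamoDB, the HTTPS call to OpenAI, and the S3 writer):

```python
from typing import Callable

def handle_request(token: str, body: dict,
                   token_table: dict,                     # stands in for the DynamoDB token registry
                   forward: Callable[[dict], dict],       # stands in for the call to OpenAI
                   log: Callable[[str, dict, dict], None] # stands in for the S3 log write
                   ) -> dict:
    """Validate the proxy token, forward upstream, log, return unchanged."""
    record = token_table.get(token)
    if record is None or not record.get("enabled", False):
        return {"status": 401, "error": "invalid or disabled token"}
    response = forward(body)               # real OpenAI key injected server-side
    log(record["user_id"], body, response) # async in the real proxy
    return response                        # passed back to Codex CLI unchanged
```

Because the response is returned byte-for-byte, the CLI cannot tell the proxy is there.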


AWS Resources Created

| Resource | Name | Purpose |
| --- | --- | --- |
| Lambda (Go) | `codex-proxy` | Core proxy: validates token, forwards to OpenAI, logs to S3 |
| Lambda Function URL | `codex-proxy:live` | Public HTTPS endpoint (native Lambda, no extra cost) |
| DynamoDB | `codex-proxy-tokens` | Token registry: token → `user_id`, `enabled`, `created_at` |
| S3 | `codex-proxy-logs-<account-id>` | Full request/response logs, partitioned by `year/month/day/token/` |
| Secrets Manager | `codex-proxy/openai-api-key` | The real OpenAI API key (never exposed to developers) |
| IAM Role | `codex-proxy-lambda-role` | Least-privilege permissions for the Lambda |
| CloudWatch | `/aws/lambda/codex-proxy` | Lambda execution logs + optional alarms |

Prerequisites

  • AWS CLI configured with permissions to create Lambda, DynamoDB, S3, IAM, Secrets Manager
  • Go 1.21+ (to compile the Lambda binary)
  • An OpenAI API key with access to the models you want developers to use

Deployment (Step by Step)

Step 1: Store the OpenAI API key in Secrets Manager

aws secretsmanager create-secret \
  --region us-east-1 \
  --name codex-proxy/openai-api-key \
  --secret-string "sk-proj-..."

Step 2: Deploy the proxy

chmod +x scripts/*.sh
./scripts/deploy.sh

What deploy.sh does:

  1. Compiles the Lambda (Go → Linux/amd64 binary)
  2. Creates the S3 bucket for logs (90-day lifecycle)
  3. Creates the DynamoDB table for tokens
  4. Creates the IAM role with least-privilege policy
  5. Creates or updates the Lambda function (512 MB, 120s timeout)
  6. Publishes a version, creates alias live with provisioned concurrency (2)
  7. Creates a Lambda Function URL with CORS enabled
  8. Outputs the proxy URL (this is your OPENAI_BASE_URL)

Options:

./scripts/deploy.sh [logs-bucket-name] [openai-secret-name]
# Defaults:
#   Bucket: codex-proxy-logs-<account-id>
#   Secret: codex-proxy/openai-api-key
#   Region: $AWS_REGION or us-east-1

Why a Function URL? Lambda Function URLs provide a native HTTPS endpoint with no extra cost and no intermediate timeout limits. The request can run for up to the Lambda timeout (120s), which is enough for long OpenAI responses.

Step 3: Register developer tokens

Edit the DEVELOPERS list in scripts/register-tokens.py with your team:

DEVELOPERS = [
    (1, "Alice Smith", "alice@example.com"),
    (2, "Bob Johnson", "bob@example.com"),
    (3, "Carol Williams", "carol@example.com"),
    # ...
]

Then run:

# Preview only (generates tokens, prints table, no AWS calls):
python3 scripts/register-tokens.py --dry-run

# Register all tokens in DynamoDB:
python3 scripts/register-tokens.py

This creates a codex-tokens.json file (gitignored) with the token mappings and registers each one in DynamoDB.
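
The tokens follow the `codexproxy_<hex>` shape shown in the manual example below. A sketch of how such a token could be generated (the script's actual generator may differ):

```python
import secrets

def generate_proxy_token(n_bytes: int = 24) -> str:
    # The codexproxy_ prefix makes tokens easy to grep for in logs and
    # impossible to confuse with a real OpenAI key (sk-...).
    # 24 random bytes -> 48 hex characters of entropy.
    return "codexproxy_" + secrets.token_hex(n_bytes)
```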

To register a single token manually:

aws dynamodb put-item --region us-east-1 --table-name codex-proxy-tokens --item '{
  "token": {"S": "codexproxy_a1b2c3d4e5f6..."},
  "user_id": {"S": "developer@example.com"},
  "enabled": {"BOOL": true},
  "created_at": {"S": "2026-01-15T00:00:00Z"}
}'

Step 4: Configure developer machines

First, update the PROXY_URL variable in scripts/setup.sh with your actual Function URL from Step 2.

Then give each developer their token and have them run:

Linux / macOS:

# If you host the script (see "CDN Distribution" below):
curl -sL https://your-domain.com/codex/setup.sh | bash

# Or directly from the repo:
bash scripts/setup.sh

Windows (PowerShell):

powershell -ExecutionPolicy Bypass -File scripts\setup-windows.ps1

What the setup script does:

  1. Asks the developer to paste their personal proxy token
  2. Writes ~/.codex/config.toml with the proxy as the model provider
  3. Sets OPENAI_BASE_URL and OPENAI_API_KEY (which holds the proxy token, not an OpenAI key) in the shell profile
  4. Exports both variables for the current session

Manual setup (any OS):

Create ~/.codex/config.toml (or %USERPROFILE%\.codex\config.toml on Windows):

model_provider = "codexproxy"
forced_login_method = "api"

[model_providers.codexproxy]
name = "Codex Proxy"
base_url = "https://YOUR_FUNCTION_URL.lambda-url.REGION.on.aws/v1"
env_key = "OPENAI_API_KEY"  # Variable name Codex CLI reads; the value is your proxy token, not an OpenAI key

Set environment variables:

export OPENAI_BASE_URL="https://YOUR_FUNCTION_URL.lambda-url.REGION.on.aws"
export OPENAI_API_KEY="your-proxy-token"  # This is the proxy token, not an OpenAI API key

Step 5: Validate

./scripts/validate.sh

Checks all resources: Lambda invocations, CloudWatch logs, DynamoDB tokens, S3 logs, Secrets Manager.


Usage Reports & Cost Attribution

Generate per-developer usage reports with token counts and estimated OpenAI costs:

# Today's usage
python3 scripts/usage-report.py

# A specific day
python3 scripts/usage-report.py 2026-02-16

# Date range (inclusive)
python3 scripts/usage-report.py 2026-02-01 2026-02-28

Sample output:

Codex Proxy Usage | 2026-02-16 to 2026-02-16 | bucket codex-proxy-logs-123456789012
Logs processed: 342 | Errors: 0

user_id                             requests  input_tok   cached output_tok  total_tok   cost_USD  models
-------------------------------------------------------------------------------------------------------------------
alice@example.com                         47     285400   142200      38200     323600     0.4821  gpt-5.3-codex:47
bob@example.com                           31     198300    95100      27100     225400     0.3412  gpt-5.3-codex:31
carol@example.com                         22      94200    41000      15800     110000     0.1580  gpt-5.1-codex-mini:22

A CSV file is also generated for spreadsheet analysis.

Model pricing is configured in PRICING_PER_1M at the top of the script -- update it if OpenAI changes prices.
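
An illustrative cost calculation in the spirit of that pricing table. The rates below are placeholders, not OpenAI's actual prices, and the discounted treatment of cached input tokens is the assumption the report's `cached` column suggests:

```python
PRICING_PER_1M = {
    # model: (input, cached_input, output) USD per 1M tokens -- placeholder values
    "example-model": (1.25, 0.125, 10.00),
}

def estimate_cost(model: str, input_tok: int, cached_tok: int, output_tok: int) -> float:
    """Bill cached input tokens at the discounted cached rate,
    the remainder at the full input rate."""
    in_rate, cached_rate, out_rate = PRICING_PER_1M[model]
    uncached = input_tok - cached_tok
    return (uncached * in_rate + cached_tok * cached_rate + output_tok * out_rate) / 1_000_000
```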


Token Management

Disable a token (offboarding, policy violation, etc.):

aws dynamodb update-item --region us-east-1 --table-name codex-proxy-tokens \
  --key '{"token": {"S": "the-token"}}' \
  --update-expression "SET #e = :v" \
  --expression-attribute-names '{"#e": "enabled"}' \
  --expression-attribute-values '{":v": {"BOOL": false}}'

Re-enable a token:

aws dynamodb update-item --region us-east-1 --table-name codex-proxy-tokens \
  --key '{"token": {"S": "the-token"}}' \
  --update-expression "SET #e = :v" \
  --expression-attribute-names '{"#e": "enabled"}' \
  --expression-attribute-values '{":v": {"BOOL": true}}'

The developer gets a 401 immediately on next request -- no key rotation needed.


Monitoring & Alerts

Create CloudWatch alarms for the proxy:

# Create alarms (Lambda Throttles >= 1, Lambda Errors >= 3 in 5 min):
./scripts/create-alarms.sh

# With SNS notification (Slack, email, PagerDuty, etc.):
SNS_TOPIC_ARN="arn:aws:sns:us-east-1:ACCOUNT_ID:my-alerts" ./scripts/create-alarms.sh

Diagnosing Errors

When a developer reports an error (e.g., "high demand", "temporary errors"), determine if the error comes from OpenAI or the proxy:

./scripts/check-response-origin.sh "high demand"

The script searches the S3 logs for that text in the response_body. If found, the error comes from OpenAI (upstream), not the proxy. See docs/troubleshooting.md for the full diagnostic flow.
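
The classification rule amounts to a few lines; this is a paraphrase of the idea in Python, not the script's code:

```python
def classify_error_origin(log_entries: list[dict], needle: str) -> str:
    """If the reported text appears in any logged response_body, the
    upstream (OpenAI) produced it; otherwise suspect the proxy itself."""
    for entry in log_entries:
        if needle in entry.get("response_body", ""):
            return "upstream (OpenAI)"
    return "proxy (or never logged)"
```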


Optional: CDN Distribution for Setup Script

Host the setup script behind CloudFront with a custom domain so developers can onboard with a single command:

# 1. Publish setup.sh to S3 + CloudFront:
./scripts/deploy-setup-cdn.sh

# 2. (Optional) Add a custom domain:
./scripts/add-domain-cdn.sh codex.yourcompany.com

# 3. Developers run:
curl -sL https://codex.yourcompany.com/codex/setup.sh | bash

Cost Estimate (AWS Infrastructure Only)

This is only the AWS infrastructure cost. OpenAI API usage is billed separately by OpenAI.

| Scenario | Requests/month | Approx. cost |
| --- | --- | --- |
| Low | ~20,000 (~1k/dev) | $2-4/month |
| Moderate | ~60,000 (~3k/dev) | $6-10/month |
| High | ~150,000 (~7.5k/dev) | $14-22/month |

Based on ~20 developers in us-east-1. Provisioned concurrency adds ~$5-7/month. See docs/cost-estimation.md for the per-service breakdown.
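
A back-of-envelope model of the Lambda portion of those numbers. The unit prices below are placeholders in the shape of Lambda pricing (per GB-second plus per million requests), not current AWS rates; check the AWS pricing pages before relying on them:

```python
def monthly_lambda_cost(requests: int,
                        avg_duration_s: float = 2.0,     # assumed average request duration
                        memory_gb: float = 0.5,          # the proxy's 512 MB configuration
                        price_per_gb_s: float = 0.0000166667,   # placeholder rate
                        price_per_million_req: float = 0.20) -> float:
    """Compute cost (GB-seconds) plus invocation cost."""
    compute = requests * avg_duration_s * memory_gb * price_per_gb_s
    invocations = requests / 1_000_000 * price_per_million_req
    return compute + invocations
```

At the "Moderate" scenario (~60,000 requests/month) this lands at roughly a dollar of compute, consistent with Lambda being a small slice of the totals above.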


Project Structure

codex-proxy/
├── lambda/                          # Proxy Lambda source (Go)
│   ├── main.go                      #   Token validation, forwarding, S3 logging
│   ├── go.mod
│   └── go.sum
├── scripts/
│   ├── deploy.sh                    # Full deployment (AWS CLI, no SAM required)
│   ├── setup.sh                     # Developer onboarding (Linux/macOS)
│   ├── setup-windows.ps1            # Developer onboarding (Windows)
│   ├── validate.sh                  # Health check across all resources
│   ├── create-alarms.sh             # CloudWatch alarms (throttles, errors)
│   ├── usage-report.py              # Per-developer usage + cost report → CSV
│   ├── register-tokens.py           # Bulk token generation + DynamoDB registration
│   ├── check-response-origin.sh     # Diagnose error origin (OpenAI vs proxy)
│   ├── use-function-url.sh          # Migrate developer config to Function URL
│   ├── deploy-setup-cdn.sh          # Publish setup.sh via S3 + CloudFront
│   └── add-domain-cdn.sh            # Add custom domain to CloudFront distribution
├── docs/
│   ├── architecture.md              # Design decisions, flow diagrams
│   ├── troubleshooting.md           # 503 diagnosis, Windows setup, common issues
│   └── cost-estimation.md           # Per-service cost breakdown
├── template.yaml                    # SAM template (alternative deployment method)
├── .gitignore
└── LICENSE

Further Reading

  • Architecture -- Why Go, why Function URL, resilience design
  • Troubleshooting -- 503 errors, token issues, Windows config, error origin diagnosis
  • Cost Estimation -- Detailed per-service cost breakdown by usage scenario

License

MIT
