A lightweight HTTP proxy for OpenAI Codex CLI deployed on AWS that gives your organization full observability, per-developer usage tracking, cost attribution, and access control over Codex usage.
We ran a pilot of Codex CLI with a small group of developers. The results were very good -- faster iteration, higher output, measurable productivity gains. So we decided to roll it out to the entire engineering team.
In the initial setup, each developer received their own individual OpenAI API key. This gave us basic per-person usage metrics through OpenAI's dashboard -- we could see spend and model usage broken down by key. However, this approach had significant gaps:
- No content visibility -- we could see how much each person used, but not what they were sending. There was no way to audit prompts or responses.
- No centralized logging -- request/response data lived entirely on OpenAI's side, inaccessible for internal analysis or compliance.
- No secret detection -- developers could accidentally include API keys, credentials, or connection strings in their prompts, and we had no way to know or prevent it.
- Fragmented key management -- each developer holding their own key meant N keys to track, rotate, and revoke individually. Offboarding required remembering to deactivate the right key in OpenAI's dashboard.
We needed to centralize control while keeping the developer experience identical:
- Full request/response logging -- every prompt and response stored for auditing, cost analysis, and compliance
- Per-developer cost attribution -- which models are being used, how many tokens per person, estimated cost per developer
- Centralized access control -- enable/disable access per person instantly from a single place (onboarding/offboarding)
- A foundation for secret detection -- with all traffic flowing through our infrastructure, we can inspect and sanitize prompts before they reach OpenAI
- Transparency -- developers should notice zero difference in their Codex CLI experience (skills, MCPs, images, streaming all work)
This proxy solves all of that. It replaces the per-developer OpenAI keys with a single organization key stored securely in AWS, and gives each developer a proxy token instead. All traffic flows through the proxy, giving us full observability without affecting the developer experience.
The proxy already logs full request bodies to S3. The next planned step is a pre-flight sanitization agent that inspects prompts before they reach OpenAI and redacts or blocks requests containing secrets (API keys, credentials, connection strings, private tokens). This is critical for enterprise adoption -- as a company, we need to ensure developers aren't accidentally sending sensitive material to external APIs. The S3 log archive provides the foundation: a Lambda or Glue job can scan stored prompts for secret patterns (regex + entropy analysis), flag violations, and feed into the sanitization layer. The proxy architecture already supports inserting this step before the forwarding call -- it's a matter of implementing the detection logic and wiring it in.
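As a sketch of what that detection logic could look like (illustrative patterns and thresholds, not the planned implementation), a scanner might combine regexes for well-known key formats with a Shannon-entropy pass that catches long random-looking strings:

```python
import math
import re

# Hypothetical patterns for common secret shapes; a production list would
# be much longer and tuned against real false positives.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),      # OpenAI-style keys
    re.compile(r"AKIA[0-9A-Z]{16}"),           # AWS access key IDs
    re.compile(r"(?i)password\s*[:=]\s*\S+"),  # inline passwords
]

def shannon_entropy(s: str) -> float:
    """Bits per character; random tokens score higher than English prose."""
    if not s:
        return 0.0
    counts = {c: s.count(c) for c in set(s)}
    return -sum(n / len(s) * math.log2(n / len(s)) for n in counts.values())

def find_secrets(prompt: str, entropy_threshold: float = 4.5) -> list[str]:
    hits = [m.group(0) for p in SECRET_PATTERNS for m in p.finditer(prompt)]
    # Entropy pass: flag long unbroken tokens that look random.
    for word in prompt.split():
        if len(word) >= 24 and shannon_entropy(word) > entropy_threshold:
            hits.append(word)
    return hits
```

A sanitization layer could then redact each hit before forwarding, or reject the request outright.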
```
Developer's machine            AWS                       OpenAI

┌──────────────┐      ┌─────────────────────┐      ┌──────────────┐
│  Codex CLI   │─────>│ Lambda Function URL │─────>│  OpenAI API  │
│              │      │  ┌───────────────┐  │      │              │
│ OPENAI_BASE  │      │  │  Lambda (Go)  │  │      │  api.openai  │
│ _URL = proxy │      │  │               │  │      │  .com        │
│              │      │  │ 1. Validate   │  │      │              │
│ OPENAI_API   │      │  │    token      │  │      │              │
│ _KEY = token │      │  │ 2. Log to S3  │  │      │              │
│              │<─────│  │ 3. Forward    │  │      │              │
│              │      │  │ 4. Return     │  │      │              │
└──────────────┘      │  └───────────────┘  │      └──────────────┘
                      │          │          │
                      │   ┌──────┴──────┐   │
                      │   │  DynamoDB   │   │
                      │   │  (tokens)   │   │
                      │   ├─────────────┤   │
                      │   │  S3 (logs)  │   │
                      │   ├─────────────┤   │
                      │   │   Secrets   │   │
                      │   │   Manager   │   │
                      │   └─────────────┘   │
                      └─────────────────────┘
```
```mermaid
sequenceDiagram
    participant CLI as Codex CLI
    participant FnURL as Function URL
    participant Lambda as Lambda (Go)
    participant DDB as DynamoDB
    participant S3 as S3 Logs
    participant OpenAI as OpenAI API

    CLI->>FnURL: Request + Bearer <proxy_token>
    FnURL->>Lambda: Invoke (up to 120s timeout)
    Lambda->>DDB: Validate token → user_id, enabled
    DDB-->>Lambda: OK
    Lambda->>OpenAI: Forward request + real OpenAI API key
    OpenAI-->>Lambda: Response
    Lambda->>S3: Async write log (request + response)
    Lambda-->>FnURL: Response (unchanged)
    FnURL-->>CLI: Response
```
The proxy is fully transparent. The response from OpenAI reaches the developer unchanged. Skills, MCPs, image uploads, and streaming all work exactly as if the developer were calling OpenAI directly.
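The validate → forward → log → return sequence above can be condensed into a sketch (Python here for brevity; the real implementation is the Go Lambda, and all names below are illustrative). The token table, forwarder, and logger are injected so the control flow is visible:

```python
# Illustrative sketch of the proxy's request path. In the real Lambda the
# token table is DynamoDB, forward() calls the OpenAI API with the real key,
# and log() writes to S3.

def handle(request: dict, token_table: dict, forward, log) -> dict:
    entry = token_table.get(request["proxy_token"])
    if entry is None or not entry["enabled"]:
        # Unknown or disabled tokens are rejected before anything is forwarded.
        return {"status": 401, "body": "invalid or disabled token"}

    # Forward upstream with the real OpenAI key attached server-side;
    # the developer's machine never sees that key.
    response = forward(request["body"])

    # Record request + response for auditing (async in the real Lambda,
    # so it does not delay the response).
    log({"user_id": entry["user_id"],
         "request": request["body"],
         "response": response})

    return response  # returned unchanged, keeping the proxy transparent
```

Because the upstream response is returned unchanged, the CLI behaves exactly as if it had talked to OpenAI directly.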
| Resource | Name | Purpose |
|---|---|---|
| Lambda (Go) | codex-proxy | Core proxy: validates token, forwards to OpenAI, logs to S3 |
| Lambda Function URL | codex-proxy:live | Public HTTPS endpoint (native Lambda, no extra cost) |
| DynamoDB | codex-proxy-tokens | Token registry: token → user_id, enabled, created_at |
| S3 | codex-proxy-logs-<account-id> | Full request/response logs, partitioned by year/month/day/token/ |
| Secrets Manager | codex-proxy/openai-api-key | The real OpenAI API key (never exposed to developers) |
| IAM Role | codex-proxy-lambda-role | Least-privilege permissions for the Lambda |
| CloudWatch | /aws/lambda/codex-proxy | Lambda execution logs + optional alarms |
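For illustration, the year/month/day/token/ partitioning noted in the S3 row could be produced by a helper like this (hypothetical; the Go Lambda's exact key format may differ):

```python
from datetime import datetime, timezone
from typing import Optional

def log_key(token: str, request_id: str,
            now: Optional[datetime] = None) -> str:
    """Build an S3 object key partitioned by year/month/day/token/."""
    now = now or datetime.now(timezone.utc)
    return f"{now:%Y/%m/%d}/{token}/{request_id}.json"
```

Date-first prefixes make it cheap to scan one day's traffic, while the token segment keeps per-developer queries simple.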
- AWS CLI configured with permissions to create Lambda, DynamoDB, S3, IAM, Secrets Manager
- Go 1.21+ (to compile the Lambda binary)
- An OpenAI API key with access to the models you want developers to use
```bash
aws secretsmanager create-secret \
  --region us-east-1 \
  --name codex-proxy/openai-api-key \
  --secret-string "sk-proj-..."
```

Make the scripts executable and deploy:

```bash
chmod +x scripts/*.sh
./scripts/deploy.sh
```

What deploy.sh does:
- Compiles the Lambda (Go → Linux/amd64 binary)
- Creates the S3 bucket for logs (90-day lifecycle)
- Creates the DynamoDB table for tokens
- Creates the IAM role with least-privilege policy
- Creates or updates the Lambda function (512 MB, 120s timeout)
- Publishes a version and creates the live alias with provisioned concurrency (2)
- Creates a Lambda Function URL with CORS enabled
- Outputs the proxy URL (this is your OPENAI_BASE_URL)
Options:
```bash
./scripts/deploy.sh [logs-bucket-name] [openai-secret-name]

# Defaults:
#   Bucket: codex-proxy-logs-<account-id>
#   Secret: codex-proxy/openai-api-key
#   Region: $AWS_REGION or us-east-1
```

Why Function URL? Lambda Function URLs are a native HTTPS endpoint with no extra cost and no intermediate timeout limits. The request runs for up to the Lambda timeout (120s), which is enough for long OpenAI responses.
Edit the DEVELOPERS list in scripts/register-tokens.py with your team:
```python
DEVELOPERS = [
    (1, "Alice Smith", "alice@example.com"),
    (2, "Bob Johnson", "bob@example.com"),
    (3, "Carol Williams", "carol@example.com"),
    # ...
]
```

Then run:

```bash
# Preview only (generates tokens, prints table, no AWS calls):
python3 scripts/register-tokens.py --dry-run

# Register all tokens in DynamoDB:
python3 scripts/register-tokens.py
```

This creates a codex-tokens.json file (gitignored) with the token mappings and registers each one in DynamoDB.
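For reference, a token in the codexproxy_... format shown below can be minted with Python's secrets module; this is only a sketch of what a script like register-tokens.py might do (the actual token length may differ):

```python
import secrets

def new_proxy_token() -> str:
    # 32 random bytes -> 64 hex characters: ample entropy for a bearer
    # token, and immediately recognizable by its prefix.
    return "codexproxy_" + secrets.token_hex(32)
```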
To register a single token manually:
```bash
aws dynamodb put-item --region us-east-1 --table-name codex-proxy-tokens --item '{
  "token": {"S": "codexproxy_a1b2c3d4e5f6..."},
  "user_id": {"S": "developer@example.com"},
  "enabled": {"BOOL": true},
  "created_at": {"S": "2026-01-15T00:00:00Z"}
}'
```

First, update the PROXY_URL variable in scripts/setup.sh with your actual Function URL from Step 2.
Then give each developer their token and have them run:
Linux / macOS:
```bash
# If you host the script (see "CDN Distribution" below):
curl -sL https://your-domain.com/codex/setup.sh | bash

# Or directly from the repo:
bash scripts/setup.sh
```

Windows (PowerShell):

```powershell
powershell -ExecutionPolicy Bypass -File scripts\setup-windows.ps1
```

What the setup script does:
- Asks the developer to paste their personal proxy token
- Writes ~/.codex/config.toml with the proxy as the model provider
- Sets OPENAI_BASE_URL and OPENAI_API_KEY (which holds the proxy token, not an OpenAI key) in the shell profile
- Exports both variables for the current session
Manual setup (any OS):
Create ~/.codex/config.toml (or %USERPROFILE%\.codex\config.toml on Windows):
```toml
model_provider = "codexproxy"
forced_login_method = "api"

[model_providers.codexproxy]
name = "Codex Proxy"
base_url = "https://YOUR_FUNCTION_URL.lambda-url.REGION.on.aws/v1"
env_key = "OPENAI_API_KEY"  # Variable name Codex CLI reads; the value is your proxy token, not an OpenAI key
```

Set environment variables:

```bash
export OPENAI_BASE_URL="https://YOUR_FUNCTION_URL.lambda-url.REGION.on.aws"
export OPENAI_API_KEY="your-proxy-token"  # This is the proxy token, not an OpenAI API key
```

To verify the deployment, run:

```bash
./scripts/validate.sh
```

Checks all resources: Lambda invocations, CloudWatch logs, DynamoDB tokens, S3 logs, Secrets Manager.
Generate per-developer usage reports with token counts and estimated OpenAI costs:
```bash
# Today's usage
python3 scripts/usage-report.py

# A specific day
python3 scripts/usage-report.py 2026-02-16

# Date range (inclusive)
python3 scripts/usage-report.py 2026-02-01 2026-02-28
```

Sample output:
```
Codex Proxy Usage | 2026-02-16 to 2026-02-16 | bucket codex-proxy-logs-123456789012
Logs processed: 342 | Errors: 0

user_id            requests  input_tok  cached  output_tok  total_tok  cost_USD  models
---------------------------------------------------------------------------------------
alice@example.com        47     285400  142200       38200     323600    0.4821  gpt-5.3-codex:47
bob@example.com          31     198300   95100       27100     225400    0.3412  gpt-5.3-codex:31
carol@example.com        22      94200   41000       15800     110000    0.1580  gpt-5.1-codex-mini:22
```
A CSV file is also generated for spreadsheet analysis.
Model pricing is configured in PRICING_PER_1M at the top of the script -- update it if OpenAI changes prices.
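The arithmetic behind the cost_USD column is simple per-rate multiplication. A sketch with placeholder prices (not OpenAI's actual rates; use the real PRICING_PER_1M values), assuming cached tokens are a discounted subset of input tokens as in the OpenAI usage object:

```python
# Placeholder per-1M-token prices in USD; substitute real rates.
PRICING_PER_1M = {
    "example-model": {"input": 1.25, "cached": 0.125, "output": 10.0},
}

def estimate_cost(model: str, input_tok: int, cached_tok: int,
                  output_tok: int) -> float:
    p = PRICING_PER_1M[model]
    # Cached tokens are billed at the discounted rate, so they are
    # removed from the full-price input count.
    return ((input_tok - cached_tok) * p["input"]
            + cached_tok * p["cached"]
            + output_tok * p["output"]) / 1_000_000
```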
Disable a token (offboarding, policy violation, etc.):
```bash
aws dynamodb update-item --region us-east-1 --table-name codex-proxy-tokens \
  --key '{"token": {"S": "the-token"}}' \
  --update-expression "SET #e = :v" \
  --expression-attribute-names '{"#e": "enabled"}' \
  --expression-attribute-values '{":v": {"BOOL": false}}'
```

Re-enable a token:
```bash
aws dynamodb update-item --region us-east-1 --table-name codex-proxy-tokens \
  --key '{"token": {"S": "the-token"}}' \
  --update-expression "SET #e = :v" \
  --expression-attribute-names '{"#e": "enabled"}' \
  --expression-attribute-values '{":v": {"BOOL": true}}'
```

The developer gets a 401 immediately on the next request -- no key rotation needed.
Create CloudWatch alarms for the proxy:
```bash
# Create alarms (Lambda Throttles >= 1, Lambda Errors >= 3 in 5 min):
./scripts/create-alarms.sh

# With SNS notification (Slack, email, PagerDuty, etc.):
SNS_TOPIC_ARN="arn:aws:sns:us-east-1:ACCOUNT_ID:my-alerts" ./scripts/create-alarms.sh
```

When a developer reports an error (e.g., "high demand", "temporary errors"), determine whether the error comes from OpenAI or the proxy:

```bash
./scripts/check-response-origin.sh "high demand"
```

The script searches the S3 logs for that text in the response_body. If found, the error comes from OpenAI (upstream), not the proxy. See docs/troubleshooting.md for the full diagnostic flow.
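The search itself reduces to a substring scan over the stored log records; a toy version (plain dicts standing in for the S3 objects) looks like:

```python
def logs_containing(logs: list[dict], needle: str) -> list[dict]:
    """Return log records whose response_body contains the fragment.

    Any hit means the text came back from upstream, i.e. the error
    originated at OpenAI rather than in the proxy itself.
    """
    return [entry for entry in logs
            if needle in entry.get("response_body", "")]
```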
Host the setup script behind CloudFront with a custom domain so developers can onboard with a single command:
```bash
# 1. Publish setup.sh to S3 + CloudFront:
./scripts/deploy-setup-cdn.sh

# 2. (Optional) Add a custom domain:
./scripts/add-domain-cdn.sh codex.yourcompany.com

# 3. Developers run:
curl -sL https://codex.yourcompany.com/codex/setup.sh | bash
```

This is only the AWS infrastructure cost. OpenAI API usage is billed separately by OpenAI.
| Scenario | Requests/month | Approx. cost |
|---|---|---|
| Low | ~20,000 (~1k/dev) | $2-4/month |
| Moderate | ~60,000 (~3k/dev) | $6-10/month |
| High | ~150,000 (~7.5k/dev) | $14-22/month |
Based on ~20 developers in us-east-1. Provisioned concurrency adds ~$5-7/month. See docs/cost-estimation.md for the per-service breakdown.
```
codex-proxy/
├── lambda/                      # Proxy Lambda source (Go)
│   ├── main.go                  # Token validation, forwarding, S3 logging
│   ├── go.mod
│   └── go.sum
├── scripts/
│   ├── deploy.sh                # Full deployment (AWS CLI, no SAM required)
│   ├── setup.sh                 # Developer onboarding (Linux/macOS)
│   ├── setup-windows.ps1        # Developer onboarding (Windows)
│   ├── validate.sh              # Health check across all resources
│   ├── create-alarms.sh         # CloudWatch alarms (throttles, errors)
│   ├── usage-report.py          # Per-developer usage + cost report → CSV
│   ├── register-tokens.py       # Bulk token generation + DynamoDB registration
│   ├── check-response-origin.sh # Diagnose error origin (OpenAI vs proxy)
│   ├── use-function-url.sh      # Migrate developer config to Function URL
│   ├── deploy-setup-cdn.sh      # Publish setup.sh via S3 + CloudFront
│   └── add-domain-cdn.sh        # Add custom domain to CloudFront distribution
├── docs/
│   ├── architecture.md          # Design decisions, flow diagrams
│   ├── troubleshooting.md       # 503 diagnosis, Windows setup, common issues
│   └── cost-estimation.md       # Per-service cost breakdown
├── template.yaml                # SAM template (alternative deployment method)
├── .gitignore
└── LICENSE
```
- Architecture -- Why Go, why Function URL, resilience design
- Troubleshooting -- 503 errors, token issues, Windows config, error origin diagnosis
- Cost Estimation -- Detailed per-service cost breakdown by usage scenario
MIT