Toolkit Cost and Latency Optimizer

A lightweight, production-hardened CLI for analyzing LLM inference logs, validating schema compliance, and simulating routing policies.

This repository also hosts the Cost Optimization Engine service (formerly a standalone tool) under services/cost-optimization-engine.

What It Does

Validate JSONL logs against the Toolkit inference event schema.
Summarize cost, latency, and success rates per model.
Recommend a default model based on SLO constraints.
Simulate tier routing policies over historical logs.

Install

pip install -e .

Quick Start

# Validate logs
toolkit-opt validate --input logs.jsonl

# Summarize per model
toolkit-opt summarize --input logs.jsonl

# Recommend a default model (based on p95, success rate, and sample count)
toolkit-opt recommend --input logs.jsonl --max-p95-ms 3000 --min-success 0.99 --min-samples 50

# Simulate a tier policy
toolkit-opt simulate --input logs.jsonl --policy policy.json

Use --verbose to enable DEBUG logging:

toolkit-opt --verbose summarize --input logs.jsonl
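
The thresholds passed to recommend can be read as a filter over per-model statistics. The following is a minimal sketch of that idea, not the tool's actual implementation: the nearest-rank p95, the tie-break on lowest mean cost, and the function names p95 and recommend are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def p95(values):
    """Nearest-rank 95th percentile (an assumed method; the tool may differ)."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


def recommend(events, max_p95_ms, min_success, min_samples):
    """Return the cheapest model that meets every threshold, or None."""
    by_model = defaultdict(list)
    for event in events:
        by_model[event["model"]].append(event)

    candidates = []
    for model, rows in by_model.items():
        if len(rows) < min_samples:
            continue  # too few samples to trust the statistics
        success_rate = sum(1 for r in rows if r["success"]) / len(rows)
        latency_p95 = p95([r["latency_ms"] for r in rows])
        mean_cost = sum(r["cost_usd"] for r in rows) / len(rows)
        if latency_p95 <= max_p95_ms and success_rate >= min_success:
            candidates.append((mean_cost, model))

    # Assumed tie-break: prefer the lowest mean cost among qualifying models.
    return min(candidates)[1] if candidates else None
```

Here events is a list of dicts shaped like the log records described in the next section.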

Input Log Schema (JSONL)

Each line in the log file must be a JSON object with the following required fields:

schema_version (int): must be 1
created_ts (number): timestamp in seconds
model (string)
latency_ms (number, >= 0)
cost_usd (number, >= 0)
success (bool)

Optional fields supported by the tool:

tier (string): used by policy simulation
tokens_in (int, >= 0)
tokens_out (int, >= 0)

Example:

{"schema_version": 1, "created_ts": 1700000000.0, "model": "gpt-4", "latency_ms": 1200, "cost_usd": 0.045, "success": true, "tier": "premium"}
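
The field rules above can be checked line by line. This is a rough sketch of such a check, written only from the field list in this README; it is not the validator that toolkit-opt ships, and the name check_event and the wording of the messages are hypothetical.

```python
import json

REQUIRED = {
    "schema_version": int,
    "created_ts": (int, float),
    "model": str,
    "latency_ms": (int, float),
    "cost_usd": (int, float),
    "success": bool,
}
OPTIONAL = {"tier": str, "tokens_in": int, "tokens_out": int}
NON_NEGATIVE = ("latency_ms", "cost_usd", "tokens_in", "tokens_out")


def check_event(line):
    """Return a list of problems for one JSONL line (empty list means valid)."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(event, dict):
        return ["line is not a JSON object"]

    problems = []
    for name in REQUIRED:
        if name not in event:
            problems.append(f"missing required field: {name}")

    for name, expected in {**REQUIRED, **OPTIONAL}.items():
        if name not in event:
            continue
        value = event[name]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_bool = isinstance(value, bool)
        ok = is_bool if expected is bool else (isinstance(value, expected) and not is_bool)
        if not ok:
            problems.append(f"wrong type for {name}")
        elif name in NON_NEGATIVE and value < 0:
            problems.append(f"{name} must be >= 0")

    if "schema_version" in event and event["schema_version"] != 1:
        problems.append("schema_version must be 1")
    return problems
```

Collecting every problem instead of stopping at the first one keeps the report useful when a log line has several issues at once.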

Policy File (JSON)

Policy files are JSON objects with a default model and optional tier overrides:

{
  "default_model": "gpt-3.5-turbo",
  "tiers": {
    "premium": "gpt-4",
    "fast": "gpt-3.5-turbo"
  }
}
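
One plausible reading of this policy is a lookup with a fallback: an event whose tier appears under tiers is routed to that model, and every other event goes to default_model. The sketch below illustrates only that lookup. The helper resolve_model is hypothetical, and how simulate actually scores a policy against historical logs is not described here.

```python
import json

POLICY_TEXT = """
{
  "default_model": "gpt-3.5-turbo",
  "tiers": {
    "premium": "gpt-4",
    "fast": "gpt-3.5-turbo"
  }
}
"""


def resolve_model(policy, event):
    """Pick the model for an event: the tier override if one exists, else the default."""
    tiers = policy.get("tiers", {})  # "tiers" is optional in the policy file
    return tiers.get(event.get("tier"), policy["default_model"])


policy = json.loads(POLICY_TEXT)
print(resolve_model(policy, {"tier": "premium"}))  # gpt-4
print(resolve_model(policy, {"tier": "unknown"}))  # gpt-3.5-turbo
print(resolve_model(policy, {}))                   # gpt-3.5-turbo
```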

Output

All commands emit JSON to stdout, so their output can be piped directly into other tools.

Exit Codes

0: success
2: CLI or input validation error
3: unexpected error
4: schema validation failed
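
A calling script can branch on these codes. The sketch below is one way to do that from Python, using the validate command shown in Quick Start. The names ExitCode, describe, and run_validate are hypothetical, and because this README does not say whether failing commands still print JSON, the stdout parsing is deliberately tolerant.

```python
import json
import subprocess
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    INPUT_ERROR = 2       # CLI or input validation error
    UNEXPECTED_ERROR = 3
    SCHEMA_INVALID = 4    # schema validation failed


def describe(returncode):
    """Map a return code to a name, tolerating codes not listed above."""
    try:
        return ExitCode(returncode).name
    except ValueError:
        return f"UNKNOWN({returncode})"


def run_validate(path):
    """Run `toolkit-opt validate` and return (status name, parsed stdout or None)."""
    proc = subprocess.run(
        ["toolkit-opt", "validate", "--input", path],
        capture_output=True,
        text=True,
    )
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        payload = None  # assumption: stdout may be empty or non-JSON on failure
    return describe(proc.returncode), payload
```

For example, a CI step could call run_validate("logs.jsonl") and fail the build on any status other than SUCCESS.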

Safety Notes

Input files must be regular files (no symlinks).
Log inputs are restricted to .jsonl; policy inputs to .json.
Maximum file size is 1 GB.
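
Callers that want to fail fast can run similar checks before invoking the CLI. The following is a sketch of such a pre-flight check under stated assumptions, not the tool's own code: "1 GB" is read as 10**9 bytes, the extension match is case-sensitive, and check_input_path is a hypothetical helper.

```python
import os
import stat

MAX_BYTES = 1_000_000_000  # assumption: "1 GB" read as 10**9 bytes


def check_input_path(path, allowed_suffix):
    """Return None if the path passes the checks, else a short reason string."""
    if not path.endswith(allowed_suffix):
        return f"expected a {allowed_suffix} file"
    try:
        info = os.lstat(path)  # lstat does not follow symlinks
    except OSError as exc:
        return f"cannot stat file: {exc.strerror}"
    if stat.S_ISLNK(info.st_mode):
        return "symlinks are not allowed"
    if not stat.S_ISREG(info.st_mode):
        return "not a regular file"
    if info.st_size > MAX_BYTES:
        return "file exceeds the size limit"
    return None
```

Using os.lstat rather than os.stat matters here: os.stat would follow a symlink and report on its target, so the symlink check would never trigger.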

License

MIT. See LICENSE.

Cost Optimization Engine (Service)

The FastAPI cost optimization service has been consolidated here.

Location: services/cost-optimization-engine
Start the API:

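cd services/cost-optimization-engine
python -m venv .venv
# Windows (cmd.exe); on macOS/Linux use: . .venv/bin/activate
.venv\Scripts\activate
pip install -e ".[dev]"

# Windows (cmd.exe); on macOS/Linux use: export DATABASE_URL=...
set DATABASE_URL=postgresql://user:pass@localhost:5432/cost_optimization_engine
toolkit-cost-optimizer serve --host 0.0.0.0 --port 8005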

Migration Note

If you previously deployed production-ready/cost-optimization-engine, update paths to:

services/cost-optimization-engine
