# Releases · kmcallorum/prompt-optimizer
## v0.3.6
## v0.3.5
- Fix Prometheus metrics port to 9090 in demo script
## v0.3.4
- Add pytest-agents integration for test organization
- Add demo script with mock data and Prometheus metrics
## v0.3.2
## v0.3.1
### What's New
- Added auto-publish to PyPI on release (GitHub Actions workflow)
## v0.3.0 - Prometheus Metrics
prompt-optimizer-cli v0.3.0
### Installation
```bash
pip install prompt-optimizer-cli
```
PyPI: https://pypi.org/project/prompt-optimizer-cli/
### New Feature: Prometheus Metrics
Built-in observability for production deployments with Prometheus metrics.
### CLI Usage
```bash
# Start metrics server
prompt-optimizer metrics --port 8000

# Metrics available at http://localhost:8000/metrics
```
### Python API
```python
from prompt_optimizer import init_metrics, start_http_server, optimize_prompt

# Initialize and start metrics server
init_metrics()
start_http_server(8000)

# Run optimizations - metrics are automatically recorded
results = optimize_prompt(...)
```
### Available Metrics
| Metric | Description |
|---|---|
| `prompt_optimizer_optimizations_total` | Total optimization runs |
| `prompt_optimizer_optimization_duration_seconds` | Optimization duration histogram |
| `prompt_optimizer_variants_evaluated_total` | Variants evaluated |
| `prompt_optimizer_test_cases_run_total` | Test cases run |
| `prompt_optimizer_llm_requests_total` | LLM API requests |
| `prompt_optimizer_llm_tokens_total` | Tokens used (input/output) |
| `prompt_optimizer_llm_cost_usd_total` | Total cost in USD |
| `prompt_optimizer_best_variant_score` | Best variant score (gauge) |
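As a quick sanity check that the exporter is serving these metrics, you can fetch the endpoint and filter for the `prompt_optimizer_` prefix. This is a minimal sketch using only the Python standard library; the metric names come from the table above, and the port assumes the `start_http_server(8000)` call shown earlier.

```python
from urllib.request import urlopen

# Fetch the Prometheus text-format output from the local exporter
# (assumes the server was started with start_http_server(8000)).
with urlopen("http://localhost:8000/metrics") as resp:
    body = resp.read().decode("utf-8")

# Print only the prompt-optimizer samples, skipping # HELP / # TYPE lines.
for line in body.splitlines():
    if line.startswith("prompt_optimizer_"):
        print(line)
```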
### Full Changelog
- Add prometheus-client dependency
- Create metrics module with comprehensive metrics
- Instrument core.py with automatic metrics recording
- Add `prompt-optimizer metrics` CLI command
- Export `init_metrics` and `start_http_server` from package
- Published to PyPI as `prompt-optimizer-cli`
## v0.2.0 - LLM-as-Judge
prompt-optimizer v0.2.0
### New Feature: LLM-as-Judge
Use any LLM as a judge for AI-powered evaluation instead of rule-based scoring.
### CLI Usage
```bash
# Use GPT-4 as judge while testing with Claude
prompt-optimizer optimize prompt.yaml \
  --test-cases tests.yaml \
  --llm claude-sonnet-4 \
  --judge gpt-4o
```
### Python API
```python
from prompt_optimizer import optimize_prompt

results = optimize_prompt(
    prompt=my_prompt,
    test_cases=test_cases,
    llm="claude-sonnet-4",
    judge_llm="gpt-4o",  # AI-based evaluation
)
```
### Evaluation Criteria
The LLM judge evaluates responses on five criteria (a scoring sketch follows the list):
- **accuracy** - how well the response matches the expected output
- **relevance** - how on-topic the response is
- **coherence** - how well-structured and logical the response is
- **completeness** - whether all aspects of the prompt are addressed
- **conciseness** - whether the response is appropriately brief
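How the per-criterion scores roll up into a single number is not specified in these notes, so the following is a hypothetical illustration rather than the library's actual scoring code. It assumes the judge returns a 0.0-1.0 score per criterion and averages them with equal weights.

```python
# Hypothetical illustration only - not the library's actual scoring logic.
# Assumes the judge returns a score in [0.0, 1.0] for each criterion.
CRITERIA = ["accuracy", "relevance", "coherence", "completeness", "conciseness"]

def overall_score(scores: dict[str, float]) -> float:
    """Average the per-criterion judge scores with equal weights."""
    return sum(scores[c] for c in CRITERIA) / len(CRITERIA)

print(overall_score({
    "accuracy": 0.9, "relevance": 1.0, "coherence": 0.8,
    "completeness": 0.7, "conciseness": 0.95,
}))  # 0.87
```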
### Full Changelog
- Add `llm_judge.py` module with `LLMJudge` class
- Support `--judge` CLI option for `test` and `optimize` commands
- Update `core.py` to support `judge_llm` parameter
- Export `LLMJudge` from package
- Add comprehensive tests for LLM judge
- Update README with LLM-as-judge documentation
## v0.1.0 - Initial Release
prompt-optimizer v0.1.0
The first release of prompt-optimizer: a CLI tool and Python library for optimizing LLM prompts through systematic testing, version control, and performance metrics.
### Features
- **CLI Commands**: `init`, `test`, `optimize`, `compare`, `history`, `report`, `show`
- **Python Library**: Import and use programmatically with `from prompt_optimizer import optimize_prompt`
- **Multi-LLM Support**: Anthropic Claude, OpenAI GPT, and local Ollama models
- **Quality Metrics**: Score outputs on accuracy, conciseness, and keyword inclusion
- **Version Control**: Track prompt evolution with history and diffs
- **Optimization Strategies**: `concise`, `detailed`, `cot`, `structured`, `few_shot`
- **Reporting**: Generate HTML, JSON, or terminal reports
### Installation
```bash
pip install -e .
```
### Quick Start
```bash
prompt-optimizer init
prompt-optimizer optimize prompts/example.yaml --test-cases tests/example_tests.yaml --llm claude-sonnet-4
```
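The CLI quick start has a programmatic equivalent via the Python API shown in the v0.2.0 notes above. This is a minimal sketch assuming only the `optimize_prompt` parameters that appear elsewhere in these notes; the prompt text and the test-case structure are hypothetical, since the release notes do not document how the YAML files map to Python objects.

```python
from prompt_optimizer import optimize_prompt

# Illustrative inputs only - the test-case shape is an assumption,
# not documented in these release notes.
my_prompt = "Summarize the following text in two sentences: {text}"
test_cases = [
    {"input": "LLM prompts often benefit from systematic testing.",
     "expected": "Systematic testing improves prompts."},
]

results = optimize_prompt(
    prompt=my_prompt,
    test_cases=test_cases,
    llm="claude-sonnet-4",
)
print(results)  # inspect the scored variants
```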
### Quality
- 99% test coverage
- Full type checking (mypy)
- Ruff linting
- Snyk security scanning