The MLPerf of AI Truth Systems
EPB (Epistemic Pathology Benchmark) is a comprehensive benchmark for evaluating epistemic integrity in AI systems. It measures four critical pathologies that affect AI truthfulness and reliability:
- Mirror Loop: Collapse in recursive self-refinement
- Confabulation: Fabrication and persistence of false information
- Violation State: Refusal contamination of benign prompts
- Echo Chamber: Synthetic drift and self-reinforcement
pip install epb-benchmarkOr install from source:
git clone https://github.com/Course-Correct-Labs/epb-benchmark.git
cd epb-benchmark
pip install -e .- Initialize a configuration file:
epb init-config- Edit
epb_config.yamlto set your model and API key:
adapter:
provider: "openai" # or "anthropic"
model_name: "gpt-4"
api_key_env: "OPENAI_API_KEY"- Set your API key:
export OPENAI_API_KEY="your-api-key-here"- Run the benchmark:
epb run --config epb_config.yaml- Score the results:
epb score --run-dir runs/YYYYMMDD_HHMMSSEPB evaluates four distinct pathologies, each with an explicit metric:
Measures stability in recursive self-refinement. Models are asked to iteratively critique and improve their own outputs. Collapse occurs when the model gets stuck in repetitive patterns.
Score: 0-100 (higher is better)
Measures fabrication of false information and its persistence after challenge. Models are asked unanswerable questions, then challenged on their answers.
Score: 0-100 (higher is better, less persistent confabulation)
Measures refusal contamination after seeing disallowed content. Models receive a violation request (which should be refused), followed by benign requests.
Score: 0-100 (higher is better, less contamination)
Measures semantic drift through iterative summarization. Models repeatedly summarize their own outputs, and drift is measured using TF-IDF cosine similarity.
Score: 0-100 (higher is better, less drift)
The overall EPB Truth score is a weighted average of the four sub-scores (default: equal weighting).
Certification Levels:
- Platinum: 95+
- Gold: 85+
- Silver: 70+
- Bronze: 50+
EPB v1 includes:
- 20 Mirror Loop prompts
- 30 Confabulation questions
- 10 Violation State sequences
- 10 Echo Chamber scenarios
Total: 70 test tasks designed for quality over quantity.
EPB v1.2 updates the Confabulation Persistence metric. v1.0 incorrectly penalized models that correctly refused to fabricate but mentioned real facts (like years or proper nouns) while explaining why something doesn't exist. v1.2 uses explicit initial_correct labels for each confab example and counts persistence only when a fabricated or incorrect initial answer is defended under challenge.
Key changes:
- Added
results/confab_initial_labels.jsoncontaining LLM-judged labels for each initial answer - Persistence denominator is now only examples where
initial_correct == false - Models that correctly refuse to answer are no longer penalized
Submit your results to the public leaderboard:
export EPB_LEADERBOARD_URL="https://epb.coursecorrect.org/api"
export EPB_API_KEY="your-leaderboard-api-key"
epb submit --results runs/YYYYMMDD_HHMMSS/results.jsonView the leaderboard at: https://epb.coursecorrect.org
EPB is designed to be:
- Model-agnostic: Works with any LLM through simple adapters
- Reproducible: Explicit metrics and deterministic scoring
- Extensible: Easy to add new batteries and adapters
- Transparent: Open-source specifications and scoring code
Out of the box, EPB supports:
- OpenAI GPT models (GPT-4, GPT-3.5, etc.)
- Anthropic Claude models
To add support for other models, implement the ModelClient interface in epb/adapters/.
If you use EPB in your research, please cite:
@software{epb2025,
title = {EPB: Epistemic Pathology Benchmark},
author = {Course Correct Labs},
year = {2025},
url = {https://github.com/Course-Correct-Labs/epb-benchmark}
}We welcome contributions! Please see our contributing guidelines.
Areas for contribution:
- New model adapters
- Additional test tasks
- Improved scoring heuristics
- Bug fixes and documentation
MIT License - see LICENSE for details.
EPB is developed by Course Correct Labs, a research organization focused on epistemic integrity in AI systems.
Related work:
- GitHub Issues: Report bugs or request features
- Documentation: docs/
- Contact: hello@coursecorrect.org