Repository Guidelines

Project Structure & Module Organization

searchbench/cli.py is the CLI entry point (installed as the searchbench command). Core logic lives in searchbench/ (providers/, runner.py, judge.py, queries/, reporter.py). Reports and history are written to results/. Configuration templates live in .env.example, and timeouts in config.toml. Dependencies are pinned in requirements.txt.

Build, Test, and Development Commands

./.venv/bin/python -m pip install -r requirements.txt - install runtime dependencies.
./scripts/searchbench run - full evaluation run using the public query set.
./scripts/searchbench run --queries hard - run the hard, evidence-gated benchmark.
./scripts/searchbench quick - 10-query smoke check.
./scripts/searchbench history - list recent benchmark runs.
./scripts/searchbench summary - show the latest run summary table.
./scripts/searchbench report - open the latest HTML report.
./scripts/searchbench calibrate - suggest timeouts from historical latency data.
./scripts/searchbench debug --provider exa --queries hard --count 5 - dump raw provider responses for diagnostics.
Optional: python3 -m pip install -e . to enable the searchbench CLI.

Coding Style & Naming Conventions

Use 4-space indentation and standard Python naming: snake_case for functions and variables, CapWords for classes. Keep type hints where present and follow existing async patterns (async def, await). When adding providers, implement Provider from searchbench/providers/base.py and register via @register.

Testing Guidelines

Run tests with ./.venv/bin/python -m unittest discover -s tests. Validate changes with the CLI using ./scripts/searchbench quick before running a full benchmark.

Commit & Pull Request Guidelines

Commit messages are short and imperative; many use a lightweight scope prefix like docs: or cli:. For PRs, include a clear summary, commands run, and any new env vars added to .env.example. Attach a screenshot or pasted output for UI or visualization changes.

Security & Configuration Tips

Store API keys in .env and never commit it. Keep .env.example and README up to date when adding new providers or settings. Use config.toml for timeout overrides.

Landing the Plane (Session Completion)

When ending a work session, you MUST complete ALL steps below. Work is NOT complete until git push succeeds.

MANDATORY WORKFLOW:

File issues for remaining work - Create issues for anything that needs follow-up
Run quality gates (if code changed) - Tests, linters, builds
Update issue status - Close finished work, update in-progress items

PUSH TO REMOTE - This is MANDATORY:

git pull --rebase
bd sync
git push
git status  # MUST show "up to date with origin"

Clean up - Clear stashes, prune remote branches
Verify - All changes committed AND pushed
Hand off - Provide context for next session

CRITICAL RULES:

Work is NOT complete until git push succeeds
NEVER stop before pushing - that leaves work stranded locally
NEVER say "ready to push when you are" - YOU must push
If push fails, resolve and retry until it succeeds

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository Guidelines

Project Structure & Module Organization

Build, Test, and Development Commands

Coding Style & Naming Conventions

Testing Guidelines

Commit & Pull Request Guidelines

Security & Configuration Tips

Landing the Plane (Session Completion)

FilesExpand file tree

AGENTS.md

Latest commit

History

AGENTS.md

File metadata and controls

Repository Guidelines

Project Structure & Module Organization

Build, Test, and Development Commands

Coding Style & Naming Conventions

Testing Guidelines

Commit & Pull Request Guidelines

Security & Configuration Tips

Landing the Plane (Session Completion)