Benchmark Script for local models #19

@LeeSinLiang

Summary
Add scripts/benchmark.py to compare local HF models on a fixed diff corpus.

Scope

  • Inputs: a small corpus of representative diffs (JSONL).
  • Outputs: a CSV with the columns model, tokens_in, tokens_out, latency_ms, provider_device.
  • Optional: save a plot to docs/benchmarks/.
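
To make the input and output formats concrete, here is a minimal sketch of reading a JSONL corpus and writing the CSV. The column names come from the list above; the `diff` key in each JSONL record, the helper names `load_corpus` and `write_results`, and the sample row are assumptions for illustration only, not measured results.

```python
import csv
import io
import json

# Column names are taken from the issue text.
FIELDNAMES = ["model", "tokens_in", "tokens_out", "latency_ms", "provider_device"]


def load_corpus(lines):
    """Parse JSONL: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def write_results(rows, fh):
    """Write benchmark rows as CSV, preceded by a header row."""
    writer = csv.DictWriter(fh, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Assumed record shape: {"diff": "<unified diff text>"}.
corpus = load_corpus(['{"diff": "--- a/x.py\\n+++ b/x.py\\n+print(1)"}', ""])

# Placeholder row to show the layout; the values are not real measurements.
rows = [{"model": "example-model", "tokens_in": 12, "tokens_out": 8,
         "latency_ms": 153.2, "provider_device": "cpu"}]

buf = io.StringIO()
write_results(rows, buf)
csv_text = buf.getvalue()
```

In the real script the `io.StringIO` buffer would be replaced by an open file handle, for example one opened with `newline=""` as the `csv` module recommends.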

Tasks

  • Implement benchmark harness with warmup and N runs.
  • Detect the device (CUDA/MPS/CPU) and any memory optimization flags in use.
  • Add sample corpus + instructions.
  • Document results format in README.
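
The harness and device-detection tasks could be sketched roughly as follows. This assumes PyTorch is the backend and falls back to `cpu` when it is not installed; `detect_device`, `benchmark`, and `fake_generate` are hypothetical names, and memory optimization flags are left out because the issue does not specify them.

```python
import time


def detect_device():
    """Return "cuda", "mps", or "cpu"; "cpu" if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def benchmark(generate, prompts, warmup=1, runs=3):
    """Time generate(prompt) for every prompt, `runs` times over.

    Warmup passes execute the same calls but are not recorded, so one-off
    costs such as lazy initialization do not skew the measurements.
    """
    for _ in range(warmup):
        for prompt in prompts:
            generate(prompt)
    latencies_ms = []
    for _ in range(runs):
        for prompt in prompts:
            start = time.perf_counter()
            generate(prompt)
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def fake_generate(prompt):
    """Stand-in for a real local model call (assumption for this sketch)."""
    return prompt.upper()


device = detect_device()
latencies = benchmark(fake_generate, ["diff one", "diff two"], warmup=1, runs=3)
```

Passing the model call in as a plain callable keeps the timing loop independent of any particular model library. When timing on CUDA, a real harness would likely also need to synchronize the device before reading the clock, since GPU work is queued asynchronously.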

Acceptance criteria

  • Running the script produces a CSV and a short summary line per model.
  • Works without cloud keys (local models only by default).
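
One possible shape for the per-model summary line is sketched below. The exact format (run count, mean, and median latency) and the function name `summarize` are assumptions, since the issue only asks for "a short summary line per model"; the sample rows are made-up inputs.

```python
import statistics
from collections import defaultdict


def summarize(rows):
    """Return one summary string per model, sorted by model name."""
    by_model = defaultdict(list)
    for row in rows:
        by_model[row["model"]].append(float(row["latency_ms"]))
    lines = []
    for model in sorted(by_model):
        lat = by_model[model]
        lines.append(
            f"{model}: runs={len(lat)} "
            f"mean={statistics.mean(lat):.1f}ms "
            f"median={statistics.median(lat):.1f}ms"
        )
    return lines


# Made-up rows for illustration, not benchmark results.
sample_rows = [
    {"model": "model-a", "latency_ms": 100},
    {"model": "model-a", "latency_ms": 200},
    {"model": "model-a", "latency_ms": 300},
    {"model": "model-b", "latency_ms": 50},
]
summary = summarize(sample_rows)
for line in summary:
    print(line)
```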
