## Problem

Need cost/quality comparison when evaluating DeepSeek vs Haiku vs Ollama.

## Acceptance Criteria

- [ ] `llmint compare <run1> <run2>` compares two bench runs
- [ ] Reports: cost, solve rate, cost per solve
- [ ] Markdown table output

---

*Generated by /forge cascade*
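As a rough illustration of the three reported metrics and the Markdown table output, here is a minimal Python sketch. It is not the `llmint` implementation: the `BenchRun` class, its field names, and the `compare_runs` function are all hypothetical, and it assumes each bench run can be summarized as a total cost plus solved and attempted counts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchRun:
    """Hypothetical summary of one bench run (field names are assumptions)."""
    name: str
    total_cost_usd: float
    solved: int
    attempted: int

    @property
    def solve_rate(self) -> float:
        # Fraction of attempted tasks that were solved; 0.0 for an empty run.
        return self.solved / self.attempted if self.attempted else 0.0

    @property
    def cost_per_solve(self) -> Optional[float]:
        # Undefined when nothing was solved, so return None instead of dividing by zero.
        return self.total_cost_usd / self.solved if self.solved else None


def compare_runs(run1: BenchRun, run2: BenchRun) -> str:
    """Render cost, solve rate, and cost per solve for two runs as a Markdown table."""
    rows = [
        "| Run | Cost (USD) | Solve rate | Cost per solve (USD) |",
        "| --- | ---: | ---: | ---: |",
    ]
    for run in (run1, run2):
        cps = run.cost_per_solve
        cps_text = f"{cps:.4f}" if cps is not None else "n/a"
        rows.append(
            f"| {run.name} | {run.total_cost_usd:.4f} | {run.solve_rate:.1%} | {cps_text} |"
        )
    return "\n".join(rows)
```

Returning `None` for cost per solve when a run solved nothing keeps a free local run (for example Ollama at zero cost) from showing a misleading `0.0000`; the table prints `n/a` instead.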