@pingSubhajit
Contributor

Summary

This release introduces a new evaluation harness battery for measuring retrieval quality and fixes URL processing for image embeddings.

Changes

New Feature: Evaluation Harness Battery (#21)

Adds a comprehensive evaluation framework for measuring and tracking retrieval quality over time:

  • Eval Runner: Deterministic evaluation runner with support for CI thresholds and baseline comparisons
  • Dataset Management: Utilities for creating, loading, and managing evaluation datasets
  • Metrics: Built-in retrieval metrics including precision, recall, MRR, and NDCG
  • Report Generation: JSON report output with diff comparison against baseline runs
  • CLI Integration: New npx unrag add battery eval command that scaffolds the eval infrastructure
  • Preset Support: Batteries are now included in preset installations
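
For reviewers less familiar with the metrics listed above, here is a minimal TypeScript sketch of how each is commonly computed for a single query with binary relevance. The function names are hypothetical and are not taken from the battery's code. The shipped implementation may differ, for example in how it handles graded relevance or duplicate IDs.

```typescript
// Illustrative only: hypothetical helper names, not the battery's exports.
// `ranked` holds retrieved document IDs in rank order (assumed unique);
// `relevant` is the ground-truth set of relevant IDs for the query.

// Fraction of the top-k results that are relevant.
function precisionAtK(ranked: string[], relevant: Set<string>, k: number): number {
  if (k <= 0) return 0;
  const hits = ranked.slice(0, k).filter((id) => relevant.has(id)).length;
  return hits / k;
}

// Fraction of all relevant documents that appear in the top-k results.
function recallAtK(ranked: string[], relevant: Set<string>, k: number): number {
  if (relevant.size === 0) return 0;
  const hits = ranked.slice(0, k).filter((id) => relevant.has(id)).length;
  return hits / relevant.size;
}

// Reciprocal rank of the first relevant result (0 if none is found).
// Averaging this value over all queries in a dataset gives MRR.
function reciprocalRank(ranked: string[], relevant: Set<string>): number {
  const index = ranked.findIndex((id) => relevant.has(id));
  return index === -1 ? 0 : 1 / (index + 1);
}

// Binary-relevance NDCG@k: discounted gain of the top-k results,
// normalized by the gain of an ideal ordering.
function ndcgAtK(ranked: string[], relevant: Set<string>, k: number): number {
  const dcg = ranked
    .slice(0, k)
    .reduce((sum, id, i) => sum + (relevant.has(id) ? 1 / Math.log2(i + 2) : 0), 0);
  const idealHits = Math.min(relevant.size, k);
  let idcg = 0;
  for (let i = 0; i < idealHits; i++) idcg += 1 / Math.log2(i + 2);
  return idcg === 0 ? 0 : dcg / idcg;
}
```

All four functions are pure, which fits a deterministic runner: the same ranked list and ground truth always yield the same score.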

Documentation added:

  • Getting started guide
  • Dataset creation and management
  • Available metrics reference
  • Running evaluations
  • Comparing runs and baselines
  • CI integration guide
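
As a rough illustration of the CI threshold and baseline comparison ideas covered by these guides, the sketch below diffs a run's aggregate scores against a baseline and lists any metrics that fall below a configured minimum. The types and function names are assumptions made for this example, not the battery's actual report schema or API.

```typescript
// Illustrative only: hypothetical types, not the battery's report format.
type MetricScores = Record<string, number>;

interface MetricDiff {
  metric: string;
  baseline: number | null; // null when the baseline run did not record this metric
  current: number;
  delta: number | null; // current minus baseline, or null without a baseline value
}

// Compare the current run against a baseline, one entry per current metric.
function diffAgainstBaseline(current: MetricScores, baseline: MetricScores): MetricDiff[] {
  return Object.keys(current)
    .sort() // sorted keys keep the output order deterministic
    .map((metric): MetricDiff => {
      const now = current[metric] as number;
      const before = baseline[metric];
      const base = before === undefined ? null : before;
      return { metric, baseline: base, current: now, delta: base === null ? null : now - base };
    });
}

// Return one message per metric that is missing or below its minimum.
function checkThresholds(current: MetricScores, minimums: MetricScores): string[] {
  const failures: string[] = [];
  for (const metric of Object.keys(minimums).sort()) {
    const min = minimums[metric] as number;
    const value = current[metric];
    if (value === undefined) {
      failures.push(`${metric}: missing from current run (minimum ${min})`);
    } else if (value < min) {
      failures.push(`${metric}: ${value} is below minimum ${min}`);
    }
  }
  return failures;
}
```

In a CI job, a wrapper script would typically exit with a non-zero status when `checkThresholds` returns a non-empty array, so that a regression in retrieval quality fails the build.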

Note: This feature is marked as experimental.

Bug Fix: Image Embed URL Fetch (#20)

Fixed an issue where URL processing within image embeddings was not properly routed through the existing fetch policy.
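
To illustrate what routing a URL fetch through a policy can look like in general, here is a minimal sketch. This description does not spell out the contents of the existing fetch policy, so the `FetchPolicy` fields, the function names, and the injected `fetcher` are all hypothetical and should not be read as the project's actual API.

```typescript
// Illustrative only: a hypothetical policy shape, not the project's real one.
interface FetchPolicy {
  allowedProtocols: string[]; // e.g. ["https:"]
  allowedHosts: string[] | null; // null means any host is acceptable
  maxBytes: number; // upper bound on the downloaded payload size
}

// The network call is injected so the policy check stays testable offline.
type ByteFetcher = (url: string) => Promise<Uint8Array>;

// Return a human-readable reason when the URL violates the policy, else null.
function rejectionReason(rawUrl: string, policy: FetchPolicy): string | null {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return "invalid URL";
  }
  if (policy.allowedProtocols.indexOf(url.protocol) === -1) {
    return `protocol ${url.protocol} not allowed`;
  }
  if (policy.allowedHosts !== null && policy.allowedHosts.indexOf(url.hostname) === -1) {
    return `host ${url.hostname} not allowed`;
  }
  return null;
}

// Every image URL goes through the same check before any bytes are fetched.
async function fetchImageBytes(
  rawUrl: string,
  policy: FetchPolicy,
  fetcher: ByteFetcher,
): Promise<Uint8Array> {
  const reason = rejectionReason(rawUrl, policy);
  if (reason !== null) throw new Error(`blocked by fetch policy: ${reason}`);
  const bytes = await fetcher(rawUrl);
  if (bytes.byteLength > policy.maxBytes) {
    throw new Error(`response exceeds ${policy.maxBytes} bytes`);
  }
  return bytes;
}
```

The point of the pattern is that no code path reaches the network without first passing the policy check, which is the kind of gap a fix like this one closes.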

Files Changed

  • 41 files changed
  • ~4,800 lines added
  • ~370 lines removed (spec file moved/cleaned up)

Testing

  • Added test coverage for eval battery CLI scaffolding
  • Added tests for eval dataset, metrics, and runner thresholds
  • Added tests for image embed URL fetch behavior
  • Added init command tests

* feat: add eval harness battery with docs
* docs: mark eval feature as experimental
* chore: remove spec for eval harness feature
* fix: documentation drift for eval feature
* feat: include batteries in preset installs
pingSubhajit self-assigned this on Jan 9, 2026
pingSubhajit merged commit b8f6dd6 into main on Jan 9, 2026
3 checks passed
pingSubhajit deleted the release/v0.2.9 branch on January 9, 2026 20:43