LLM & RAG evaluation testing framework✨
A production-grade framework for evaluating Large Language Model (LLM) responses across multiple metric families: classical NLP metrics, semantic similarity, RAG-specific metrics, and LLM-as-a-Judge evaluations.
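A minimal sketch of what combining two of these metric families might look like, assuming the `rouge-score` and `sentence-transformers` packages; the function name `evaluate_response` and the model choice are illustrative, not the framework's actual API:

```python
# Illustrative sketch only, not this repository's real interface.
from rouge_score import rouge_scorer
from sentence_transformers import SentenceTransformer, util


def evaluate_response(reference: str, candidate: str) -> dict:
    """Score a model response against a reference with two metric families."""
    # Classical NLP metric: ROUGE-L F1 measures lexical overlap.
    rouge = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    rouge_l = rouge.score(reference, candidate)["rougeL"].fmeasure

    # Semantic similarity: cosine similarity between sentence embeddings.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode([reference, candidate], convert_to_tensor=True)
    semantic = util.cos_sim(emb[0], emb[1]).item()

    return {"rouge_l": rouge_l, "semantic_similarity": semantic}


if __name__ == "__main__":
    scores = evaluate_response(
        reference="The capital of France is Paris.",
        candidate="Paris is France's capital city.",
    )
    print(scores)
```

Reporting both scores side by side is useful because lexical and embedding-based metrics often disagree on paraphrases, which is exactly the gap RAG-specific and LLM-as-a-Judge metrics are meant to cover.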
Create synthetic datasets for RAG evaluation.
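One common way to build such a dataset is to have an LLM generate question-answer pairs grounded in source documents. Below is a minimal sketch using the official `openai` client (assuming `OPENAI_API_KEY` is set in the environment); the model name, prompt wording, and `generate_qa_pairs` helper are assumptions for illustration, not code from this repository:

```python
# Illustrative sketch of synthetic QA-pair generation for RAG evaluation.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate_qa_pairs(document: str, n: int = 3) -> list[dict]:
    """Ask an LLM to produce question/answer pairs grounded in `document`."""
    prompt = (
        f"Generate {n} question-answer pairs answerable only from the text "
        "below. Respond with a JSON array of objects, each with 'question' "
        f"and 'answer' keys, and nothing else.\n\nText:\n{document}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; substitute your own
        messages=[{"role": "user", "content": prompt}],
    )
    # Assumes the model returns valid JSON; production code should validate.
    return json.loads(response.choices[0].message.content)


if __name__ == "__main__":
    doc = "The Eiffel Tower was completed in 1889 and stands 330 metres tall."
    for pair in generate_qa_pairs(doc):
        print(pair["question"], "->", pair["answer"])
```

Each generated pair, together with its source document, then serves as a labeled example: the question drives retrieval, the document is the expected context, and the answer is the reference for the metrics above.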