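An evaluation platform for large language models that supports multiple evaluation benchmarks, custom datasets, and performance testing. It also supports RAG evaluation on custom datasets.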
Updated Aug 20, 2025 - Python