Evaluate MCP servers with the Model Context Protocol Benchmark Runner
cli benchmarking evaluations benchmarks ai-tools anthropic llm-agents mcp-server swe-bench claude-code-plugin claude-code-skills cybergym
Updated Jan 28, 2026 - Python