feat: improve discoverability for registries and AI agents (#62)
- Lead README with "first testing tool that is itself an MCP server"
- Add npm downloads badge and Smithery badge
- Add smithery.yaml for Smithery registry listing
- Update server.json to v0.8.2 with agent-optimized description
- Rewrite all MCP tool descriptions for agent self-discovery
- Add keywords: mcp-server, ai-agent, ai-tools, developer-tools, ci-cd, schema-drift
- Update MCP Server Mode section with "when to use" table
- Bold "Works as MCP server" in comparison table
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
 Find problems in your MCP servers before your users do.
+**The first testing tool that is itself an MCP server.** Your AI agent can scan, test, record, replay, and verify other MCP servers autonomously — catching regressions, schema drift, and security issues without human intervention.
 
-You update a server, a tool silently breaks, and your agent starts failing. MCP Observatory catches that. It connects to your servers, checks every capability, actually calls tools to make sure they work, and diffs runs to catch what changed.
+Use it as a CLI, a CI action, or give it to your agent as an MCP server and let it test your other servers for you.
@@ -164,19 +166,25 @@ The action runs checks on every PR, comments a markdown report, and blocks merge
 
 ## MCP Server Mode
 
-When running as an MCP server (`serve`), your AI agent gets the same capabilities as the CLI:
+**No other testing tool is itself an MCP server.** Add Observatory as a server and your AI agent can autonomously test, diagnose, and monitor your other MCP servers.
 
-| Tool | What it does |
-|------|-------------|
-| `scan` | Discover and check all configured servers |
-| `check_server` | Check a specific server by command |
-| `record` | Record a server session to a cassette file |
-| `replay` | Replay a cassette offline — no live server needed |
-| `verify` | Verify a live server still matches a cassette |
-| `watch` | Run checks and diff against the previous run |
-| `diff_runs` | Compare two saved run artifacts |
-| `get_last_run` | Return the most recent run for a target |
-| `suggest_servers` | Scan your environment and recommend servers you're missing |
+```bash
+claude mcp add mcp-observatory -- npx -y @kryptosai/mcp-observatory serve
+```
+
+Your agent gets 9 tools:
+
+| Tool | When to use it |
+|------|---------------|
+| `scan` | Check if all your configured MCP servers are healthy |
+| `check_server` | Test a specific server before installing or after updating |
+| `record` | Capture a baseline of a working server for future comparison |
+| `replay` | Test against a recorded session — no live server needed |
+| `verify` | Confirm a server update didn't break anything |
+| `watch` | Check a server and see what changed since the last check |
+| `diff_runs` | Find regressions between two check results |
+| `get_last_run` | Retrieve previous check results for a server |
+| `suggest_servers` | Discover MCP servers that match your project stack |
 
 An AI tool that checks other AI tools. It's a tool testing tools that serve tools.*
 
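For readers unfamiliar with how an agent reaches these tools: an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `check_server`. The `buildToolCall` helper and the id counter are illustrative names, not part of Observatory; the `command` and `deep` arguments are taken from the tool descriptions in this commit.

```typescript
// Sketch: the JSON-RPC 2.0 envelope an MCP client sends to invoke a tool.
// `buildToolCall` is a hypothetical helper used only for illustration.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  // Each request gets a fresh id so responses can be matched to requests.
  return { jsonrpc: "2.0", id: nextId++, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall("check_server", {
  command: "npx -y @modelcontextprotocol/server-everything",
  deep: true,
});
console.log(JSON.stringify(request, null, 2));
```

In practice the agent's MCP client builds this message itself; the sketch only shows what travels over the wire.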
@@ -259,7 +267,7 @@ npx @kryptosai/mcp-observatory run --target ./target.json
 | Benchmarking / latency | — | — | ✅ | — |
 | Jest integration | — | — | — | ✅ |
 | MCP proxy mode | — | ✅ | — | — |
-| Works as MCP server | ✅ | — | — | — |
+| **Works as MCP server** | **✅** | — | — | — |
 
 Each tool has strengths. Observatory focuses on regression detection and CI-friendly workflows. mcp-recorder is great as a transparent proxy. MCPBench is the go-to for performance benchmarking. mcp-jest is ideal if you're already in a Jest workflow.
"description": "The first testing tool that is itself an MCP server. AI agents can scan, test, record, replay, and verify other MCP servers autonomously — catching regressions, schema drift, and security issues without human intervention.",
src/server.ts: 10 additions & 10 deletions
@@ -97,7 +97,7 @@ export async function startServer(): Promise<void> {
 
   server.tool(
     "scan",
-    "Auto-discover MCP servers from config files and run checks against each one. Returns a summary of tools/prompts/resources status for every discovered server.",
+    "Use this to check if all your MCP servers are healthy. Auto-discovers servers from Claude config files, connects to each one, and verifies tools/prompts/resources respond correctly. Use with deep=true to also invoke tools and confirm they actually execute. Returns pass/fail status for every server.",
     {
       config: z.string().optional().describe("Path to a specific MCP config file. If omitted, scans default locations."),
       deep: z.boolean().optional().describe("Also invoke safe tools to verify they execute."),
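The discovery step described for `scan` can be sketched as follows. This is a minimal illustration, assuming the common `mcpServers` config layout used by Claude clients; Observatory's real discovery code is not shown in this diff, and `listConfiguredServers` is a hypothetical name.

```typescript
// Sketch: extract launchable servers from an MCP client config.
// Assumes the `{ "mcpServers": { name: { command, args } } }` layout.
interface ServerEntry {
  name: string;
  command: string;
  args: string[];
}

function listConfiguredServers(configJson: string): ServerEntry[] {
  const parsed = JSON.parse(configJson) as {
    mcpServers?: Record<string, { command?: string; args?: string[] }>;
  };
  const servers = parsed.mcpServers ?? {};
  return Object.entries(servers)
    // Skip entries without a command; they cannot be launched over stdio.
    .filter(([, entry]) => typeof entry.command === "string")
    .map(([name, entry]) => ({ name, command: entry.command as string, args: entry.args ?? [] }));
}

const sampleConfig = JSON.stringify({
  mcpServers: {
    everything: { command: "npx", args: ["-y", "@modelcontextprotocol/server-everything"] },
  },
});
console.log(listConfiguredServers(sampleConfig));
```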
@@ -136,7 +136,7 @@ export async function startServer(): Promise<void> {
 
   server.tool(
     "check_server",
-    "Run checks against a specific MCP server by command. Example: check_server({ command: 'npx -y @modelcontextprotocol/server-everything' })",
+    "Use this to test a specific MCP server before installing or after updating it. Launches the server by command, checks all capabilities, and saves a run artifact for future comparison. Example: check_server({ command: 'npx -y @modelcontextprotocol/server-everything' }). Use deep=true to invoke tools, security=true to analyze schemas for vulnerabilities.",
     {
       command: z.string().describe("The command to launch the MCP server (e.g. 'npx -y @modelcontextprotocol/server-everything')."),
       args: z.array(z.string()).optional().describe("Additional arguments for the command."),
@@ -174,7 +174,7 @@ export async function startServer(): Promise<void> {
 
   server.tool(
     "score_server",
-    "Score an MCP server's health (0-100) including protocol compliance, schema quality, security, reliability, and performance. Returns grade A-F with detailed breakdown.",
+    "Use this to get a quick health grade for an MCP server. Runs all checks (capabilities, tool invocation, security) and returns a 0-100 score with A-F grade and detailed breakdown across protocol compliance, schema quality, security, reliability, and performance.",
     {
       command: z.string().describe("The command to launch the MCP server."),
       args: z.array(z.string()).optional().describe("Additional arguments for the command."),
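To make the "0-100 score with A-F grade" wording concrete, here is one way such a grade could be derived from a per-category breakdown. The equal weighting and the 90/80/70/60 cutoffs are assumptions made for illustration; the diff does not show how Observatory actually weights or grades the five categories.

```typescript
// Sketch: combine per-category scores into an overall score and letter grade.
// Equal weights and the grade cutoffs below are assumed, not taken from the source.
type Category = "protocol" | "schema" | "security" | "reliability" | "performance";
type Grade = "A" | "B" | "C" | "D" | "F";

function gradeFor(score: number): Grade {
  if (score >= 90) return "A";
  if (score >= 80) return "B";
  if (score >= 70) return "C";
  if (score >= 60) return "D";
  return "F";
}

function scoreServer(breakdown: Record<Category, number>): { score: number; grade: Grade } {
  const values = Object.values(breakdown);
  // Unweighted mean of the category scores, rounded to a whole number.
  const score = Math.round(values.reduce((sum, v) => sum + v, 0) / values.length);
  return { score, grade: gradeFor(score) };
}

console.log(scoreServer({ protocol: 100, schema: 90, security: 80, reliability: 70, performance: 60 }));
```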
@@ -224,7 +224,7 @@ export async function startServer(): Promise<void> {
 
   server.tool(
     "diff_runs",
-    "Compare two run artifact files and return the diff showing regressions, recoveries, and schema drift.",
+    "Use this to find what changed between two server checks. Compares two run artifacts and surfaces regressions (things that broke), recoveries (things that got fixed), schema drift (added/removed/changed tool parameters), and gate status changes. Essential after updating a server.",
     {
       base: z.string().describe("Path to the base run artifact JSON file."),
       head: z.string().describe("Path to the head run artifact JSON file."),
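The regression/recovery distinction in the new `diff_runs` description can be illustrated with a small comparison over two runs. The `RunArtifact` shape here (a map from check id to pass/fail) is a simplification assumed for the sketch; the real artifact format is not visible in this diff.

```typescript
// Sketch: classify per-check status changes between a base and a head run.
// `RunArtifact` is a simplified, assumed shape.
type Status = "pass" | "fail";
type RunArtifact = { checks: Record<string, Status> };

function diffRuns(base: RunArtifact, head: RunArtifact): { regressions: string[]; recoveries: string[] } {
  const regressions: string[] = [];
  const recoveries: string[] = [];
  for (const [id, headStatus] of Object.entries(head.checks)) {
    const baseStatus = base.checks[id];
    // A regression passed before and fails now; a recovery is the reverse.
    if (baseStatus === "pass" && headStatus === "fail") regressions.push(id);
    if (baseStatus === "fail" && headStatus === "pass") recoveries.push(id);
  }
  return { regressions, recoveries };
}

const baseRun: RunArtifact = { checks: { tools: "pass", prompts: "fail", resources: "pass" } };
const headRun: RunArtifact = { checks: { tools: "fail", prompts: "pass", resources: "pass" } };
console.log(diffRuns(baseRun, headRun));
```

Checks that appear only in the head run are ignored here; a fuller version would report them separately.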
@@ -260,7 +260,7 @@ export async function startServer(): Promise<void> {
 
   server.tool(
     "get_last_run",
-    "Return the most recent run artifact for a given target ID. Searches the default runs directory.",
+    "Use this to retrieve the last check results for a server. Finds the most recent run artifact by target ID so you can review previous results or diff against a new run.",
     {
       targetId: z.string().describe("The target ID to find the last run for (e.g. server name or command)."),
     },
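The "most recent run artifact by target ID" lookup amounts to a filter followed by a sort on creation time. The sketch below works on an in-memory list; the `RunRecord` fields and the `getLastRun` helper are assumptions for illustration, since the real tool searches a runs directory whose layout is not shown here.

```typescript
// Sketch: pick the newest run for a target from a list of run records.
// Field names are assumed; timestamps are ISO 8601 strings.
interface RunRecord {
  targetId: string;
  createdAt: string;
  path: string;
}

function getLastRun(runs: RunRecord[], targetId: string): RunRecord | undefined {
  return runs
    .filter((run) => run.targetId === targetId) // filter returns a copy, so the input is not reordered
    .sort((a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt))[0];
}

const history: RunRecord[] = [
  { targetId: "everything", createdAt: "2025-01-01T10:00:00Z", path: "runs/a.json" },
  { targetId: "everything", createdAt: "2025-01-02T10:00:00Z", path: "runs/b.json" },
  { targetId: "other", createdAt: "2025-01-03T10:00:00Z", path: "runs/c.json" },
];
console.log(getLastRun(history, "everything"));
```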
@@ -305,7 +305,7 @@ export async function startServer(): Promise<void> {
 
   server.tool(
     "suggest_servers",
-    "Gather context about the current environment to help recommend MCP servers. Returns currently configured servers, detected languages/frameworks/databases/services, and available servers from the MCP registry.",
+    "Use this when setting up a project or wondering what MCP servers to add. Scans the working directory for languages, frameworks, databases, and cloud providers, lists currently configured servers, and cross-references the MCP registry to recommend servers you're missing.",
     {
       cwd: z.string().optional().describe("Working directory to scan for environment signals. Defaults to process.cwd()."),
     },
@@ -412,7 +412,7 @@ export async function startServer(): Promise<void> {
 
   server.tool(
     "record",
-    "Record a live MCP server session to a cassette file. The cassette captures all JSON-RPC traffic and can be replayed offline or used to verify future server versions.",
+    "Use this to capture a baseline of a working MCP server. Records all JSON-RPC traffic to a cassette file that can be replayed offline (no server needed) or used to verify future versions haven't broken anything. Like VCR for MCP.",
     {
       command: z.string().describe("The command to launch the MCP server."),
       args: z.array(z.string()).optional().describe("Additional arguments for the command."),
@@ -457,7 +457,7 @@ export async function startServer(): Promise<void> {
 
   server.tool(
     "replay",
-    "Replay a cassette file offline — no live server needed. Runs all checks against the recorded responses.",
+    "Use this to test a server without running it. Replays a previously recorded cassette offline and runs all checks against the recorded responses. Useful in CI or when the live server is unavailable.",
     {
       cassette: z.string().describe("Path to a cassette JSON file."),
     },
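The record/replay idea ("Like VCR for MCP") can be sketched as a lookup table keyed by request. The `Cassette` shape and `createReplayer` helper below are assumed for illustration and do not reflect Observatory's actual cassette file format.

```typescript
// Sketch: answer requests from a recorded cassette instead of a live server.
// The cassette layout is an assumed simplification.
interface Exchange {
  request: { method: string; params?: unknown };
  response: unknown;
}
type Cassette = { exchanges: Exchange[] };

function keyOf(method: string, params: unknown): string {
  // Note: JSON.stringify is sensitive to property order, so params must be
  // serialized the same way at record time and at replay time.
  return `${method}:${JSON.stringify(params ?? null)}`;
}

function createReplayer(cassette: Cassette): (method: string, params?: unknown) => unknown {
  const recorded = new Map<string, unknown>();
  for (const { request, response } of cassette.exchanges) {
    recorded.set(keyOf(request.method, request.params), response);
  }
  return (method, params) => {
    const key = keyOf(method, params);
    if (!recorded.has(key)) throw new Error(`No recorded response for ${key}`);
    return recorded.get(key);
  };
}

const cassette: Cassette = {
  exchanges: [{ request: { method: "tools/list" }, response: { tools: [{ name: "echo" }] } }],
};
const replay = createReplayer(cassette);
console.log(replay("tools/list"));
```

Throwing on an unrecorded request keeps a replayed run from silently passing when the checks ask for something the cassette never captured.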
@@ -510,7 +510,7 @@ export async function startServer(): Promise<void> {
 
   server.tool(
     "verify",
-    "Verify a live server still matches a recorded cassette. Connects to the server, replays the same requests, and compares responses.",
+    "Use this after updating a server to confirm nothing broke. Connects to the live server, sends the same requests from a recorded cassette, and compares responses. Reports exactly what changed — added tools, removed parameters, different response shapes.",
     {
       cassette: z.string().describe("Path to a cassette JSON file."),
       command: z.string().describe("The command to launch the MCP server."),
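The kind of report `verify` promises (added tools, removed parameters) can be illustrated by comparing two tool listings. This sketch reduces each tool to a list of parameter names, which is an assumed simplification; `detectDrift` is a hypothetical helper, and real MCP tool schemas are full JSON Schema objects.

```typescript
// Sketch: compare recorded vs. live tool listings for schema drift.
// Each tool is reduced to its parameter names for illustration.
type ToolSchemas = Record<string, string[]>;

function detectDrift(recorded: ToolSchemas, live: ToolSchemas) {
  const addedTools = Object.keys(live).filter((tool) => !(tool in recorded));
  const removedTools = Object.keys(recorded).filter((tool) => !(tool in live));
  const removedParams: Record<string, string[]> = {};
  for (const [tool, params] of Object.entries(recorded)) {
    const liveParams = live[tool];
    if (!liveParams) continue; // tool is gone entirely; already reported in removedTools
    const missing = params.filter((param) => !liveParams.includes(param));
    if (missing.length > 0) removedParams[tool] = missing;
  }
  return { addedTools, removedTools, removedParams };
}

const recordedTools: ToolSchemas = { echo: ["message", "uppercase"], add: ["a", "b"] };
const liveTools: ToolSchemas = { echo: ["message"], add: ["a", "b"], subtract: ["a", "b"] };
console.log(detectDrift(recordedTools, liveTools));
```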
@@ -554,7 +554,7 @@ export async function startServer(): Promise<void> {
 
   server.tool(
     "watch",
-    "Run checks against a server repeatedly and report when results change. Returns the initial check and starts monitoring. Note: in MCP server mode this runs a single check and diff against the previous run rather than a persistent loop.",
+    "Use this to check a server and see what changed since the last check. Runs all checks, saves the result, and diffs against the previous run for the same target. Shows regressions, recoveries, and schema drift in one call.",
     {
       command: z.string().describe("The command to launch the MCP server."),
       args: z.array(z.string()).optional().describe("Additional arguments for the command."),