All notable changes to MCP Observatory will be documented in this file.
- CLI/MCP parity tests — automated tests verifying CLI and MCP server produce equivalent check results, artifact structures, and tool coverage.
- tools-invoke unit tests — 12 tests covering `isSafeToInvoke`, `stubFromSchema`, and `runToolsInvokeCheck` with mocked clients.
- HTTP adapter tests — 6 tests covering connection failures, auth tokens, headers, timeouts, and recording mode.
- CLI entrypoint tests — 18 tests covering `--version`, `--help`, all subcommand help pages, `--format json/markdown`, fixture server runs, and error exits.
- MCP `diff_runs` `format` param — accepts `"json"` or `"markdown"` (default), closing the last CLI/MCP output format parity gap.
- Intentional differences documented — README and living test document all CLI/MCP parity gaps with explanations.
- MCP server: command allowlist — only `npx`, `node`, `python`, `python3`, `uvx`, `docker`, `deno`, and `bun` are permitted as base executables. Arbitrary command execution is blocked. Use the CLI for unrestricted commands.
- GitHub Action: eliminate shell injection — all variable expansions now use bash arrays and quoted parameters. PR comments use `--body-file` instead of inline `--body` to prevent content injection.
- MCP server: path validation — `diff_runs`, `get_last_run`, `replay`, and `verify` file paths are constrained to the runs/cassettes directory. `suggest_servers` `cwd` is constrained to the process working directory subtree.
- Stderr buffer cap — adapter stderr collection capped at 500 lines to prevent unbounded memory growth.
- MCP server: `deep` and `security` params — `check_server` and `scan` tools now accept `deep` (invoke safe tools) and `security` (run security analysis) boolean parameters, closing the CLI/MCP parity gap.
- MCP server: request logging — all tool calls log method name, status, and duration to stderr for observability.
- 17 new security tests — command allowlist, path traversal, and prefix-matching attack coverage.
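
As a rough illustration of the command allowlist and path validation entries above, the checks might look something like this TypeScript sketch. The function names `isAllowedCommand` and `isInsideDir` are hypothetical and not taken from the MCP Observatory source; only the list of permitted executables comes from this changelog.

```typescript
import * as path from "node:path";

// Executables named in the allowlist entry; everything else here is illustrative.
const ALLOWED_COMMANDS = new Set([
  "npx", "node", "python", "python3", "uvx", "docker", "deno", "bun",
]);

// Hypothetical helper: accept only a bare executable name from the allowlist.
function isAllowedCommand(command: string): boolean {
  // Reject anything containing a path separator so "/tmp/node" cannot pass as "node".
  if (command.includes("/") || command.includes("\\")) return false;
  return ALLOWED_COMMANDS.has(command);
}

// Hypothetical helper: true when `candidate` resolves to `baseDir` or a path beneath it.
function isInsideDir(baseDir: string, candidate: string): boolean {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, candidate);
  const rel = path.relative(base, target);
  // Comparing via path.relative avoids the prefix-matching pitfall where
  // "/data/runs-evil" would pass a naive startsWith("/data/runs") check.
  if (rel === "") return true;
  if (path.isAbsolute(rel)) return false; // e.g. a different drive on Windows
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}
```

With these helpers, `isAllowedCommand("bash")` and `isInsideDir("/data/runs", "../secrets.json")` would both be rejected, while `isInsideDir("/data/runs", "sub/run.json")` would be accepted.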
- `get_last_run` MCP tool no longer accepts a custom `runsDir` parameter (security: prevents arbitrary directory reads).
- Security scanning — `--security` flag analyzes tool schemas for shell injection, broad filesystem access, permissive schemas, and credential leakage in responses
- GitHub Action — composite action for CI pipelines (`KryptosAI/mcp-observatory/action@main`), comments markdown reports on PRs
- Public dashboard — static HTML generator with server health table, SVG badges, trend visualization, and API JSON endpoint
- Matrix history tracking (last 90 runs) with trend dots on dashboard
- 14 new security-focused tests
- `scan deep` now enables security scanning by default
- full CLI/MCP server feature parity — every CLI command is now available as an MCP tool
- `suggest` command and MCP tool for environment-aware MCP server recommendations
- interactive arrow-key menu when invoked with no arguments
- q-quit and arrow key scrolling in interactive menu
- Glama MCP server card badge added to README
- interactive menu when invoked with no command
- help examples alignment for `npx` prefix
- record/replay/verify: VCR-style testing for MCP servers
- cassette-based session capture and offline replay
- `verify` command to check a live server against a recorded cassette
- natural language commands: `scan deep`, `diff a b`, `watch`, `test`
- flags replaced with positional words for better first-run experience
- MCP server mode via `serve` command
- `suggest_servers` tool: scans your project and recommends MCP servers from the registry
- `test` command for single-server testing
- server compatibility matrix documentation
- inline commands for `run` and `check`
- scan output redesigned for instant time-to-value
- bold ASCII art logo on scan and help
- exit code 1 on failed runs
- copy-pasteable tip formatting
- HTTP/SSE adapter with streamable-http fallback
- HTML and Markdown report generation
- tool invocation checks (safe tools with no required params)
- schema drift detection via `diff`
- auto-discovery of MCP servers from Claude config files
- package published as `@kryptosai/mcp-observatory` on npm
- README rewritten for clarity and first impressions
- packed-install verification that proves the CLI works from a release tarball
- real-server coverage matrix with checked-in artifacts
- release automation for npm publishing on tagged releases
- README repositioned around install proof and real evidence
- initial CLI with `run`, `diff`, and `report`
- stable `1.0.0` artifact schema with top-level `gate`
- local-process adapter built on the official MCP TypeScript SDK
- fixture server, sample artifacts, and Markdown reporting
- real-server smoke coverage for filesystem, everything, and ref-tools servers