MCP Inspector Assessment (Experimental, Archived)

An experimental fork of Anthropic's MCP Inspector that explored automated stress-testing and security assessment of MCP servers.

Status: Archived

This project is no longer actively developed. It served as an experimental exploration of whether automated injection testing could effectively validate MCP server security. The findings from this work informed subsequent approaches to MCP server review.

What This Was

This project extended the MCP Inspector with an automated assessment framework that attempted to validate MCP servers by:

Connecting to a running server and invoking each tool with injection payloads (command injection, SQL injection, path traversal, SSRF, code execution)
Analyzing tool responses for evidence of actual code execution vs safe data reflection
Scoring servers across functionality, security, error handling, documentation, and protocol compliance
Providing a CLI for CI/CD integration alongside the web UI

The assessment engine included 18 modules, 31 attack patterns, temporal/rug-pull detection, cross-tool chain testing, and a weighted scoring rubric.

What We Learned

The core hypothesis was that automated injection testing could catch real MCP server vulnerabilities. After extensive testing — including building a dedicated vulnerability testbed — we found that this approach has fundamental limitations:

The real attack surface is the LLM, not the tool's input handler. MCP exploits primarily happen through the model's interaction with tool descriptions and outputs — prompt injection via hidden instructions in descriptions, context poisoning through tool outputs, and social engineering through misleading metadata. Sending whoami to a tool doesn't test any of this.
Injection testing only catches what it's designed for. The testbed tools were purpose-built to be vulnerable to injection payloads (eval(), subprocess.run(shell=True)). Real-world MCP servers rarely have these obvious patterns, and the ones that do are easily caught by static code analysis without needing to invoke tools.
AI-based review outperforms automated testing for this domain. Having an LLM read tool descriptions, source code, and manifests — and judge them holistically — catches a broader class of issues (AUP violations, misleading descriptions, undisclosed telemetry, social engineering) that automated testing structurally cannot detect.
Behavioral testing adds proof to findings already made by static analysis. In testing against a 59-tool vulnerability testbed, every tool flagged by injection testing was also identifiable from source code patterns. The behavioral evidence was confirmatory, not additive.

These findings led to a shift toward AI-based review pipelines for MCP server validation, where the model evaluates servers the way it would actually interact with them.

Repository Structure

This is a monorepo with three workspaces:

client/ — React frontend (Vite, TypeScript, Tailwind)
server/ — Express backend (TypeScript)
cli/ — Command-line assessment runner (TypeScript)

Original Repository: https://github.com/modelcontextprotocol/inspector

Running Locally

If you want to explore the assessment framework:

git clone https://github.com/triepod-ai/inspector-assessment.git
cd inspector-assessment
npm install
npm run build
npm run dev          # Web UI at http://localhost:6274

CLI assessment:

npm run assess -- --server <name> --config <path-to-config.json>

Acknowledgments

This project builds on the foundation provided by Anthropic's MCP Inspector team. The original inspector remains the recommended tool for MCP server debugging and development: https://github.com/modelcontextprotocol/inspector

License

This project is licensed under the MIT License — see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 2,935 Commits
.github		.github
.husky		.husky
cli		cli
client		client
docs		docs
scripts		scripts
server		server
.dockerignore		.dockerignore
.env		.env
.env.example		.env.example
.git-blame-ignore-revs		.git-blame-ignore-revs
.gitattributes		.gitattributes
.gitignore		.gitignore
.mcp.json		.mcp.json
.node-version		.node-version
.npmignore		.npmignore
.npmrc		.npmrc
.prettierignore		.prettierignore
.prettierrc		.prettierrc
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
PROJECT_STATUS.md		PROJECT_STATUS.md
Profile.pdf		Profile.pdf
README.md		README.md
SECURITY.md		SECURITY.md
UPSTREAM_SYNC.md		UPSTREAM_SYNC.md
enhanced-detection-v2.ts		enhanced-detection-v2.ts
inspector.code-workspace		inspector.code-workspace
mcp-inspector.png		mcp-inspector.png
mcp-stdio-harness.ts		mcp-stdio-harness.ts
package-lock.json		package-lock.json
package.json		package.json
run-assess-full.cjs		run-assess-full.cjs
sample-config.json		sample-config.json
test-usability-enhancement.js		test-usability-enhancement.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MCP Inspector Assessment (Experimental, Archived)

Status: Archived

What This Was

What We Learned

Repository Structure

Running Locally

Acknowledgments

License

About

Uh oh!

Releases 3

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

MCP Inspector Assessment (Experimental, Archived)

Status: Archived

What This Was

What We Learned

Repository Structure

Running Locally

Acknowledgments

License

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages