
Shift data aggregation from Ollama pipeline to MCP host (Claude) #65

Merged
valITino merged 3 commits into main from claude/optimize-mcp-performance-QgTSn on Mar 7, 2026

Conversation

valITino (Owner) commented Mar 7, 2026

Summary

This PR fundamentally restructures how blhackbox processes pentest data. Instead of sending raw tool outputs to a 3-agent Ollama preprocessing pipeline (Ingestion → Processing → Synthesis), the MCP host (Claude Code, Claude Desktop, or ChatGPT) now handles parsing, deduplication, correlation, and structuring directly. The Ollama pipeline is moved to an optional legacy fallback (--profile ollama).

Key Changes

Core Architecture

  • Reduced default stack from 9 containers to 4 (kali-mcp, wire-mcp, screenshot-mcp, portainer)
  • Ollama pipeline (ollama-mcp, agent-ingestion, agent-processing, agent-synthesis, ollama) now optional via --profile ollama
  • Reduced RAM requirement from 16 GB to 8 GB for core stack; 16 GB+ only needed if using Ollama pipeline

MCP Server Changes (blhackbox/mcp/server.py)

  • Added two new tools:
    • aggregate_results(payload) — validates and persists the AggregatedPayload JSON, stores to disk, optionally syncs to Neo4j
    • get_payload_schema() — returns the AggregatedPayload JSON schema so Claude knows the expected structure
  • Updated get_template description to reflect that Claude handles aggregation directly
  • Removed references to process_scan_results() (Ollama pipeline method)
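As a rough illustration of what `get_payload_schema()` returns, here is a minimal sketch using a stand-in Pydantic model (the field names are hypothetical; the real model lives in `blhackbox/models/aggregated_payload.py`):

```python
from pydantic import BaseModel, Field

# Hypothetical stand-in for the real AggregatedPayload; fields are illustrative.
class Host(BaseModel):
    address: str
    open_ports: list[int] = Field(default_factory=list)

class AggregatedPayload(BaseModel):
    session_id: str
    hosts: list[Host] = Field(default_factory=list)

def get_payload_schema() -> dict:
    # The MCP tool simply exposes Pydantic's generated JSON schema,
    # so the host model knows exactly what shape aggregate_results() expects.
    return AggregatedPayload.model_json_schema()

schema = get_payload_schema()
print(sorted(schema["properties"]))  # ['hosts', 'session_id']
```

Because the schema is generated from the same Pydantic model that `aggregate_results()` validates against, the MCP host and the server can never drift out of sync on the payload shape.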

Data Model Updates (blhackbox/models/aggregated_payload.py)

  • Updated docstring: AggregatedPayload is now produced by the MCP host, not the Ollama pipeline
  • Added model field to AggregatedMetadata to track which model performed aggregation (Claude, Ollama, etc.)
  • Kept ollama_model field for backward compatibility but marked as deprecated

Workflow Documentation

  • Updated claude_playbook.md: Phase 4 renamed from "Process" to "Aggregate"; Claude now does the work directly
  • Updated all prompt templates (full-pentest, api-security, bug-bounty, etc.): removed references to process_scan_results(), replaced with get_payload_schema() + aggregate_results() pattern
  • Updated README.md: clarified that core stack is 4 containers, Ollama is optional, reduced RAM requirements

Docker & Deployment

  • docker-compose.yml: Ollama services moved to --profile ollama; MCP Gateway now only depends on core 3 MCP servers (removed ollama-mcp dependency)
  • Makefile: added up-ollama target; updated container counts in help text; updated down and clean to include --profile ollama
  • .env.example: commented out all Ollama-related settings; noted they're only needed for --profile ollama
  • blhackbox-mcp-catalog.yaml: commented out ollama-mcp registry entry (can be uncommented if using legacy pipeline)
  • docker/claude-code-entrypoint.sh: Ollama Pipeline check now warns instead of failing if not running
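The profile split might look roughly like this in `docker-compose.yml` (a sketch; image names and the exact service definitions are illustrative):

```yaml
services:
  kali-mcp:                        # core stack: always started
    image: blhackbox/kali-mcp
  ollama:                          # legacy pipeline: started only with
    image: ollama/ollama           #   docker compose --profile ollama up -d
    profiles: ["ollama"]
  agent-ingestion:
    image: blhackbox/agent-ingestion
    profiles: ["ollama"]
```

Services carrying `profiles: ["ollama"]` are skipped by a plain `docker compose up`, which is what shrinks the default stack to the four core containers.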

Entrypoint & Startup

  • Claude Code startup script now shows Ollama Pipeline as optional (WARN status if not running, rather than failure)
  • Health check output updated to show 3 core services instead of 4

Implementation Details

  • aggregate_results() validates the payload against the Pydantic schema, persists to results_dir/session-{session_id}.json, and attempts Neo4j storage (best-effort, doesn't fail if Neo4j unavailable)
  • The tool returns a summary with host/vulnerability/endpoint counts and a hint for report generation
  • get_payload_schema() returns the full JSON schema from AggregatedPayload.model_json_schema() for Claude to reference
  • All prompt templates now follow the same pattern: collect raw outputs → call get_payload_schema() → parse/deduplicate/correlate → call aggregate_results() → call generate_report()
  • Backward compatibility maintained: the Ollama pipeline still works as an optional fallback via --profile ollama (make up-ollama)
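The `aggregate_results()` flow described above could be sketched like this (a minimal, hypothetical implementation; the model fields and the Neo4j stub are placeholders for the real ones):

```python
from pathlib import Path
from pydantic import BaseModel, Field

# Minimal stand-in for the real AggregatedPayload; fields are illustrative.
class AggregatedPayload(BaseModel):
    session_id: str
    hosts: list[dict] = Field(default_factory=list)
    vulnerabilities: list[dict] = Field(default_factory=list)

def sync_to_neo4j(payload: AggregatedPayload) -> None:
    """Stub for the real Neo4j sync; raises when the DB is unreachable."""
    raise ConnectionError("neo4j unavailable")

def aggregate_results(payload: dict, results_dir: Path) -> dict:
    validated = AggregatedPayload.model_validate(payload)     # Pydantic schema check
    path = results_dir / f"session-{validated.session_id}.json"
    path.write_text(validated.model_dump_json(indent=2))      # persist to disk
    try:
        sync_to_neo4j(validated)                              # best-effort sync
    except Exception:
        pass                                                  # Neo4j failure is non-fatal
    return {
        "stored": str(path),
        "host_count": len(validated.hosts),
        "vulnerability_count": len(validated.vulnerabilities),
        "hint": "call generate_report() to produce the final report",
    }

summary = aggregate_results(
    {"session_id": "demo", "hosts": [{"address": "10.0.0.5"}]},
    Path("/tmp"),
)
print(summary["host_count"])  # 1
```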

https://claude.ai/code/session_01MXWTGUUSheo3EkgHrzRmjy

claude added 3 commits on March 7, 2026 at 07:21
…n directly

The MCP host (Claude Code, Claude Desktop, ChatGPT) is dramatically more
capable than llama3.1:8b at parsing, deduplication, and synthesis. This
change eliminates 2-5 minutes of Ollama latency per scan by having the
MCP host structure raw tool outputs directly.

Changes:
- Add aggregate_results and get_payload_schema tools to blhackbox MCP server
- Move all Ollama services (ollama, ollama-mcp, 3 agents) to --profile ollama
- Core stack reduced from 9 to 4 containers (kali, wire, screenshot, portainer)
- RAM requirement reduced from 16GB to 8GB for core stack
- Update all 11 prompt templates to use direct aggregation
- Update playbook, entrypoint, Makefile, .env.example, README, CLAUDE.md
- Keep Ollama pipeline as optional fallback (make up-ollama)

https://claude.ai/code/session_01MXWTGUUSheo3EkgHrzRmjy
- Add aggregate_results and get_payload_schema to expected tool sets
- Update tool counts from 11 to 13 in test_mcp_server and test_screenshot_mcp
- Update test_prompts to check for aggregation pipeline instead of Ollama

https://claude.ai/code/session_01MXWTGUUSheo3EkgHrzRmjy
@valITino valITino marked this pull request as ready for review March 7, 2026 08:29
@valITino valITino merged commit 7d0e783 into main Mar 7, 2026
2 checks passed