
Shift data aggregation from Ollama pipeline to MCP host (Claude) #65

Merged
valITino merged 3 commits into main from claude/optimize-mcp-performance-QgTSn on Mar 7, 2026

Conversation

valITino (Owner) commented Mar 7, 2026

Summary

This PR fundamentally restructures how blhackbox processes pentest data. Instead of sending raw tool outputs to a 3-agent Ollama preprocessing pipeline (Ingestion → Processing → Synthesis), the MCP host (Claude Code, Claude Desktop, or ChatGPT) now handles parsing, deduplication, correlation, and structuring directly. The Ollama pipeline is moved to an optional legacy fallback (--profile ollama).

Key Changes

Core Architecture

  • Reduced default stack from 9 containers to 4 (kali-mcp, wire-mcp, screenshot-mcp, portainer)
  • Ollama pipeline (ollama-mcp, agent-ingestion, agent-processing, agent-synthesis, ollama) now optional via --profile ollama
  • Reduced RAM requirement from 16 GB to 8 GB for core stack; 16 GB+ only needed if using Ollama pipeline

MCP Server Changes (blhackbox/mcp/server.py)

  • Added two new tools:
    • aggregate_results(payload) — validates and persists the AggregatedPayload JSON, stores to disk, optionally syncs to Neo4j
    • get_payload_schema() — returns the AggregatedPayload JSON schema so Claude knows the expected structure
  • Updated get_template description to reflect that Claude handles aggregation directly
  • Removed references to process_scan_results() (Ollama pipeline method)
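As a rough illustration of what `get_payload_schema()` returns, here is a minimal sketch using a stand-in Pydantic model (the field names are hypothetical; the real model lives in `blhackbox/models/aggregated_payload.py`):

```python
from pydantic import BaseModel, Field

# Hypothetical stand-in for the real AggregatedPayload; fields are illustrative.
class Host(BaseModel):
    address: str
    open_ports: list[int] = Field(default_factory=list)

class AggregatedPayload(BaseModel):
    session_id: str
    hosts: list[Host] = Field(default_factory=list)

def get_payload_schema() -> dict:
    # The MCP tool simply exposes Pydantic's generated JSON schema,
    # so the host model knows exactly what shape aggregate_results() expects.
    return AggregatedPayload.model_json_schema()

schema = get_payload_schema()
print(sorted(schema["properties"]))  # ['hosts', 'session_id']
```

Because the schema is generated from the same Pydantic model that `aggregate_results()` validates against, the MCP host and the server can never drift out of sync on the payload shape.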

Data Model Updates (blhackbox/models/aggregated_payload.py)

  • Updated docstring: AggregatedPayload is now produced by the MCP host, not the Ollama pipeline
  • Added model field to AggregatedMetadata to track which model performed aggregation (Claude, Ollama, etc.)
  • Kept ollama_model field for backward compatibility but marked as deprecated

Workflow Documentation

  • Updated claude_playbook.md: Phase 4 renamed from "Process" to "Aggregate"; Claude now does the work directly
  • Updated all prompt templates (full-pentest, api-security, bug-bounty, etc.): removed references to process_scan_results(), replaced with get_payload_schema() + aggregate_results() pattern
  • Updated README.md: clarified that core stack is 4 containers, Ollama is optional, reduced RAM requirements

Docker & Deployment

  • docker-compose.yml: Ollama services moved to --profile ollama; MCP Gateway now only depends on core 3 MCP servers (removed ollama-mcp dependency)
  • Makefile: added up-ollama target; updated container counts in help text; updated down and clean to include --profile ollama
  • .env.example: commented out all Ollama-related settings; noted they're only needed for --profile ollama
  • blhackbox-mcp-catalog.yaml: commented out ollama-mcp registry entry (can be uncommented if using legacy pipeline)
  • docker/claude-code-entrypoint.sh: Ollama Pipeline check now warns instead of failing if not running
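The profile split might look roughly like this in `docker-compose.yml` (a sketch; image names and the exact service definitions are illustrative):

```yaml
services:
  kali-mcp:                        # core stack: always started
    image: blhackbox/kali-mcp
  ollama:                          # legacy pipeline: started only with
    image: ollama/ollama           #   docker compose --profile ollama up -d
    profiles: ["ollama"]
  agent-ingestion:
    image: blhackbox/agent-ingestion
    profiles: ["ollama"]
```

Services carrying `profiles: ["ollama"]` are skipped by a plain `docker compose up`, which is what shrinks the default stack to the four core containers.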

Entrypoint & Startup

  • Claude Code startup script now shows Ollama Pipeline as optional (WARN status if not running, rather than failure)
  • Health check output updated to show 3 core services instead of 4

Implementation Details

  • aggregate_results() validates the payload against the Pydantic schema, persists to results_dir/session-{session_id}.json, and attempts Neo4j storage (best-effort, doesn't fail if Neo4j unavailable)
  • The tool returns a summary with host/vulnerability/endpoint counts and a hint for report generation
  • get_payload_schema() returns the full JSON schema from AggregatedPayload.model_json_schema() for Claude to reference
  • All prompt templates now follow the same pattern: collect raw outputs → call get_payload_schema() → parse/deduplicate/correlate → call aggregate_results() → call generate_report()
  • Backward compatibility maintained: the Ollama pipeline still works as an optional fallback via --profile ollama (make up-ollama)
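The `aggregate_results()` flow described above could be sketched like this (a minimal, hypothetical implementation; the model fields and the Neo4j stub are placeholders for the real ones):

```python
from pathlib import Path
from pydantic import BaseModel, Field

# Minimal stand-in for the real AggregatedPayload; fields are illustrative.
class AggregatedPayload(BaseModel):
    session_id: str
    hosts: list[dict] = Field(default_factory=list)
    vulnerabilities: list[dict] = Field(default_factory=list)

def sync_to_neo4j(payload: AggregatedPayload) -> None:
    """Stub for the real Neo4j sync; raises when the DB is unreachable."""
    raise ConnectionError("neo4j unavailable")

def aggregate_results(payload: dict, results_dir: Path) -> dict:
    validated = AggregatedPayload.model_validate(payload)     # Pydantic schema check
    path = results_dir / f"session-{validated.session_id}.json"
    path.write_text(validated.model_dump_json(indent=2))      # persist to disk
    try:
        sync_to_neo4j(validated)                              # best-effort sync
    except Exception:
        pass                                                  # Neo4j failure is non-fatal
    return {
        "stored": str(path),
        "host_count": len(validated.hosts),
        "vulnerability_count": len(validated.vulnerabilities),
        "hint": "call generate_report() to produce the final report",
    }

summary = aggregate_results(
    {"session_id": "demo", "hosts": [{"address": "10.0.0.5"}]},
    Path("/tmp"),
)
print(summary["host_count"])  # 1
```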

https://claude.ai/code/session_01MXWTGUUSheo3EkgHrzRmjy

claude added 3 commits on March 7, 2026 at 07:21
…n directly

The MCP host (Claude Code, Claude Desktop, ChatGPT) is dramatically more
capable than llama3.1:8b at parsing, deduplication, and synthesis. This
change eliminates 2-5 minutes of Ollama latency per scan by having the
MCP host structure raw tool outputs directly.

Changes:
- Add aggregate_results and get_payload_schema tools to blhackbox MCP server
- Move all Ollama services (ollama, ollama-mcp, 3 agents) to --profile ollama
- Core stack reduced from 9 to 4 containers (kali, wire, screenshot, portainer)
- RAM requirement reduced from 16GB to 8GB for core stack
- Update all 11 prompt templates to use direct aggregation
- Update playbook, entrypoint, Makefile, .env.example, README, CLAUDE.md
- Keep Ollama pipeline as optional fallback (make up-ollama)

https://claude.ai/code/session_01MXWTGUUSheo3EkgHrzRmjy
- Add aggregate_results and get_payload_schema to expected tool sets
- Update tool counts from 11 to 13 in test_mcp_server and test_screenshot_mcp
- Update test_prompts to check for aggregation pipeline instead of Ollama

https://claude.ai/code/session_01MXWTGUUSheo3EkgHrzRmjy
@valITino valITino marked this pull request as ready for review March 7, 2026 08:29
@valITino valITino merged commit 7d0e783 into main Mar 7, 2026
2 checks passed