
AI Infrastructure

Central infrastructure for AI tools, MCP servers, gateways, and platform services.

TODO

  • Create an ai-infrastructure skill (in ~/AI/skills/ or in this repo) that documents all services, ports, startup commands, and dependencies so AI agents know what infrastructure is available when setting up machines or debugging connectivity issues.

Architecture

@startuml ai-infrastructure
!theme plain
skinparam backgroundColor #FEFEFE
skinparam componentStyle rectangle
skinparam defaultFontName Consolas

title AI Infrastructure Architecture

' === AI Clients ===
package "AI Clients" as clients #E3F2FD {
  [VS Code Copilot] as vscode
  [Claude Desktop] as claude
  [Cline] as cline
  [Other Clients] as other
}

' === Gateway Layer ===
package "Gateway Layer" as gateway_layer #E8F5E9 {
  component "agentgateway\n:3847 (HTTP)\n:15001 (Admin UI)\n:15020 (Metrics)" as agentgateway #C8E6C9
  component "nginx-proxy\n:3443 (HTTPS)\n:9223 (CDP)" as nginx #A5D6A7
}

' === MCP Backends - Currently Running ===
package "Running MCPs" as running #FFF3E0 {
  component "stdio-proxy\n:7030 (SSE)\n:61822 (Kapture WS)" as stdio_proxy #FFCC80

  package "stdio MCPs" as stdio_mcps #FFE0B2 {
    [sequential-thinking\n(1 tool)] as seq_think
    [memory\n(8 tools)] as memory
    [kapture\n(15+ tools)] as kapture_mcp
  }
}

' === SSE-based MCPs - Running ===
package "SSE MCPs" as sse_mcps #FFF3E0 {
  [context7\n:7008] as context7 #FFCC80
  [playwright\n:7007] as playwright #FFCC80
  [browser-use\n:7011] as browser_use #FFCC80
  [hass-mcp\n:7010] as hass_mcp #FFCC80
  [qdrant-mcp\n:7020] as qdrant_mcp #FFCC80
}

' === Platform Services ===
package "Platform Services" as platform #F3E5F5 {
  [Langfuse\n:3100 (UI)\n(LLM Observability)] as langfuse #CE93D8
  [langfuse-mcp\n:7012] as langfuse_mcp #E1BEE7
}

' === Observability Stack ===
package "Observability" as observability #E1F5FE {
  component "OpenTelemetry\nCollector\n:4317/:4318 (internal)" as otel #81D4FA
  component "Jaeger\n:16686 (UI)" as jaeger #4FC3F7
  component "Prometheus\n:9090" as prometheus #29B6F6
  component "Grafana\n:3000" as grafana #03A9F4
}

' === Connections ===
' Clients to Gateway
vscode --> agentgateway : HTTP\nx-client-id
claude --> agentgateway : HTTP\nx-client-id
cline --> agentgateway : HTTP\nx-client-id
other --> nginx : HTTPS

' nginx to agentgateway
nginx --> agentgateway : proxy

' agentgateway to MCPs
agentgateway --> stdio_proxy : SSE
agentgateway --> context7 : SSE
agentgateway --> playwright : SSE
agentgateway --> browser_use : SSE
agentgateway --> hass_mcp : SSE
agentgateway --> qdrant_mcp : SSE
agentgateway --> langfuse_mcp : MCP

' stdio-proxy to stdio MCPs
stdio_proxy --> seq_think : stdio
stdio_proxy --> memory : stdio
stdio_proxy --> kapture_mcp : stdio

' Langfuse MCP to Langfuse platform
langfuse_mcp --> langfuse : API

' Browser to Langfuse (declared here; not part of the AI Clients package)
[Browser] as browser
browser .up.> langfuse : UI

' Observability connections
agentgateway --> otel : OTLP traces
otel --> jaeger : traces
otel --> prometheus : span metrics
agentgateway --> prometheus : metrics scrape
grafana --> prometheus : query
grafana --> jaeger : query

' Legend
legend right
  |= Color |= Status |
  | <#C8E6C9> | Gateway |
  | <#FFCC80> | Running MCP |
  | <#81D4FA> | Observability |
  | <#CE93D8> | Platform Service |
  | <#E1BEE7> | Planned |
endlegend

@enduml

Current Status

| Component           | Status       | Tools |
|---------------------|--------------|-------|
| agentgateway        | ✅ Running   | -     |
| sequential-thinking | ✅ Running   | 1     |
| memory              | ✅ Running   | 8     |
| kapture             | ✅ Running   | 15+   |
| context7            | ✅ Running   | 2     |
| playwright          | ✅ Running   | 15+   |
| browser-use         | ✅ Running   | 10+   |
| hass-mcp            | ✅ Running   | 5+    |
| qdrant-mcp          | ✅ Running   | 6     |
| langfuse            | ✅ Available | -     |
| **Total**           |              | **62+ tools** |

Network Architecture

@startuml network-ports
!theme plain
skinparam backgroundColor #FEFEFE
skinparam defaultFontName Consolas
skinparam rectangleBorderColor #666666
skinparam rectangleBackgroundColor #F5F5F5

title Host ↔ Docker Port Mappings

rectangle "HOST" as host {
  rectangle "AI Clients" as clients #E3F2FD
  rectangle "Chrome :9222" as chrome #FFF9C4
  rectangle "Browser" as browser #E1F5FE
}

rectangle "DOCKER (ai-infrastructure network)" as docker {
  rectangle ":3847 agentgateway" as gw #C8E6C9
  rectangle ":15001 Admin UI" as admin #C8E6C9
  rectangle ":3443/:9223 nginx" as nginx #A5D6A7
  rectangle ":7030/:61822 stdio-proxy" as stdio #FFCC80
  rectangle ":7020 qdrant-mcp" as qdrant #FFCC80
  rectangle ":3000 Grafana" as grafana #81D4FA
  rectangle ":9090 Prometheus" as prom #81D4FA
  rectangle ":16686 Jaeger" as jaeger #81D4FA
  rectangle ":3100 Langfuse" as langfuse #CE93D8
}

clients -down-> gw : "HTTP :3847"
clients -down-> admin : "HTTP :15001"
browser -down-> grafana : ":3000"
browser -down-> prom : ":9090"
browser -down-> jaeger : ":16686"
browser -down-> langfuse : ":3100"
nginx -up-> chrome : "CDP :9223→:9222"

@enduml

Internal Docker Network (ai-infrastructure):

| From           | To             | Port  | Purpose             |
|----------------|----------------|-------|---------------------|
| agentgateway   | stdio-proxy    | 7030  | SSE to stdio MCPs   |
| agentgateway   | otel-collector | 4317  | OTLP traces         |
| otel-collector | jaeger         | 14317 | Trace export        |
| otel-collector | (self)         | 8889  | Span metrics        |
| prometheus     | agentgateway   | 15020 | Metrics scrape      |
| prometheus     | otel-collector | 8889  | Span metrics scrape |
| grafana        | prometheus     | 9090  | Metrics queries     |
| grafana        | jaeger         | 16686 | Trace queries       |

Directory Structure

ai-infrastructure/
├── clients/           # AI client configurations
│   ├── claude/        # Claude Desktop config
│   ├── cline/         # Cline config
│   └── copilot/       # VS Code Copilot config
├── gateways/          # MCP gateways
│   └── agentgateway/  # Linux Foundation MCP gateway
├── mcps/              # MCP servers
│   ├── browser-use/   # AI browser automation
│   ├── context7/      # Library documentation
│   ├── hass-mcp/      # Home Assistant
│   ├── kapture/       # Chrome extension MCP
│   ├── mcpx/          # MCPX gateway (alternative)
│   ├── memory/        # Memory/knowledge graph
│   ├── playwright/    # Playwright browser automation
│   ├── qdrant-mcp/    # Qdrant semantic search (mcp-proxy + mcp-server-qdrant)
│   ├── sequential-thinking/ # Chain of thought reasoning
│   └── stdio-proxy/   # stdio→SSE bridge
├── platform/          # Platform services
│   ├── context-lens/  # LLM context window inspector
│   ├── langfuse/      # LLM observability, prompts, evals
│   └── observability/ # Prometheus, Grafana, Jaeger
└── workflows/         # Custom workflow definitions

Components

Gateways

| Gateway      | Description                                                 | Status     |
|--------------|-------------------------------------------------------------|------------|
| agentgateway | Linux Foundation MCP gateway with auth, RBAC, rate limiting | ✅ Running |

MCP Servers

| MCP                 | Description                           | Status     | Docs |
|---------------------|---------------------------------------|------------|------|
| sequential-thinking | Chain of thought reasoning            | ✅ Running |      |
| memory              | Knowledge graph & memory              | ✅ Running |      |
| stdio-proxy         | stdio→SSE bridge (mcp-proxy)          | ✅ Running |      |
| kapture             | Chrome extension MCP                  | ✅ Running |      |
| playwright          | Browser automation                    | ✅ Running |      |
| qdrant-mcp          | Qdrant semantic search (work, code)   | ✅ Running |      |
| browser-use         | AI browser automation                 | ✅ Running |      |
| context7            | Context7 library docs                 | ✅ Running |      |
| hass-mcp            | Home Assistant                        | ✅ Running |      |

Platform Services

| Service       | Description                                   | Status       | Docs |
|---------------|-----------------------------------------------|--------------|------|
| Observability | Prometheus, Grafana, Jaeger                   | ✅ Running   |      |
| Langfuse      | LLM observability, prompts, evals             | ✅ Available |      |
| Context Lens  | LLM context window inspector (proxy + web UI) | ✅ Running   |      |

Clients

See clients/readme.md for configuration.

| Client          | Config   |
|-----------------|----------|
| VS Code Copilot | copilot/ |
| Claude Desktop  | claude/  |
| Cline           | cline/   |

Quick Start

1. Create shared Docker network

docker network create ai-shared

Note: the mcpx stack creates the ai-shared network automatically, so this step is only needed if you start other services before mcpx.

2. Start mcpx gateway (creates ai-shared network)

cd mcps/mcpx
docker-compose up -d

3. Start agentgateway

cd gateways/agentgateway
docker-compose up -d

4. Start observability stack (optional)

cd platform/observability
docker-compose up -d

5. Start Langfuse (optional)

cd platform/langfuse
docker compose up -d

6. Access

  • MCP Endpoint: http://localhost:3847/mcp
  • Admin UI: http://localhost:15001/ui
  • Grafana: http://localhost:3000 (admin/admin)
  • Langfuse: http://localhost:3100 (create account on first visit)
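Once the stacks are up, a quick smoke test can confirm each endpoint is reachable. A minimal sketch (the URL list mirrors the access list above; the `curl` flags and 2-second timeout are illustrative, and an MCP endpoint may reject a plain GET, so treat a FAIL there as "check manually", not "down"):

```shell
#!/usr/bin/env bash
# Smoke-test local AI-infrastructure endpoints.
# check_url succeeds only when the URL answers with an HTTP success code.
check_url() {
  curl -fsS --max-time 2 -o /dev/null "$1"
}

for url in \
  "http://localhost:3847/mcp" \
  "http://localhost:15001/ui" \
  "http://localhost:3000" \
  "http://localhost:3100"
do
  if check_url "$url"; then
    echo "OK   $url"
  else
    echo "FAIL $url"   # service down, or endpoint rejects GET (e.g. /mcp)
  fi
done
```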

7. Configure your AI client

See clients/ for configuration examples for each AI client.
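For clients that only speak stdio MCP, one common pattern is to bridge to the HTTP endpoint via the `mcp-remote` npm package. A hypothetical Claude Desktop `claude_desktop_config.json` entry — the server name `agentgateway` and the use of `mcp-remote` are illustrative, not necessarily what this repo's `clients/` configs use:

```json
{
  "mcpServers": {
    "agentgateway": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "http://localhost:3847/mcp"]
    }
  }
}
```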

Ports

| Port      | Service               | Protocol   | Notes                             |
|-----------|-----------------------|------------|-----------------------------------|
| 3847      | agentgateway MCP      | HTTP       | Main MCP endpoint                 |
| 15001     | agentgateway Admin UI | HTTP       | Playground & config               |
| 15020     | agentgateway Metrics  | Prometheus | Scraped by Prometheus             |
| 3443      | nginx-proxy HTTPS     | HTTPS      | TLS termination                   |
| 9223      | nginx-proxy CDP       | CDP        | Proxies to host Chrome :9222      |
| 7030      | stdio-proxy           | SSE        | Bridges stdio MCPs                |
| 61822     | Kapture WebSocket     | WebSocket  | Chrome extension                  |
| 16686     | Jaeger UI             | HTTP       | Trace visualization               |
| 9090      | Prometheus            | HTTP       | Metrics UI & API                  |
| 3000      | Grafana               | HTTP       | Dashboards (admin/admin)          |
| 3100      | Langfuse              | HTTP       | LLM observability UI              |
| 6333      | Qdrant HTTP API       | HTTP       | Vector DB REST API                |
| 6334      | Qdrant gRPC API       | gRPC       | Vector DB gRPC API                |
| 7020      | qdrant-mcp            | SSE        | Semantic search MCP (mcp-proxy)   |
| 9190      | Langfuse MinIO        | HTTP       | S3-compatible storage             |
| 4040      | Context Lens Proxy    | HTTP       | LLM API interception proxy        |
| 4041      | Context Lens UI       | HTTP       | Context composition web UI        |
| 4317/4318 | OTel Collector        | gRPC/HTTP  | Internal only (Docker network)    |
| 8889      | OTel Collector Metrics| Prometheus | Span metrics (internal)           |

Observability

The observability stack provides metrics, tracing, and visualization:

| Component               | Port                   | Purpose                         |
|-------------------------|------------------------|---------------------------------|
| agentgateway Admin UI   | :15001                 | Admin UI with playground        |
| agentgateway Metrics    | :15020                 | Prometheus metrics endpoint     |
| Prometheus              | :9090                  | Metrics storage and queries     |
| Grafana                 | :3000                  | Dashboards (admin/admin)        |
| Jaeger                  | :16686                 | Distributed tracing             |
| Langfuse                | :3100                  | LLM observability & prompts     |
| OpenTelemetry Collector | :4317/:4318 (internal) | Trace processing & span metrics |

Trace Flow:

agentgateway → OTel Collector → Jaeger (traces)
                              → Prometheus (span metrics)

Metrics include:

  • agentgateway_requests_total - HTTP requests by client, method, status
  • agentgateway_mcp_requests - MCP tool calls
  • tool_calls_total - Tool calls by server and tool name
  • list_calls_total - List operations
  • Span-derived metrics (latency histograms, call counts) from OTel Collector
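These metrics can be pulled through Prometheus' HTTP query API. A sketch, assuming Prometheus on :9090 as above — the metric name comes from the list, but the `client` label is an assumption about how agentgateway exports it. The `DRY_RUN` switch is only there so the request can be inspected without a running Prometheus:

```shell
#!/usr/bin/env bash
# PromQL expression: per-client request rate over the last 5 minutes.
promql='sum by (client) (rate(agentgateway_requests_total[5m]))'

# Send a query to the Prometheus HTTP API; -G + --data-urlencode puts the
# PromQL expression into the URL-encoded `query` parameter of a GET request.
query_prometheus() {
  expr="$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the request instead of sending it.
    echo "GET http://localhost:9090/api/v1/query?query=$expr"
  else
    curl -fsS -G "http://localhost:9090/api/v1/query" \
      --data-urlencode "query=$expr"
  fi
}

DRY_RUN=1 query_prometheus "$promql"
```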

TODO

  • Fix langfuse-prompts MCP backend - Langfuse stack (:3101) not running; agentgateway fails to initialize when this upstream is unreachable. Need to either ensure langfuse starts with the gateway or handle gracefully.
  • Fix obsidian MCP backend - Obsidian semantic plugin (:3001) not running; same issue as langfuse. Need to either auto-start or make the gateway tolerant of missing optional backends.
  • Evaluate using agentgateway's native TLS instead of nginx-proxy for HTTPS termination
  • Configure Playwright MCP with CDP proxy (nginx-proxy on 9223 needed for browser-use MCPs to connect to host Chrome)
  • Recreate remaining containers with Watchtower labels (agentgateway, playwright, qdrant, observability stack) — labels added to compose files but containers need docker compose up -d --force-recreate to pick them up
  • Add client identification headers for per-client tracking
  • Set up Jaeger for distributed tracing
  • Configure agentgateway to send traces to OpenTelemetry Collector
  • Create Grafana dashboard for metrics visualization
  • Set up Langfuse for LLM observability and prompt management

Workflows

Fork Contribution: Cherry-Pick Staged Changes

Used for contributing to upstream open-source projects (e.g., Context Lens). The workflow keeps a local main with all in-flight fixes applied, while each fix lives on its own branch as a separate PR to upstream.

upstream/main ← PRs from your fork branches
    ↑
origin/main (your fork, tracks upstream)
    ↑
local main (staged cherry-picks from all active branches)
    ↑
┌───┴───┬──────────┬──────────┐
fix/a   fix/b   feat/c   fix/d    ← worktree branches, each = 1 PR

Key invariant: Local main is never committed ahead of origin/main. All local-only changes live as staged but uncommitted cherry-picks.
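The invariant is easy to check mechanically: local main must have zero commits that origin/main lacks. A small helper, assuming the remote-tracking ref `origin/main` exists (the function name is illustrative):

```shell
# Succeeds only when `main` has no commits missing from origin/main —
# the workflow's key invariant. Staged cherry-picks don't count: they
# are uncommitted, so `git rev-list` never sees them.
main_is_clean() {
  [ -z "$(git rev-list origin/main..main)" ]
}
```

`git status` will still show the staged cherry-picks; the check concerns commits only.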

Develop on worktree branches (each branch = one PR):

git worktree add ../repo-fix-foo fix/foo
cd ../repo-fix-foo
# ... make changes, commit, push ...
git push origin fix/foo   # open PR against upstream/main

Stack changes on local main:

git checkout main
git cherry-pick --no-commit origin/main..<branch-name>
# Repeat for each active branch — all fixes are now applied but uncommitted

Sync with upstream (chain: upstream → origin/main → branches):

git stash push -m "staged cherry-picks"
git checkout main && git fetch --all
git rebase upstream/main
git push origin main                    # --force-with-lease if rebased

# Rebase active branches onto synced main
git rebase main fix/still-open-a
git push --force-with-lease origin fix/still-open-a

# Rebuild staged state from remaining open branches
git checkout main
git cherry-pick --no-commit origin/main..fix/still-open-a
git stash drop

Switch machines — clone fork, fetch, cherry-pick open branches:

git clone <fork-url> && cd repo
git remote add upstream <upstream-url>
git fetch --all && git rebase upstream/main
git cherry-pick --no-commit origin/main..origin/fix/branch-a
git cherry-pick --no-commit origin/main..origin/fix/branch-b

Tips:

  • git branch --no-merged origin/main — list branches that still need cherry-picking
  • git diff --cached --stat — see your current cherry-pick stack
  • git restore --staged . — abort and rebuild if staged state gets messy
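The stacking step itself can be scripted: list every branch not yet merged into origin/main, then cherry-pick each range onto main without committing. A sketch (the function name is illustrative; `--format` avoids having to strip the `*` current-branch marker; a conflicting cherry-pick will stop the loop, as it would by hand):

```shell
# Re-apply every open branch as staged, uncommitted changes on main.
restack() {
  git checkout main
  # --format prints bare branch names, with no "* " marker to parse out.
  for b in $(git branch --no-merged origin/main --format='%(refname:short)'); do
    git cherry-pick --no-commit "origin/main..$b"
  done
}
```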

See Context Lens workflow details for project-specific branch status and machine resume recipes.

Resources