3 changes: 2 additions & 1 deletion .gitignore
@@ -22,4 +22,5 @@ build/
venv/
.venv/
__pyo3*.so
*.egg-info/
*.egg-info/
.claude/
148 changes: 148 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,148 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Attach Gateway is a Python-based **OIDC/DID identity sidecar** for LLM engines (Ollama, vLLM) and multi-agent frameworks. It provides OIDC/DID-JWT authentication as the core, with optional A2A handoff, pluggable memory backends, and usage/quota features that can be enabled when needed.

## Design / Code Philosophy

**Attach is an identity sidecar first.** Many users install and run Attach only to enforce OIDC/DID auth and stamp identity/session headers in front of an LLM engine. That "OIDC sidecar" path must remain:
- fast to start
- low overhead
- stable and backwards compatible

**Opt-in, not mandatory.** Any feature beyond OIDC/DID auth (memory backends, A2A routing, quotas, metering, MCP gateway, etc.) must be:
- gated behind explicit flags/config (env vars, config files, or CLI subcommands)
- disabled by default
- "lazy-loaded" (avoid importing heavy modules or starting background tasks unless the feature is enabled)
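
A minimal sketch of the opt-in, lazy-loaded pattern above, not the gateway's actual loader. The flag name `ATTACH_ENABLE_MCP` comes from this repository's README; the helpers `feature_enabled` and `load_optional_feature` are hypothetical, and the standard-library `json` module stands in for a heavy optional module.

```python
from __future__ import annotations

import importlib
import os
from types import ModuleType
from typing import Mapping

_TRUTHY = {"1", "true", "yes", "on"}


def feature_enabled(flag: str, env: Mapping[str, str] | None = None) -> bool:
    """Return True only when the flag is explicitly set to a truthy value."""
    source = os.environ if env is None else env
    return source.get(flag, "").strip().lower() in _TRUTHY


def load_optional_feature(
    flag: str, module_name: str, env: Mapping[str, str] | None = None
) -> ModuleType | None:
    """Import `module_name` only if `flag` is enabled (hypothetical helper)."""
    if not feature_enabled(flag, env):
        return None  # disabled by default: nothing imported, nothing started
    return importlib.import_module(module_name)


# With the flag unset, the optional module is never imported.
mcp_module = load_optional_feature("ATTACH_ENABLE_MCP", "json", env={})
```

Keeping the import inside the function means the default OIDC-only path never pays for modules it does not use.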

**No surprise dependencies.**
- Keep the base install lean.
- Add heavyweight or optional integrations as extras (e.g. `.[quota]`, `.[usage]`, `.[full]`) and ensure the default path does not require them.
- Missing optional env vars must not crash the gateway; prefer graceful fallbacks with clear warnings.
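
One way to honour the graceful-fallback rule is to probe for the optional dependency and downgrade with a warning rather than raise. This is a sketch under assumptions: `resolve_backend` and the logger name are hypothetical, and the module names used in the example are placeholders for whatever an extra would install.

```python
from __future__ import annotations

import importlib
import logging

logger = logging.getLogger("attach.optional")


def resolve_backend(requested: str, module_name: str, fallback: str = "null") -> str:
    """Return `requested` if its module imports cleanly, else `fallback`."""
    if requested == fallback:
        return fallback  # nothing optional was asked for
    try:
        importlib.import_module(module_name)
    except ImportError:
        logger.warning(
            "backend %r needs %r, which is not installed; falling back to %r",
            requested, module_name, fallback,
        )
        return fallback
    return requested


# A missing extra degrades to the fallback instead of crashing startup.
active = resolve_backend("prometheus", "attach_missing_module_for_demo")
```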

**Local-first and privacy-respecting by default.**
- No phone-home behavior unless explicitly enabled.
- Default logs/metrics should be local-only. If remote metering is supported, it must be opt-in and non-fatal.

**Safe-by-default changes.**
- New routes/middlewares must not weaken authentication requirements for existing endpoints.
- Avoid breaking changes to required environment variables or the default startup flow.

## Build and Development Commands

```bash
# Install from source with all dev dependencies
pip install -e ".[dev,full]"

# Run the gateway (development)
uvicorn main:app --port 8080 --reload

# Run the gateway (CLI)
attach-gateway --port 8080

# Format code
black .
isort .

# Run all tests
pytest tests/

# Run specific test file
pytest tests/test_jwt_middleware.py -v

# Run tests with coverage
pytest tests/ --cov=.
```

## Required Environment Variables

```bash
OIDC_ISSUER=https://your-domain.auth0.com/ # OIDC provider issuer URL
OIDC_AUD=your-api-identifier # Expected JWT audience claim
```

Optional variables:
- `ENGINE_URL`: LLM engine endpoint (default: `http://localhost:11434`)
- `MEM_BACKEND`: Memory backend - `none` (default), `weaviate`, or `sakana`
- `WEAVIATE_URL`: Required if `MEM_BACKEND=weaviate`
- `MAX_TOKENS_PER_MIN`: Enables token quota middleware
- `USAGE_METERING`: `null` (default), `prometheus`, or `openmeter`
- Note: metering backends must remain optional; missing keys/config should gracefully fall back to `null` behavior.
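
The variables above could be read into a typed settings object along these lines. The `Settings` dataclass and `load_settings` function are hypothetical names for illustration; the defaults mirror the list above, and treating an unparseable `MAX_TOKENS_PER_MIN` as "quota disabled" is an assumption, not documented behaviour.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import Mapping

logger = logging.getLogger("attach.config")

_METERING_CHOICES = {"null", "prometheus", "openmeter"}


@dataclass(frozen=True)
class Settings:
    oidc_issuer: str
    oidc_aud: str
    engine_url: str = "http://localhost:11434"
    mem_backend: str = "none"
    max_tokens_per_min: int | None = None  # None keeps the quota middleware off
    usage_metering: str = "null"


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build Settings from an environment mapping such as os.environ."""
    # The two required variables: fail loudly if either is absent.
    missing = [k for k in ("OIDC_ISSUER", "OIDC_AUD") if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")

    # Optional metering: unknown values fall back to null with a warning.
    metering = env.get("USAGE_METERING", "null").lower()
    if metering not in _METERING_CHOICES:
        logger.warning("unknown USAGE_METERING=%r; using null", metering)
        metering = "null"

    # Optional quota: an unparseable value disables it (assumed behaviour).
    quota: int | None = None
    raw_quota = env.get("MAX_TOKENS_PER_MIN")
    if raw_quota:
        try:
            quota = int(raw_quota)
        except ValueError:
            logger.warning("ignoring non-integer MAX_TOKENS_PER_MIN=%r", raw_quota)

    return Settings(
        oidc_issuer=env["OIDC_ISSUER"],
        oidc_aud=env["OIDC_AUD"],
        engine_url=env.get("ENGINE_URL", "http://localhost:11434"),
        mem_backend=env.get("MEM_BACKEND", "none"),
        max_tokens_per_min=quota,
        usage_metering=metering,
    )
```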

## Architecture

### Request Flow
```
Client Request → middleware/auth.py (JWT validation)
→ middleware/session.py (session ID generation)
→ proxy/engine.py or a2a/routes.py
→ Memory backend (fire-and-forget write)
```

### Key Modules
- **auth/**: OIDC JWT and DID token verification (`oidc.py`, `did.py`)
- **middleware/**: Stateless header processing - auth extraction, session stamping, quota enforcement
- **proxy/**: Engine-agnostic HTTP streaming proxy to Ollama/vLLM
- **a2a/**: Agent-to-agent task routing (`/a2a/tasks/send`, `/a2a/tasks/status`)
- **mem/**: Pluggable memory backends with factory pattern
- **usage/**: Token metering backends (Prometheus, OpenMeter)

### Authentication Dispatch
`auth/__init__.py` routes tokens by format:
- 2 dots → OIDC JWT (`auth/oidc.py`) - RS256/ES256 only
- 3+ dots → DID token (`auth/did.py`) - did:key or did:pkh
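
The dot-count rule above can be sketched as follows. `classify_token` is a hypothetical name, not necessarily what `auth/__init__.py` exposes, and rejecting tokens with fewer than two dots is an assumption about the error path.

```python
from __future__ import annotations


def classify_token(token: str) -> str:
    """Return "oidc" or "did" based on the number of dots in the token."""
    dots = token.count(".")
    if dots == 2:
        return "oidc"  # header.payload.signature
    if dots >= 3:
        return "did"
    raise ValueError("malformed token: expected at least two dots")
```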

## Code Conventions

- Python 3.10+ with type hints everywhere using `from __future__ import annotations`
- All FastAPI routes must be `async`; wrap blocking I/O in `loop.run_in_executor`
- Use `aiter_bytes()` for streaming responses (constant memory)
- Module size limit: 400 lines; extract helpers if larger
- Format with `black`, sort imports with `isort`
- Commit messages: Conventional Commits (`feat:`, `fix:`, `docs:`)
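
As an illustration of the `loop.run_in_executor` convention, the sketch below moves a blocking function onto the loop's default thread pool. `slow_digest` is a hypothetical stand-in for any blocking call (file I/O, a synchronous SDK, and so on).

```python
from __future__ import annotations

import asyncio
import hashlib


def slow_digest(data: bytes) -> str:
    """Stand-in for blocking work that must not run on the event loop."""
    return hashlib.sha256(data).hexdigest()


async def digest_async(data: bytes) -> str:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, slow_digest, data)
```

An `async` route can then `await digest_async(...)` without stalling other requests while the blocking call runs.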

## Security Requirements

- Reject HS256 JWTs; only accept RS256/ES256
- Enforce `aud` and `exp` claims (60s clock skew allowed)
- Session IDs: `sha256(user.sub + user-agent)` - non-guessable
- Log only first 8 chars of JWT `sub` claim; never log full tokens
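
A minimal sketch of these checks using only the standard library. All function names are hypothetical; plain string concatenation, UTF-8 encoding, and hex output for the session ID are assumptions about details the rule above does not spell out. Signature verification itself is out of scope here.

```python
from __future__ import annotations

import hashlib

ALLOWED_ALGS = frozenset({"RS256", "ES256"})
CLOCK_SKEW_SECONDS = 60


def is_allowed_alg(alg: str) -> bool:
    """Accept only asymmetric algorithms; HS256 and everything else is rejected."""
    return alg in ALLOWED_ALGS


def is_expired(exp: int, now: int) -> bool:
    """Treat the token as expired once `now` passes `exp` plus the allowed skew."""
    return now > exp + CLOCK_SKEW_SECONDS


def session_id(sub: str, user_agent: str) -> str:
    """Hex SHA-256 of the subject concatenated with the User-Agent string."""
    return hashlib.sha256((sub + user_agent).encode("utf-8")).hexdigest()


def redact_sub(sub: str) -> str:
    """Keep only the first 8 characters of `sub` for log lines."""
    return sub[:8]
```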

## Testing

- Use `pytest-asyncio` for async tests
- Mock network calls with `httpx.MockTransport`
- Test files in `tests/` directory
- Config: `pytest.ini` sets `pythonpath = .`

## Starting Local Services

```bash
# Start Weaviate memory backend
docker run --rm -d -p 6666:8080 \
  -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
  semitechnologies/weaviate:1.30.5

# Or use the helper script
python script/start_weaviate.py
```

## Multi-Agent Demo

```bash
# Terminal 1: Gateway
uvicorn main:app --port 8080

# Terminal 2: Planner agent
uvicorn examples.agents.planner:app --port 8100

# Terminal 3: Coder agent
uvicorn examples.agents.coder:app --port 8101

# Terminal 4: Demo UI
cd examples/static && python -m http.server 9000
# Open http://localhost:9000/demo.html
```
111 changes: 111 additions & 0 deletions README.md
@@ -100,6 +100,117 @@ You should see a JSON response plus `X‑ATTACH‑Session‑Id` header – proof

---

## Claude Code + MCP Gateway (Local-First, 2-Minute Setup)

Attach Gateway can act as a **local MCP (Model Context Protocol) reverse proxy** for Claude Code, providing:
- JWT-authenticated access control for all MCP tool calls
- Per-user daily quota enforcement (configurable glob patterns)
- Local audit logs (SQLite) for compliance & debugging
- Web-based console UI for monitoring

This feature is **opt-in** and does not affect the core OIDC sidecar functionality.

### Quick Setup

```bash
# 1. Install Attach Gateway
pip install attach-dev

# 2. Configure your MCP servers (HTTP upstream only in MVP)
mkdir -p ~/.attach
cat > ~/.attach/mcp.json <<EOF
{
  "version": 1,
  "servers": {
    "notion": {
      "enabled": true,
      "url": "http://localhost:7001/mcp",
      "headers": {
        "Authorization": "env:NOTION_TOKEN"
      }
    }
  }
}
EOF

# 3. Optional: Configure quotas
cat > ~/.attach/mcp_policy.json <<EOF
{
  "version": 1,
  "enabled": true,
  "per_user_daily_tool_calls": {
    "notion.*": 100,
    "github.*": 200,
    "*": 1000
  }
}
EOF

# 4. Start the gateway with MCP enabled
export OIDC_ISSUER=https://YOUR_DOMAIN.auth0.com
export OIDC_AUD=your-api-identifier
export ATTACH_ENABLE_MCP=true

attach-gateway --port 8080
```
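
To make the glob-pattern quota policy concrete, here is a hedged sketch of how a limit might be looked up for a tool call. The matching rule (first pattern that matches, in the order written in the file) and the `server.tool` naming such as `notion.search` are assumptions for illustration; the gateway's actual precedence may differ. `daily_limit` and `allow_call` are hypothetical names.

```python
from __future__ import annotations

from fnmatch import fnmatchcase

# Same shape as "per_user_daily_tool_calls" in mcp_policy.json.
POLICY = {"notion.*": 100, "github.*": 200, "*": 1000}


def daily_limit(tool: str, limits: dict[str, int]) -> int | None:
    """Return the limit of the first pattern that matches, in declared order."""
    for pattern, limit in limits.items():
        if fnmatchcase(tool, pattern):
            return limit
    return None  # no pattern matched: no limit configured


def allow_call(tool: str, calls_today: int, limits: dict[str, int]) -> bool:
    """Allow the call while the user is still under the matching daily limit."""
    limit = daily_limit(tool, limits)
    return limit is None or calls_today < limit
```

Under this first-match assumption, the catch-all `"*"` entry should come last so that more specific patterns are tried first.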

### Integrate with Claude Code

```bash
# Generate Claude Code configuration commands
attach-gateway claude install --project .

# Or manually add servers (using positional args):
claude mcp add --transport http notion http://localhost:8080/mcp/notion

# If your Claude Code version supports authorization headers:
# export JWT=<your-bearer-token>
# claude mcp add --transport http notion http://localhost:8080/mcp/notion --header "Authorization: Bearer $JWT"
```

### Use the Console UI

1. Get a JWT token from your OIDC provider
2. Open http://localhost:8080/console
3. Paste your Bearer token when prompted
4. View MCP call statistics, audit logs, and server status

### MCP CLI Commands

```bash
# List configured servers
attach-gateway mcp list

# Add a new server
attach-gateway mcp add github http://localhost:7002/mcp \
  --header "Authorization: env:GITHUB_TOKEN"

# Enable/disable servers
attach-gateway mcp enable github
attach-gateway mcp disable github

# Remove a server
attach-gateway mcp remove github
```
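
The `env:` prefix in header values (for example `env:GITHUB_TOKEN`) suggests the gateway substitutes an environment variable at request time. The sketch below shows one plausible reading; `resolve_headers` is a hypothetical helper, and both the verbatim substitution and the choice to skip a header whose variable is unset are assumptions.

```python
from __future__ import annotations

import logging
from typing import Mapping

logger = logging.getLogger("attach.mcp")

_ENV_PREFIX = "env:"


def resolve_headers(headers: Mapping[str, str], env: Mapping[str, str]) -> dict[str, str]:
    """Replace "env:NAME" values with env["NAME"]; pass other values through."""
    resolved: dict[str, str] = {}
    for name, value in headers.items():
        if not value.startswith(_ENV_PREFIX):
            resolved[name] = value
            continue
        var = value[len(_ENV_PREFIX):]
        if var not in env:
            # Assumed behaviour: warn and omit rather than crash the gateway.
            logger.warning("header %s references unset variable %s; omitting", name, var)
            continue
        resolved[name] = env[var]
    return resolved
```

Storing only the variable name in `mcp.json` keeps the secret itself out of the config file.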

### How It Works

1. Claude Code sends JSON-RPC requests to `http://localhost:8080/mcp/{server}`
2. Gateway validates your JWT Bearer token
3. Gateway checks quota limits (if enabled)
4. Gateway forwards request to configured upstream MCP server
5. Gateway logs metadata (timestamp, user, tool, latency) to local SQLite
6. Response is returned to Claude Code
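
Steps 4 to 6 can be sketched as a forward-then-log wrapper. This is an illustration, not the gateway's code: the table name `mcp_audit`, its columns, and the functions `open_audit_db` and `handle_call` are hypothetical, the `forward` callable stands in for the real HTTP call to the upstream server, and JWT validation and quota checks (steps 2 and 3) are assumed to have already passed.

```python
from __future__ import annotations

import sqlite3
import time
from typing import Callable


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a local SQLite audit log holding metadata only."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS mcp_audit ("
        "ts REAL, user_sub TEXT, server TEXT, tool TEXT, latency_ms REAL, status TEXT)"
    )
    return db


def handle_call(
    db: sqlite3.Connection,
    user_sub: str,
    server: str,
    tool: str,
    payload: dict,
    forward: Callable[[str, dict], dict],
) -> dict:
    """Forward the request upstream, then record timing and outcome."""
    started = time.monotonic()
    status = "ok"
    try:
        return forward(server, payload)
    except Exception:
        status = "error"
        raise
    finally:
        latency_ms = (time.monotonic() - started) * 1000.0
        # Only metadata is stored; the payload and response are never written.
        db.execute(
            "INSERT INTO mcp_audit VALUES (?, ?, ?, ?, ?, ?)",
            (time.time(), user_sub, server, tool, latency_ms, status),
        )
        db.commit()
```

Writing the audit row in `finally` means failed upstream calls are logged too, with `status` set to `error`.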

**Security Notes:**
- `/mcp/*` endpoints require Bearer JWT authentication
- Console UI data endpoints (`/console/api/*`) require JWT
- Console landing page and static assets are unauthenticated (no sensitive data)
- Audit logs contain metadata only (no request/response bodies)
- All data stays local by default (no phone-home)

---

## Use in your project

1. Copy `.env.example` → `.env` and fill in OIDC + backend URLs