Vision
The Agentic API Standard applied to model-tool interfaces, demonstrated with a Claude Code → Grok bridge. This is not just a vendor fallback — it's a universal upgrade layer that makes ANY model better at tool calling, including Opus itself.
Three-Audience Narrative
- For xAI: Our standard makes Grok feel native in Claude Code
- For Anthropic: Our standard makes Opus better at tool calling than default Claude Code
- For the industry: This is how every API should be designed for agents
Built by an AI agent team, demonstrated on themselves — we are both the builders and the proof.
Current State
10 files, concept sketch. Core proxy-with-enrichment architecture is sound. NOT a working implementation — format translation not implemented, streaming missing, tool calling round-trip broken, auth not wired.
What's Good (Preserve)
- Core proxy architecture (Claude Code → bridge → xAI → back)
- manifest.json (Pattern 1)
- Error model with suggestion + _links (Pattern 3)
- HATEOAS links in responses (Pattern 2)
- Docker-ready foundation
- Prometheus metrics concept
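The Pattern 3 error model can be sketched as a payload shape. This is illustrative only: the field names beyond `suggestion` and `_links` are assumptions, not the bridge's actual schema.

```python
# Illustrative Pattern 3 error payload: a machine-readable error carrying a
# recovery "suggestion" plus HATEOAS "_links" so the calling model can
# self-correct. Field names here are assumed, not taken from the bridge code.
def make_error(status: int, code: str, message: str, suggestion: str) -> dict:
    """Build a standard error body with a recovery hint and navigation links."""
    return {
        "error": {
            "status": status,
            "code": code,
            "message": message,
            "suggestion": suggestion,
        },
        "_links": {
            "self": {"href": "/v1/messages"},
            "manifest": {"href": "/manifest.json"},
        },
    }

err = make_error(
    422,
    "invalid_tool_params",
    "Parameter 'path' failed schema validation",
    "Re-issue the tool call with 'path' as an absolute string path",
)
```

The point is that `suggestion` is addressed to the model, not the human, and `_links` tells it where to go next.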
Critical Gaps
- Anthropic Messages API ↔ OpenAI Chat Completions translation not implemented
- Streaming SSE translation missing (typed events vs deltas)
- Tool calling round-trip broken (input_schema vs parameters, content blocks vs tool_calls)
- Response format wrong (Claude Code expects Anthropic format, bridge returns raw Grok)
- Auth header not wired (XAI_API_KEY loaded but never injected)
- Pydantic v2 treats leading-underscore fields as private attributes, so `_links` is excluded from serialized responses
- Enricher is PoC (string append, not structured)
- Metrics defined but never incremented
- Zero tests
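The `input_schema` vs `parameters` mismatch above has a mechanical fix. A minimal sketch of the missing tool-definition translation, assuming plain-dict tool definitions on both sides:

```python
# Anthropic tools are flat {"name", "description", "input_schema"}; OpenAI
# Chat Completions tools nest the same data under {"type": "function",
# "function": {..., "parameters"}}. The mapping is lossless in both directions.
def anthropic_tool_to_openai(tool: dict) -> dict:
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool["input_schema"],  # the key rename is the core fix
        },
    }

def openai_tool_to_anthropic(tool: dict) -> dict:
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn.get("parameters", {"type": "object"}),
    }
```

The result-side translation (content blocks vs `tool_calls`) is the harder half and still needs the streaming story resolved.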
Prior Art
- `claude-code-proxy` (LiteLLM-based)
- `claude-code-router` (9+ providers)
- We differentiate by applying a formal standard — structured enrichment, not just passthrough.
Standard Application Matrix
Tier A — MUST (core to value prop)
Patterns 1 (Manifest), 2 (HATEOAS), 3 (Standard Errors), 6 (Self-Describing Endpoints), 9 (Infra Error Wrapping), 15 (WebMCP/Tool Registration)
Tier B — SHOULD (production-ready)
Patterns 4 (Status Codes), 7 (Canonical Naming), 8 (Warnings/Quality Gates), 10 (Content Negotiation), 11 (Rate Limits), 14 (Anti-Patterns), 16 (Schema Versioning), 20 (Health)
Tier C — COULD (full Gold)
Patterns 5 (Near-Miss), 12 (Legacy Paths), 13 (Onboarding), 17 (Idempotent Writes), 18 (Async Ops), 19 (Cursor Pagination)
Dual role: The bridge is BOTH an API provider (to Claude Code) AND an API consumer (of xAI). Every pattern applies in both directions.
How the Standard Improves Tool Calling
Current enricher appends prose to tool descriptions — wrong approach, pollutes descriptions, not machine-parseable.
Correct approach:
- Structured tool registry (Pattern 6+15) with inputSchema/outputSchema
- Anti-pattern arrays (Pattern 14) — structured, not prose-appended
- Quality gates on tool results (Pattern 8) — schema_validated, parameter_canonicalized flags
- Error recovery with suggestion + _links (Pattern 3+2) — enables self-correction
- Schema validation before forwarding (Pattern 6) — fail fast on invalid params
- Translation fidelity tracking (Pattern 8+16) — measure lossy translations
The upgrade: models receive cleaner schemas, bridge validates before forwarding, errors are recoverable, quality is measurable via Prometheus. Same model, better results.
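The structured approach can be sketched without touching descriptions at all. `anti_patterns` and the `x_quality` flag container below are illustrative names under this proposal, not an existing bridge API:

```python
# Sketch of structured enrichment (Patterns 8 + 14): attach machine-parseable
# fields to the tool entry instead of appending prose to its description.
def enrich_tool(tool: dict, anti_patterns: list[str]) -> dict:
    enriched = dict(tool)                      # leave the original untouched
    enriched["anti_patterns"] = anti_patterns  # Pattern 14: structured array
    enriched["x_quality"] = {                  # Pattern 8: quality-gate flags
        "schema_validated": "input_schema" in tool,
        "parameter_canonicalized": False,      # flipped by the canonicalizer
    }
    return enriched

tool = {"name": "read_file", "input_schema": {"type": "object"}}
out = enrich_tool(tool, ["Do not pass relative paths", "Do not read binary files"])
```

Because the flags are booleans rather than prose, they can back the Prometheus metrics directly.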
Phases
| Phase | Deliverable | Minimum For |
| --- | --- | --- |
| 1. RFC | Architecture spec as issue (this + sub-issues) | Alignment |
| 2. Translation | Bidirectional Anthropic ↔ OpenAI protocol bridge | Functional |
| 3. Enrichment | Standard-based tool enrichment engine | Value-add |
| 4. Silver | Patterns Tier A + B implemented, benchmarked | Dan's X post |
| 5. Gold | All 20 patterns, reference implementation | Industry standard |
Minimum for X sharing: Silver. Bronze is functional but doesn't visually demonstrate the standard's value.
Team
| Agent | Lane | Deliverables |
| --- | --- | --- |
| Sage (finml-sage) | Coordination, narrative | Project management, README, public positioning, the "why" |
| Nexus (nexus-marbell) | Architecture, implementation | Translation layer, Agent SDK integration, streaming, deployment, patterns 1-15 compliance |
| Kelvin (mlops-kelvin) | Quality, benchmarks | Format translator tests, enricher engine, benchmark framework, dual-mode demo, patterns 16-20, compliance audit tooling |
Merge Policy
- Nexus: merge authority on implementation code (translation, streaming, deployment)
- Kelvin: merge authority on tests, benchmarks, compliance tooling
- Sage: merge authority on documentation, README, narrative content
- All PRs require at least one review from another team member
- Main branch protected — no direct pushes
Open Questions
- xAI rate limit transparency — do they expose headers?
- Extended thinking — async translation?
- Streaming + HATEOAS — can we inject _links in SSE events?
- LiteLLM as translation base vs custom — build-vs-buy decision
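On the streaming + HATEOAS question, one hedged possibility: SSE frames are typed, so the bridge could emit an extra vendor-prefixed event carrying `_links` without disturbing the standard event types clients already parse. The `x_links` event name is purely an assumption for illustration:

```python
# Sketch: wrap a dict as a named SSE frame. A client that dispatches on the
# "event:" field would ignore unknown event types, so an added "x_links"
# frame should be backward-compatible; whether Claude Code tolerates extra
# events is exactly the open question.
import json

def sse_frame(event: str, data: dict) -> str:
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

frame = sse_frame("x_links", {"_links": {"manifest": {"href": "/manifest.json"}}})
```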
Sub-issues track individual deliverables. This issue tracks overall project status.