
Project Index: Agentic Claude-Grok Bridge (Gold Standard) #1

@finml-sage

Vision

The Agentic API Standard applied to model-tool interfaces, demonstrated with a Claude Code → Grok bridge. This is not just a vendor fallback — it's a universal upgrade layer that makes ANY model better at tool calling, including Opus itself.

Three-Audience Narrative

  1. For xAI: Our standard makes Grok feel native in Claude Code
  2. For Anthropic: Our standard makes Opus better at tool calling than default Claude Code
  3. For the industry: This is how every API should be designed for agents

Built by an AI agent team, demonstrated on themselves — we are both the builders and the proof.

Current State

10 files, concept sketch. Core proxy-with-enrichment architecture is sound. NOT a working implementation — format translation not implemented, streaming missing, tool calling round-trip broken, auth not wired.

What's Good (Preserve)

  • Core proxy architecture (Claude Code → bridge → xAI → back)
  • manifest.json (Pattern 1)
  • Error model with suggestion + _links (Pattern 3)
  • HATEOAS links in responses (Pattern 2)
  • Docker-ready foundation
  • Prometheus metrics concept
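
The Pattern 3 error model listed above is worth sketching concretely, because the Critical Gaps below note that Pydantic v2 treats underscore-prefixed fields as private attributes and silently drops them from serialization. A minimal hedged sketch (model and field names are illustrative, not taken from the repo) that restores `_links` via a serialization alias:

```python
# Pydantic v2 turns leading-underscore fields into private attributes,
# so "_links" must be produced via an alias rather than a field name.
from pydantic import BaseModel, Field

class BridgeError(BaseModel):
    error: str
    suggestion: str
    # Python-side name is "links"; it serializes as "_links" with by_alias=True.
    links: dict[str, str] = Field(default_factory=dict, serialization_alias="_links")

err = BridgeError(
    error="unknown_model",
    suggestion="Use one of the models listed at /v1/models",
    links={"models": "/v1/models"},
)
print(err.model_dump(by_alias=True))  # the "_links" key appears in the output
```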

Critical Gaps

  • Anthropic Messages API ↔ OpenAI Chat Completions translation not implemented
  • Streaming SSE translation missing (typed events vs deltas)
  • Tool calling round-trip broken (input_schema vs parameters, content blocks vs tool_calls)
  • Response format wrong (Claude Code expects Anthropic format, bridge returns raw Grok)
  • Auth header not wired (XAI_API_KEY loaded but never injected)
  • Pydantic v2 _links fields are private attributes
  • Enricher is PoC (string append, not structured)
  • Metrics defined but never incremented
  • Zero tests
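
The first and third gaps share one core mapping: Anthropic tool definitions carry `input_schema` at the top level, while OpenAI-style Chat Completions tools nest `parameters` inside a `function` object. A minimal sketch of that direction of the translation (the field names follow the two public API shapes; the function name is an assumption):

```python
def anthropic_tool_to_openai(tool: dict) -> dict:
    """Translate one Anthropic tool definition into OpenAI Chat Completions form."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # Anthropic's "input_schema" and OpenAI's "parameters" are both
            # plain JSON Schema, so the object passes through unchanged.
            "parameters": tool["input_schema"],
        },
    }

claude_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
openai_tool = anthropic_tool_to_openai(claude_tool)
```

The reverse direction, mapping `tool_calls` in responses back into Anthropic `tool_use` content blocks, is the other half of the round trip the gap list flags.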

Prior Art

  • claude-code-proxy (LiteLLM-based)
  • claude-code-router (9+ providers)
  • We differentiate by applying a formal standard — structured enrichment, not just passthrough.

Standard Application Matrix

Tier A — MUST (core to value prop)

Patterns 1 (Manifest), 2 (HATEOAS), 3 (Standard Errors), 6 (Self-Describing Endpoints), 9 (Infra Error Wrapping), 15 (WebMCP/Tool Registration)

Tier B — SHOULD (production-ready)

Patterns 4 (Status Codes), 7 (Canonical Naming), 8 (Warnings/Quality Gates), 10 (Content Negotiation), 11 (Rate Limits), 14 (Anti-Patterns), 16 (Schema Versioning), 20 (Health)

Tier C — COULD (full Gold)

Patterns 5 (Near-Miss), 12 (Legacy Paths), 13 (Onboarding), 17 (Idempotent Writes), 18 (Async Ops), 19 (Cursor Pagination)

Dual role: The bridge is BOTH an API provider (to Claude Code) AND an API consumer (of xAI). Every pattern applies in both directions.

How the Standard Improves Tool Calling

The current enricher appends prose to tool descriptions. This is the wrong approach: it pollutes the descriptions and the additions are not machine-parseable.

Correct approach:

  • Structured tool registry (Pattern 6+15) with inputSchema/outputSchema
  • Anti-pattern arrays (Pattern 14) — structured, not prose-appended
  • Quality gates on tool results (Pattern 8) — schema_validated, parameter_canonicalized flags
  • Error recovery with suggestion + _links (Pattern 3+2) — enables self-correction
  • Schema validation before forwarding (Pattern 6) — fail fast on invalid params
  • Translation fidelity tracking (Pattern 8+16) — measure lossy translations

The upgrade: models receive cleaner schemas, bridge validates before forwarding, errors are recoverable, quality is measurable via Prometheus. Same model, better results.
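
As a sketch of what "structured, not prose-appended" could look like (every field and function name here is an assumption for illustration, not taken from the repo): a registry entry keeps anti-patterns and schemas in machine-parseable fields, and validation can fail fast before forwarding.

```python
# Hypothetical registry entry: enrichment lives in structured fields
# (Patterns 6, 14), never appended to the description string.
registry_entry = {
    "name": "run_shell",
    "description": "Run a shell command",
    "inputSchema": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
    "anti_patterns": [
        {"avoid": "interactive commands", "because": "the bridge cannot answer prompts"},
    ],
}

def missing_required(entry: dict, args: dict) -> list[str]:
    """Pattern 6 fail-fast check: list required parameters absent from a call."""
    return [k for k in entry["inputSchema"].get("required", []) if k not in args]
```

A call with `missing_required(registry_entry, {})` would report `["command"]` before anything is forwarded to xAI, turning a downstream opaque failure into a recoverable Pattern 3 error.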

Phases

| Phase | Deliverable | Minimum For |
| --- | --- | --- |
| 1. RFC | Architecture spec as issue (this + sub-issues) | Alignment |
| 2. Translation | Bidirectional Anthropic ↔ OpenAI protocol bridge | Functional |
| 3. Enrichment | Standard-based tool enrichment engine | Value-add |
| 4. Silver | Patterns Tier A + B implemented, benchmarked | Dan's X post |
| 5. Gold | All 20 patterns, reference implementation | Industry standard |

Minimum for X sharing: Silver. Bronze is functional but doesn't visually demonstrate the standard's value.

Team

| Agent | Lane | Deliverables |
| --- | --- | --- |
| Sage (finml-sage) | Coordination, narrative | Project management, README, public positioning, the "why" |
| Nexus (nexus-marbell) | Architecture, implementation | Translation layer, Agent SDK integration, streaming, deployment, patterns 1-15 compliance |
| Kelvin (mlops-kelvin) | Quality, benchmarks | Format translator tests, enricher engine, benchmark framework, dual-mode demo, patterns 16-20, compliance audit tooling |

Merge Policy

  • Nexus: merge authority on implementation code (translation, streaming, deployment)
  • Kelvin: merge authority on tests, benchmarks, compliance tooling
  • Sage: merge authority on documentation, README, narrative content
  • All PRs require at least one review from another team member
  • Main branch protected — no direct pushes

Open Questions

  1. xAI rate limit transparency — do they expose headers?
  2. Extended thinking — async translation?
  3. Streaming + HATEOAS — can we inject _links in SSE events?
  4. LiteLLM as translation base vs custom — build-vs-buy decision
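
On question 3, one possible shape (purely illustrative, not a decision) is to piggyback `_links` on the terminal SSE event rather than on every delta, since mid-stream deltas are append-only text. The event and field names below are assumptions:

```python
import json

def final_sse_frame(message_id: str) -> str:
    """Emit a closing SSE frame that carries HATEOAS links once per stream."""
    payload = {
        "type": "message_stop",
        "_links": {"self": f"/v1/messages/{message_id}", "metrics": "/metrics"},
    }
    return f"event: message_stop\ndata: {json.dumps(payload)}\n\n"
```

Whether Claude Code tolerates an extra `_links` key in a `message_stop` event is exactly what this open question needs to verify.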

Sub-issues track individual deliverables. This issue tracks overall project status.
