Support spec v1.11.0: OpenTelemetry OTLP tracing configuration #3187
Description
Context
gh-aw#24602 was merged, extending the MCP Gateway Specification from v1.10.0 to v1.11.0. It adds an optional opentelemetry configuration object to the gateway config section. When configured, the gateway must emit distributed tracing spans for each MCP tool invocation using OTLP/HTTP.
Related: #3177 — our existing feature issue for OTLP tracing (architecture analysis + solution proposal). This issue focuses specifically on spec compliance with v1.11.0.
What the spec requires (§4.1.3.6)
Config schema (opentelemetry object in gateway)
| Field | Type | Required | Description |
|---|---|---|---|
| endpoint | string | Yes | OTLP/HTTP collector URL. MUST be HTTPS. Supports ${VAR} expansion. |
| headers | object | No | HTTP headers for export requests (e.g., auth tokens). Values support ${VAR}. |
| traceId | string | No | Parent trace ID (32-char lowercase hex, W3C format). Supports ${VAR}. |
| spanId | string | No | Parent span ID (16-char lowercase hex, W3C format). Ignored without traceId. Supports ${VAR}. |
| serviceName | string | No | service.name resource attribute. Default: "mcp-gateway". |
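To make the schema concrete, here is a minimal sketch of decoding a JSON config that carries the opentelemetry object and applying the default serviceName. The type names (otelConfig, stdinConfig), the parseOTelConfig helper, the collector URL, and the OTEL_TOKEN variable are all hypothetical and chosen for illustration; they are not the gateway's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// otelConfig mirrors the spec fields; the type name is hypothetical.
type otelConfig struct {
	Endpoint    string            `json:"endpoint"`
	Headers     map[string]string `json:"headers,omitempty"`
	TraceID     string            `json:"traceId,omitempty"`
	SpanID      string            `json:"spanId,omitempty"`
	ServiceName string            `json:"serviceName,omitempty"`
}

// stdinConfig models only the part of the gateway config used here.
type stdinConfig struct {
	Gateway struct {
		OpenTelemetry *otelConfig `json:"opentelemetry,omitempty"`
	} `json:"gateway"`
}

// parseOTelConfig returns nil when the opentelemetry object is omitted,
// which the spec allows, and applies the default serviceName otherwise.
func parseOTelConfig(raw []byte) (*otelConfig, error) {
	var cfg stdinConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	otel := cfg.Gateway.OpenTelemetry
	if otel == nil {
		return nil, nil
	}
	if otel.ServiceName == "" {
		otel.ServiceName = "mcp-gateway"
	}
	return otel, nil
}

// sampleConfig uses a placeholder collector URL and token variable.
const sampleConfig = `{
  "gateway": {
    "opentelemetry": {
      "endpoint": "https://collector.example.com/v1/traces",
      "headers": {"Authorization": "Bearer ${OTEL_TOKEN}"},
      "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
      "spanId": "00f067aa0ba902b7"
    }
  }
}`

func main() {
	cfg, err := parseOTelConfig([]byte(sampleConfig))
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println(cfg.Endpoint, cfg.ServiceName)
}
```

The sketch leaves ${VAR} references unexpanded, on the assumption that the gateway's existing expansion step runs separately.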
Required tracing behavior
When opentelemetry is configured, the gateway MUST:
- Create a root span for the gateway process lifetime with service.name set to serviceName
- Create a child span per MCP tool invocation with attributes:
  - mcp.server: server name from config
  - mcp.method: JSON-RPC method (e.g., tools/call)
  - mcp.tool: tool name
  - http.status_code: HTTP status of proxied response
- Record accurate start/end timestamps
- Export via OTLP/HTTP to the configured endpoint
- Apply configured headers to every export request
- Propagate W3C traceparent when traceId/spanId provided
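As an illustration of the traceparent requirement, the sketch below formats a W3C traceparent value from a configured traceId and spanId, and generates a random span ID when only traceId is given (the case T-OTEL-008 covers). The function names are hypothetical, and the trace-flags value 01 (sampled) is an assumption. A real implementation would more likely build the parent span context through the OpenTelemetry SDK than format the header by hand.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newSpanID returns 8 random bytes as 16 lowercase hex characters.
func newSpanID() (string, error) {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// buildTraceparent formats a W3C traceparent value as
// version-traceid-parentid-flags. It returns an empty string when no
// traceID is configured, since spanId is ignored without traceId.
// The "01" (sampled) flag is an assumption of this sketch.
func buildTraceparent(traceID, spanID string) (string, error) {
	if traceID == "" {
		return "", nil
	}
	if spanID == "" {
		id, err := newSpanID()
		if err != nil {
			return "", err
		}
		spanID = id
	}
	return fmt.Sprintf("00-%s-%s-01", traceID, spanID), nil
}

func main() {
	tp, err := buildTraceparent("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(tp)
}
```

The inputs are assumed to have already passed the hex-format validation described under Validation rules.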
Failure handling
- Gateway MUST NOT fail to start if collector is unreachable
- Export failures SHOULD be logged as warnings, MUST NOT affect MCP processing
- SHOULD implement exponential backoff retry
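A rough sketch of these failure-handling rules, using only the standard library: failed exports are logged as warnings, retried with exponentially growing delays, and reported to the caller as a boolean rather than an error, so nothing propagates into MCP processing. The names backoffDelay, exportWithRetry, and errCollectorDown are hypothetical. In practice the OTLP exporter's own retry settings would likely replace this loop.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"time"
)

// errCollectorDown is a stand-in error for this demonstration.
var errCollectorDown = errors.New("collector unreachable")

// backoffDelay returns base doubled once per prior attempt, capped at max.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// exportWithRetry calls export up to attempts times. Failures are logged
// as warnings and never returned as errors, so a dead collector cannot
// fail the caller. It reports whether any attempt succeeded.
func exportWithRetry(export func() error, attempts int, base, max time.Duration) bool {
	for i := 0; i < attempts; i++ {
		err := export()
		if err == nil {
			return true
		}
		log.Printf("warning: OTLP export failed (attempt %d/%d): %v", i+1, attempts, err)
		if i < attempts-1 {
			time.Sleep(backoffDelay(i, base, max))
		}
	}
	return false
}

func main() {
	calls := 0
	ok := exportWithRetry(func() error {
		calls++
		if calls < 3 {
			return errCollectorDown
		}
		return nil
	}, 5, 10*time.Millisecond, time.Second)
	fmt.Println("exported:", ok, "after", calls, "attempts")
}
```

Running the retry loop off the request path (for example from a background exporter goroutine) is what keeps the sleeps from delaying tool calls.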
Validation rules
- endpoint required when opentelemetry present
- endpoint MUST be HTTPS
- traceId must be 32-char lowercase hex (or ${VAR})
- spanId must be 16-char lowercase hex (or ${VAR})
- spanId without traceId → log warning, ignore
Gap analysis: current gateway vs spec
Existing TracingConfig (from recent work)
type TracingConfig struct {
Endpoint string `toml:"endpoint" json:"endpoint,omitempty"`
ServiceName string `toml:"service_name" json:"service_name,omitempty"`
SampleRate *float64 `toml:"sample_rate" json:"sample_rate,omitempty"`
}
What needs to change
| Area | Current State | Required by Spec | Work |
|---|---|---|---|
| Config fields | endpoint, service_name, sample_rate | Add headers, traceId, spanId | Add 3 fields to TracingConfig |
| TOML field name | [gateway.tracing] | Spec uses opentelemetry | Add TOML alias or rename |
| JSON stdin config | Not wired | opentelemetry in StdinGatewayConfig | Add StdinOpenTelemetryConfig + conversion |
| Endpoint validation | None | MUST be HTTPS | Add validation in validation.go |
| traceId/spanId validation | N/A | 32/16-char hex regex | Add validation |
| Variable expansion | All fields support ${VAR} via existing expansion | Spec requires it | Already supported ✅ |
| W3C trace context | Not implemented | Construct traceparent from traceId+spanId | New: build parent context |
| Headers | Not supported | Pass to OTLP exporter | Thread through exporter config |
| Root span | Not implemented | Process-lifetime root span | New: create at startup |
| Tool call spans | Not implemented | Per-invocation with mcp.* attributes | New: instrument callBackendTool() |
| OTLP export | Not implemented | OTLP/HTTP to endpoint | New: init TracerProvider with OTLP exporter |
| Failure isolation | Not implemented | Export errors must not affect MCP | Use batch processor + noop fallback |
| SampleRate | In config | Not in spec (but harmless) | Keep as extension field |
| Compliance tests | None | T-OTEL-001 through T-OTEL-010 | 10 new tests |
Proposed implementation plan
1. Config alignment (~80 lines)
Update TracingConfig to match spec §4.1.3.6:
type TracingConfig struct {
Endpoint string `toml:"endpoint" json:"endpoint,omitempty"`
Headers map[string]string `toml:"headers" json:"headers,omitempty"`
TraceID string `toml:"trace_id" json:"traceId,omitempty"`
SpanID string `toml:"span_id" json:"spanId,omitempty"`
ServiceName string `toml:"service_name" json:"serviceName,omitempty"`
SampleRate *float64 `toml:"sample_rate" json:"sampleRate,omitempty"`
}
Add an opentelemetry alias in TOML: the spec uses opentelemetry, while our current config uses tracing. Support both for backward compatibility.
Wire into StdinGatewayConfig + convertStdinConfig().
2. Validation (~60 lines)
In validation.go:
- endpoint must be HTTPS (when present)
- traceId must match ^[0-9a-f]{32}$ (after variable expansion)
- spanId must match ^[0-9a-f]{16}$ (after variable expansion)
- spanId without traceId → warning
- endpoint required when opentelemetry object is present
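The rules above could be sketched roughly as follows. The otelSettings type and validateOTel function are hypothetical names, not the existing validation.go API. The sketch assumes ${VAR} expansion has already run, and it skips the format check on a spanId that will be ignored anyway.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"regexp"
)

var (
	traceIDPattern = regexp.MustCompile(`^[0-9a-f]{32}$`)
	spanIDPattern  = regexp.MustCompile(`^[0-9a-f]{16}$`)
)

// otelSettings holds already-expanded values; the type is hypothetical.
type otelSettings struct {
	Endpoint, TraceID, SpanID string
}

// validateOTel returns an error for hard violations and warnings for
// conditions the spec says to log and ignore.
func validateOTel(cfg otelSettings) (warnings []string, err error) {
	if cfg.Endpoint == "" {
		return nil, errors.New("opentelemetry.endpoint is required")
	}
	// url.Parse lowercases the scheme, so "HTTPS://" also passes.
	u, perr := url.Parse(cfg.Endpoint)
	if perr != nil || u.Scheme != "https" || u.Host == "" {
		return nil, fmt.Errorf("opentelemetry.endpoint must be an HTTPS URL, got %q", cfg.Endpoint)
	}
	if cfg.TraceID != "" && !traceIDPattern.MatchString(cfg.TraceID) {
		return nil, fmt.Errorf("opentelemetry.traceId must be 32 lowercase hex chars, got %q", cfg.TraceID)
	}
	if cfg.SpanID != "" {
		if cfg.TraceID == "" {
			warnings = append(warnings, "opentelemetry.spanId ignored: traceId is not set")
		} else if !spanIDPattern.MatchString(cfg.SpanID) {
			return nil, fmt.Errorf("opentelemetry.spanId must be 16 lowercase hex chars, got %q", cfg.SpanID)
		}
	}
	return warnings, nil
}

func main() {
	warnings, err := validateOTel(otelSettings{
		Endpoint: "https://collector.example.com/v1/traces",
		SpanID:   "00f067aa0ba902b7",
	})
	fmt.Println("warnings:", warnings, "err:", err)
}
```

Returning warnings separately from the error lets the caller log them without treating the config as invalid, which matches the "log warning, ignore" behavior for an orphaned spanId.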
3. OTLP exporter + TracerProvider (~100 lines)
New internal/tracing/ package:
- InitTracerProvider(cfg *config.TracingConfig) → returns *sdktrace.TracerProvider or noop
- Uses OTLP/HTTP exporter with configured endpoint + headers
- Batch span processor for asynchronous export (retry with backoff is handled by the OTLP/HTTP exporter's retry settings, not by the processor)
- Constructs W3C parent context from traceId + spanId if provided
- Graceful shutdown via TracerProvider.Shutdown(ctx)
4. Instrumentation (~80 lines)
In callBackendTool() (unified.go):
ctx, span := tracer.Start(ctx, "mcp.tool_call",
trace.WithAttributes(
attribute.String("mcp.server", serverID),
attribute.String("mcp.method", "tools/call"),
attribute.String("mcp.tool", toolName),
))
defer span.End()
Root span created at startup in cmd/root.go.
5. Compliance tests (~200 lines)
T-OTEL-001 through T-OTEL-010 as described in spec §10.1.10.
New dependencies
go.opentelemetry.io/otel
go.opentelemetry.io/otel/sdk
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp
Compliance test mapping
| Test ID | Description | Type |
|---|---|---|
| T-OTEL-001 | Gateway starts when opentelemetry omitted | Config |
| T-OTEL-002 | Gateway starts with valid endpoint | Config |
| T-OTEL-003 | Reject missing endpoint | Validation |
| T-OTEL-004 | Reject non-HTTPS endpoint | Validation |
| T-OTEL-005 | Span per tool call with required attributes | Integration |
| T-OTEL-006 | Headers sent with OTLP export | Integration |
| T-OTEL-007 | W3C traceparent with traceId + spanId | Integration |
| T-OTEL-008 | Random spanId when only traceId provided | Integration |
| T-OTEL-009 | Export failure doesn't affect MCP | Resilience |
| T-OTEL-010 | serviceName in service.name attribute | Integration |
References
- Spec PR: gh-aw#24602
- Architecture analysis: #3177
- MCP Gateway Spec §4.1.3.6
- W3C Trace Context
- OTLP/HTTP Specification