diff --git a/ci/vale/styles/config/vocabularies/nat/accept.txt b/ci/vale/styles/config/vocabularies/nat/accept.txt
index 05c9bb5c95..e67d6c1d0c 100644
--- a/ci/vale/styles/config/vocabularies/nat/accept.txt
+++ b/ci/vale/styles/config/vocabularies/nat/accept.txt
@@ -20,6 +20,7 @@ Authlib
 [Bb]ackpressure
 [Bb]atcher
 [Bb]oolean
+Braintrust
 [Bb]rev
 [Cc]allable(s?)
 # Documentation for ccache only capitalizes the name at the start of a sentence https://ccache.dev/
diff --git a/docs/source/_static/braintrust-trace.png b/docs/source/_static/braintrust-trace.png
new file mode 100644
index 0000000000..59724848c5
Binary files /dev/null and b/docs/source/_static/braintrust-trace.png differ
diff --git a/docs/source/run-workflows/observe/observe-workflow-with-braintrust.md b/docs/source/run-workflows/observe/observe-workflow-with-braintrust.md
new file mode 100644
index 0000000000..4ca41073fb
--- /dev/null
+++ b/docs/source/run-workflows/observe/observe-workflow-with-braintrust.md
@@ -0,0 +1,126 @@
+
+
+# Observing a Workflow with Braintrust
+
+This guide provides a step-by-step process to enable observability in a NeMo Agent Toolkit workflow using Braintrust for tracing. By the end of this guide, you will have:
+
+- Configured telemetry in your workflow.
+- The ability to view traces in the Braintrust platform.
+
+## Step 1: Create a Braintrust Account
+
+1. Visit [https://www.braintrust.dev](https://www.braintrust.dev) and sign up for an account.
+2. Once logged in, navigate to your organization settings to generate an API key.
+
+## Step 2: Create a Project
+
+Create a new project in Braintrust to organize your traces:
+
+1. Navigate to the Braintrust dashboard.
+2. Click on **Projects** in the sidebar.
+3. Click **+ New Project**.
+4. Name your project (e.g., `nat-calculator`).
+5. Note down the **Project Name** for configuration.
+
+## Step 3: Configure Your Environment
+
+Set the following environment variable in your terminal:
+
+```bash
+export BRAINTRUST_API_KEY=<your_api_key>
+```
+
+Alternatively, you can provide the API key directly in your workflow configuration.
+
+## Step 4: Install the NeMo Agent Toolkit OpenTelemetry Subpackages
+
+```bash
+# Install the telemetry extras required for Braintrust
+uv pip install -e '.[opentelemetry]'
+```
+
+## Step 5: Modify NeMo Agent Toolkit Workflow Configuration
+
+Update your workflow configuration file to include the telemetry settings.
+
+Example configuration:
+```yaml
+general:
+  telemetry:
+    tracing:
+      braintrust:
+        _type: braintrust
+        project: nat-calculator
+        # Optional: Override the default endpoint for self-hosted deployments
+        # endpoint: https://api.braintrust.dev/otel/v1/traces
+```
+
+You can also specify the API key directly in the configuration:
+```yaml
+general:
+  telemetry:
+    tracing:
+      braintrust:
+        _type: braintrust
+        project: nat-calculator
+        api_key: ${BRAINTRUST_API_KEY}
+```
+
+## Step 6: Run the Workflow
+
+From the root directory of the NeMo Agent Toolkit library, install dependencies and run the pre-configured `simple_calculator_observability` example.
+
+**Example:**
+
+```bash
+# Install the workflow and plugins
+uv pip install -e examples/observability/simple_calculator_observability/
+
+# Run the workflow with Braintrust telemetry settings
+nat run --config_file examples/observability/simple_calculator_observability/configs/config-braintrust.yml --input "What is 1*2?"
+```
+
+As the workflow runs, telemetry data appears in Braintrust.
+
+## Step 7: Analyze Trace Data in Braintrust
+
+Analyze the traces in Braintrust:
+
+1. Navigate to [https://www.braintrust.dev](https://www.braintrust.dev) and log in.
+2. Go to **Projects** > **nat-calculator** (or your project name).
+3. Click on **Logs** to view your traces.
+4. Select any trace to view detailed span information, inputs, outputs, and timing data.
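Before export, the plugin added later in this PR renames OpenInference span attributes to Braintrust's GenAI conventions, which is why trace fields show up in Braintrust as `gen_ai.*`. A standalone sketch of that renaming (the mapping table is copied from the plugin's `BRAINTRUST_ATTRIBUTE_MAPPINGS`; a plain dict stands in for a span's attributes):

```python
# Sketch of the attribute renaming the Braintrust exporter performs before
# export. The mapping table mirrors BRAINTRUST_ATTRIBUTE_MAPPINGS in the
# plugin; a plain dict stands in for a span's attributes here.
ATTRIBUTE_MAPPINGS = {
    "input.value": "gen_ai.prompt",
    "output.value": "gen_ai.completion",
    "llm.token_count.prompt": "gen_ai.usage.prompt_tokens",
    "llm.token_count.completion": "gen_ai.usage.completion_tokens",
    "llm.token_count.total": "gen_ai.usage.total_tokens",
    "llm.model_name": "gen_ai.request.model",
}

def rename_attributes(attrs: dict) -> dict:
    """Return a copy of attrs with OpenInference keys renamed to GenAI keys."""
    out = dict(attrs)
    for old_key, new_key in ATTRIBUTE_MAPPINGS.items():
        if old_key in out:
            out[new_key] = out.pop(old_key)
    return out

print(rename_attributes({"input.value": "What is 1*2?", "llm.model_name": "gpt-4o-mini"}))
# {'gen_ai.prompt': 'What is 1*2?', 'gen_ai.request.model': 'gpt-4o-mini'}
```

Unlike this sketch, the real exporter edits span attributes in place; copying here just keeps the illustration side-effect free.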
+ +```{figure} /_static/braintrust-trace.png +:alt: Braintrust Trace View +:align: center + +Example trace view in Braintrust showing workflow spans, inputs, outputs, and timing data. +``` + +## Additional Features + +Braintrust provides additional observability features beyond basic tracing: + +- **Evaluation**: Run automated evaluations on your AI outputs with built-in and custom scorers. +- **Experiments**: Compare different model configurations and prompt variations. +- **Datasets**: Curate golden datasets from your production traces. +- **Prompt Management**: Version and deploy prompts with A/B testing capabilities. +- **Human Review**: Set up review queues for team-based quality analysis. + +For additional help, see the [Braintrust documentation](https://www.braintrust.dev/docs). diff --git a/docs/source/run-workflows/observe/observe.md b/docs/source/run-workflows/observe/observe.md index 5ff89a0773..f025853637 100644 --- a/docs/source/run-workflows/observe/observe.md +++ b/docs/source/run-workflows/observe/observe.md @@ -67,6 +67,7 @@ The following table lists each exporter with its supported features and configur | Provider | Integration Documentation | Supported Features | | -------- | ------------------------- | ------------------ | +| [Braintrust](https://www.braintrust.dev/) | [Observing with Braintrust](?provider=Braintrust#provider-integration-guides){.external} | Logging, Tracing, Evaluation, Human Review | | [Catalyst](https://catalyst.raga.ai/) | [Observing with Catalyst](?provider=Catalyst#provider-integration-guides){.external} | Logging, Tracing | | [NVIDIA Data Flywheel Blueprint](https://build.nvidia.com/nvidia/build-an-enterprise-data-flywheel) | [Observing with Data Flywheel](?provider=Data-Flywheel#provider-integration-guides){.external} | Logging, Tracing | | [DBNL](https://distributional.com/) | [Observing with DBNL](?provider=DBNL#provider-integration-guides){.external} | Logging, Tracing | @@ -189,6 +190,13 @@ For complete information 
about developing and integrating custom telemetry expor ::::{tab-set} :sync-group: provider + :::{tab-item} Braintrust + :sync: Braintrust + + :::{include} ./observe-workflow-with-braintrust.md + + ::: + :::{tab-item} Catalyst :sync: Catalyst diff --git a/examples/observability/simple_calculator_observability/configs/config-braintrust.yml b/examples/observability/simple_calculator_observability/configs/config-braintrust.yml new file mode 100644 index 0000000000..f3d5cb9ed0 --- /dev/null +++ b/examples/observability/simple_calculator_observability/configs/config-braintrust.yml @@ -0,0 +1,47 @@ +# SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +general: + telemetry: + logging: + console: + _type: console + level: WARN + tracing: + braintrust: + _type: braintrust + project: nat-calculator + +function_groups: + calculator: + _type: calculator + +functions: + current_datetime: + _type: current_datetime + +llms: + openai_llm: + _type: openai + model_name: gpt-4o-mini + temperature: 0.0 + +workflow: + _type: react_agent + tool_names: [calculator, current_datetime] + llm_name: openai_llm + verbose: true + parse_agent_response_max_retries: 3 diff --git a/packages/nvidia_nat_opentelemetry/src/nat/plugins/opentelemetry/register.py b/packages/nvidia_nat_opentelemetry/src/nat/plugins/opentelemetry/register.py index c385cf6d15..9e06b47253 100644 --- a/packages/nvidia_nat_opentelemetry/src/nat/plugins/opentelemetry/register.py +++ b/packages/nvidia_nat_opentelemetry/src/nat/plugins/opentelemetry/register.py @@ -13,6 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. +import json import logging import os @@ -242,3 +243,173 @@ async def dbnl_telemetry_exporter(config: DBNLTelemetryExporter, builder: Builde drop_on_overflow=config.drop_on_overflow, shutdown_timeout=config.shutdown_timeout, ) + + +class BraintrustTelemetryExporter(BatchConfigMixin, CollectorConfigMixin, TelemetryExporterBaseConfig, name="braintrust"): + """A telemetry exporter to transmit traces to Braintrust for AI observability and evaluation.""" + + endpoint: str = Field( + description="The Braintrust OTEL endpoint", + default="https://api.braintrust.dev/otel/v1/traces", + ) + api_key: SerializableSecretStr = Field(description="The Braintrust API key", + default_factory=lambda: SerializableSecretStr("")) + resource_attributes: dict[str, str] = Field(default_factory=dict, + description="The resource attributes to add to the span") + + +# Attribute mappings from OpenInference to Braintrust GenAI semantic conventions +BRAINTRUST_ATTRIBUTE_MAPPINGS = { + "input.value": "gen_ai.prompt", + 
"output.value": "gen_ai.completion", + "llm.token_count.prompt": "gen_ai.usage.prompt_tokens", + "llm.token_count.completion": "gen_ai.usage.completion_tokens", + "llm.token_count.total": "gen_ai.usage.total_tokens", + "llm.model_name": "gen_ai.request.model", +} + +# Attributes to remove after mapping (redundant - captured elsewhere in Braintrust schema) +# These are removed to reduce metadata clutter while preserving all information: +# - input.value/output.value -> extracted to Braintrust input/output fields via gen_ai.* +# - MIME types -> not needed for Braintrust display +# - nat.event_timestamp -> captured in metrics.start/metrics.end +# - nat.span.kind/openinference.span.kind -> mapped to span_attributes.type +# - llm.token_count.* -> mapped to gen_ai.usage.* +# - nat.metadata.mime_type -> not needed if nat.metadata exists +BRAINTRUST_REDUNDANT_ATTRIBUTES = { + "input.value", + "output.value", + "input.mime_type", + "output.mime_type", + "nat.event_timestamp", + "nat.span.kind", + "openinference.span.kind", + "llm.token_count.prompt", + "llm.token_count.completion", + "llm.token_count.total", + "llm.model_name", + "nat.metadata.mime_type", +} + +# Map OpenInference span kinds to Braintrust span types +# See: https://www.braintrust.dev/docs/reference/span-types +OPENINFERENCE_TO_BRAINTRUST_TYPE = { + "LLM": "llm", + "CHAIN": "task", + "TOOL": "tool", + "AGENT": "task", + "EMBEDDING": "task", + "RETRIEVER": "task", + "RERANKER": "task", + "GUARDRAIL": "task", + "EVALUATOR": "score", + "UNKNOWN": "task", +} + + +def _get_improved_span_name(span_name: str, attrs: dict) -> str: + """Generate an improved span name for better display in Braintrust UI. + + Args: + span_name: The original span name. + attrs: The span attributes dictionary. + + Returns: + An improved span name for display. 
+ """ + # Handle the generic name + if span_name == "": + # Try to get a more descriptive name from attributes + function_name = attrs.get("nat.function.name") + if function_name and function_name != "" and function_name != "root": + return function_name + + # Use event type to create a descriptive name + event_type = attrs.get("nat.event_type", "") + if "WORKFLOW" in event_type: + return "Workflow" + elif "FUNCTION" in event_type: + return "Function" + elif "AGENT" in event_type: + return "Agent" + + # Fall back to span kind if available + span_kind = attrs.get("nat.span.kind") or attrs.get("openinference.span.kind") + if span_kind: + return span_kind.replace("_", " ").title() + + return "Workflow" + + return span_name + + +def _transform_span_attributes_for_braintrust(span) -> None: + """Transform span attributes from OpenInference to Braintrust GenAI conventions. + + This modifies the span's attributes in-place to map OpenInference semantic + conventions to Braintrust's expected GenAI semantic conventions, including + proper span type classification and improved span naming. + + Args: + span: The OtelSpan to transform. + """ + if not hasattr(span, '_attributes') or span._attributes is None: + return + + attrs = span._attributes + + # Improve span name for better display in Braintrust UI + if hasattr(span, '_name') and span._name: + span._name = _get_improved_span_name(span._name, attrs) + + # Map OpenInference attribute names to Braintrust GenAI conventions + for old_key, new_key in BRAINTRUST_ATTRIBUTE_MAPPINGS.items(): + if old_key in attrs: + attrs[new_key] = attrs[old_key] + + # Map OpenInference span kind to Braintrust span type + # This ensures proper categorization of spans (llm, tool, task, etc.) 
+ openinference_kind = attrs.get("openinference.span.kind") + if openinference_kind: + bt_type = OPENINFERENCE_TO_BRAINTRUST_TYPE.get(openinference_kind, "task") + attrs["braintrust.span_attributes"] = json.dumps({"type": bt_type}) + + # Remove redundant attributes to reduce metadata clutter + # These are captured elsewhere in Braintrust schema (input/output fields, metrics, span_attributes) + for key in BRAINTRUST_REDUNDANT_ATTRIBUTES: + attrs.pop(key, None) + + +@register_telemetry_exporter(config_type=BraintrustTelemetryExporter) +async def braintrust_telemetry_exporter(config: BraintrustTelemetryExporter, builder: Builder): + """Create a Braintrust telemetry exporter.""" + + from nat.plugins.opentelemetry import OTLPSpanAdapterExporter + from nat.plugins.opentelemetry.otel_span import OtelSpan + + api_key = get_secret_value(config.api_key) if config.api_key else os.environ.get("BRAINTRUST_API_KEY") + if not api_key: + raise ValueError("API key is required for Braintrust") + + headers = { + "Authorization": f"Bearer {api_key}", + "x-bt-parent": f"project_name:{config.project}", + } + + class BraintrustOTLPSpanAdapterExporter(OTLPSpanAdapterExporter): + + async def export_otel_spans(self, spans: list[OtelSpan]) -> None: + for span in spans: + _transform_span_attributes_for_braintrust(span) + await super().export_otel_spans(spans) + + yield BraintrustOTLPSpanAdapterExporter( + endpoint=config.endpoint, + headers=headers, + resource_attributes=config.resource_attributes, + batch_size=config.batch_size, + flush_interval=config.flush_interval, + max_queue_size=config.max_queue_size, + drop_on_overflow=config.drop_on_overflow, + shutdown_timeout=config.shutdown_timeout, + ) diff --git a/packages/nvidia_nat_opentelemetry/tests/observability/test_braintrust_exporter.py b/packages/nvidia_nat_opentelemetry/tests/observability/test_braintrust_exporter.py new file mode 100644 index 0000000000..4970853b29 --- /dev/null +++ 
b/packages/nvidia_nat_opentelemetry/tests/observability/test_braintrust_exporter.py @@ -0,0 +1,486 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Tests for Braintrust telemetry exporter integration.""" + +import json +import os +from unittest.mock import AsyncMock +from unittest.mock import Mock +from unittest.mock import patch + +import pytest + +from nat.plugins.opentelemetry.register import ( + BRAINTRUST_ATTRIBUTE_MAPPINGS, + BRAINTRUST_REDUNDANT_ATTRIBUTES, + OPENINFERENCE_TO_BRAINTRUST_TYPE, + BraintrustTelemetryExporter, + _get_improved_span_name, + _transform_span_attributes_for_braintrust, + braintrust_telemetry_exporter, +) + + +class TestBraintrustAttributeMappings: + """Test attribute mapping constants.""" + + def test_attribute_mappings_exist(self): + """Test that all expected attribute mappings are defined.""" + expected_mappings = { + "input.value": "gen_ai.prompt", + "output.value": "gen_ai.completion", + "llm.token_count.prompt": "gen_ai.usage.prompt_tokens", + "llm.token_count.completion": "gen_ai.usage.completion_tokens", + "llm.token_count.total": "gen_ai.usage.total_tokens", + "llm.model_name": "gen_ai.request.model", + } + assert BRAINTRUST_ATTRIBUTE_MAPPINGS == expected_mappings + + def test_span_type_mappings_exist(self): + """Test that all expected span type mappings are defined.""" + 
expected_types = { + "LLM": "llm", + "CHAIN": "task", + "TOOL": "tool", + "AGENT": "task", + "EMBEDDING": "task", + "RETRIEVER": "task", + "RERANKER": "task", + "GUARDRAIL": "task", + "EVALUATOR": "score", + "UNKNOWN": "task", + } + assert OPENINFERENCE_TO_BRAINTRUST_TYPE == expected_types + + def test_redundant_attributes_defined(self): + """Test that redundant attributes to remove are defined.""" + expected_redundant = { + "input.value", + "output.value", + "input.mime_type", + "output.mime_type", + "nat.event_timestamp", + "nat.span.kind", + "openinference.span.kind", + "llm.token_count.prompt", + "llm.token_count.completion", + "llm.token_count.total", + "llm.model_name", + "nat.metadata.mime_type", + } + assert BRAINTRUST_REDUNDANT_ATTRIBUTES == expected_redundant + + +class TestGetImprovedSpanName: + """Test the _get_improved_span_name function.""" + + def test_workflow_span_returns_workflow(self): + """Test that span with WORKFLOW event type returns 'Workflow'.""" + attrs = {"nat.event_type": "WORKFLOW_START"} + result = _get_improved_span_name("", attrs) + assert result == "Workflow" + + def test_workflow_span_with_function_event_returns_function(self): + """Test that span with FUNCTION event type returns 'Function'.""" + attrs = {"nat.event_type": "FUNCTION_START"} + result = _get_improved_span_name("", attrs) + assert result == "Function" + + def test_workflow_span_with_agent_event_returns_agent(self): + """Test that span with AGENT event type returns 'Agent'.""" + attrs = {"nat.event_type": "AGENT_START"} + result = _get_improved_span_name("", attrs) + assert result == "Agent" + + def test_workflow_span_with_function_name_uses_function_name(self): + """Test that span with valid function name uses that name.""" + attrs = {"nat.function.name": "my_custom_function", "nat.event_type": "WORKFLOW_START"} + result = _get_improved_span_name("", attrs) + assert result == "my_custom_function" + + def test_workflow_span_ignores_root_function_name(self): + """Test 
that span ignores 'root' function name.""" + attrs = {"nat.function.name": "root", "nat.event_type": "WORKFLOW_START"} + result = _get_improved_span_name("", attrs) + assert result == "Workflow" + + def test_workflow_span_ignores_workflow_function_name(self): + """Test that span ignores '' function name.""" + attrs = {"nat.function.name": "", "nat.event_type": "FUNCTION_START"} + result = _get_improved_span_name("", attrs) + assert result == "Function" + + def test_workflow_span_with_span_kind_fallback(self): + """Test that span falls back to span kind.""" + attrs = {"nat.span.kind": "CUSTOM_TYPE"} + result = _get_improved_span_name("", attrs) + assert result == "Custom Type" + + def test_workflow_span_with_openinference_kind_fallback(self): + """Test that span falls back to openinference span kind.""" + attrs = {"openinference.span.kind": "RETRIEVER"} + result = _get_improved_span_name("", attrs) + assert result == "Retriever" + + def test_workflow_span_default_fallback(self): + """Test that span defaults to 'Workflow'.""" + attrs = {} + result = _get_improved_span_name("", attrs) + assert result == "Workflow" + + def test_non_workflow_span_unchanged(self): + """Test that non-workflow span names are unchanged.""" + attrs = {"nat.event_type": "LLM_START"} + result = _get_improved_span_name("gpt-4o-mini", attrs) + assert result == "gpt-4o-mini" + + def test_tool_span_unchanged(self): + """Test that tool span names are unchanged.""" + attrs = {"nat.event_type": "TOOL_START"} + result = _get_improved_span_name("calculator.add", attrs) + assert result == "calculator.add" + + +class TestTransformSpanAttributesForBraintrust: + """Test the _transform_span_attributes_for_braintrust function.""" + + def test_transforms_input_value(self): + """Test that input.value is mapped to gen_ai.prompt and then removed.""" + span = Mock() + span._name = "test_span" + span._attributes = {"input.value": "Hello, world!"} + + _transform_span_attributes_for_braintrust(span) + + assert 
span._attributes["gen_ai.prompt"] == "Hello, world!" + assert "input.value" not in span._attributes # Redundant attribute removed + + def test_transforms_output_value(self): + """Test that output.value is mapped to gen_ai.completion and then removed.""" + span = Mock() + span._name = "test_span" + span._attributes = {"output.value": "Response text"} + + _transform_span_attributes_for_braintrust(span) + + assert span._attributes["gen_ai.completion"] == "Response text" + assert "output.value" not in span._attributes # Redundant attribute removed + + def test_transforms_token_counts(self): + """Test that token counts are mapped correctly.""" + span = Mock() + span._name = "test_span" + span._attributes = { + "llm.token_count.prompt": 100, + "llm.token_count.completion": 50, + "llm.token_count.total": 150, + } + + _transform_span_attributes_for_braintrust(span) + + assert span._attributes["gen_ai.usage.prompt_tokens"] == 100 + assert span._attributes["gen_ai.usage.completion_tokens"] == 50 + assert span._attributes["gen_ai.usage.total_tokens"] == 150 + + def test_transforms_model_name(self): + """Test that llm.model_name is mapped to gen_ai.request.model and then removed.""" + span = Mock() + span._name = "gpt-4o-mini" + span._attributes = {"llm.model_name": "gpt-4o-mini"} + + _transform_span_attributes_for_braintrust(span) + + assert span._attributes["gen_ai.request.model"] == "gpt-4o-mini" + assert "llm.model_name" not in span._attributes + + def test_sets_braintrust_span_type_for_llm(self): + """Test that LLM span kind maps to llm type.""" + span = Mock() + span._name = "gpt-4o" + span._attributes = {"openinference.span.kind": "LLM"} + + _transform_span_attributes_for_braintrust(span) + + span_attrs = json.loads(span._attributes["braintrust.span_attributes"]) + assert span_attrs["type"] == "llm" + + def test_sets_braintrust_span_type_for_tool(self): + """Test that TOOL span kind maps to tool type.""" + span = Mock() + span._name = "calculator" + span._attributes = 
{"openinference.span.kind": "TOOL"} + + _transform_span_attributes_for_braintrust(span) + + span_attrs = json.loads(span._attributes["braintrust.span_attributes"]) + assert span_attrs["type"] == "tool" + + def test_sets_braintrust_span_type_for_chain(self): + """Test that CHAIN span kind maps to task type.""" + span = Mock() + span._name = "workflow" + span._attributes = {"openinference.span.kind": "CHAIN"} + + _transform_span_attributes_for_braintrust(span) + + span_attrs = json.loads(span._attributes["braintrust.span_attributes"]) + assert span_attrs["type"] == "task" + + def test_sets_braintrust_span_type_for_evaluator(self): + """Test that EVALUATOR span kind maps to score type.""" + span = Mock() + span._name = "evaluator" + span._attributes = {"openinference.span.kind": "EVALUATOR"} + + _transform_span_attributes_for_braintrust(span) + + span_attrs = json.loads(span._attributes["braintrust.span_attributes"]) + assert span_attrs["type"] == "score" + + def test_unknown_span_kind_defaults_to_task(self): + """Test that unknown span kind defaults to task type.""" + span = Mock() + span._name = "unknown" + span._attributes = {"openinference.span.kind": "SOME_NEW_TYPE"} + + _transform_span_attributes_for_braintrust(span) + + span_attrs = json.loads(span._attributes["braintrust.span_attributes"]) + assert span_attrs["type"] == "task" + + def test_improves_workflow_span_name(self): + """Test that span name is improved.""" + span = Mock() + span._name = "" + span._attributes = {"nat.event_type": "WORKFLOW_START"} + + _transform_span_attributes_for_braintrust(span) + + assert span._name == "Workflow" + + def test_removes_all_redundant_attributes(self): + """Test that all redundant attributes are removed after transformation.""" + span = Mock() + span._name = "test_span" + span._attributes = { + "input.value": "Hello", + "output.value": "World", + "input.mime_type": "text/plain", + "output.mime_type": "text/plain", + "nat.event_timestamp": 1234567890, + "nat.span.kind": 
"WORKFLOW", + "openinference.span.kind": "CHAIN", + "llm.token_count.prompt": 100, + "llm.token_count.completion": 50, + "llm.token_count.total": 150, + "nat.metadata.mime_type": "application/json", + "nat.workflow.run_id": "abc123", # Should be kept + "nat.function.name": "my_func", # Should be kept + } + + _transform_span_attributes_for_braintrust(span) + + # Verify redundant attributes are removed + for attr in BRAINTRUST_REDUNDANT_ATTRIBUTES: + assert attr not in span._attributes, f"{attr} should have been removed" + + # Verify mapped attributes exist + assert span._attributes["gen_ai.prompt"] == "Hello" + assert span._attributes["gen_ai.completion"] == "World" + assert span._attributes["gen_ai.usage.prompt_tokens"] == 100 + assert span._attributes["gen_ai.usage.completion_tokens"] == 50 + assert span._attributes["gen_ai.usage.total_tokens"] == 150 + + # Verify non-redundant NAT attributes are preserved + assert span._attributes["nat.workflow.run_id"] == "abc123" + assert span._attributes["nat.function.name"] == "my_func" + + def test_handles_missing_attributes(self): + """Test that function handles span with no attributes.""" + span = Mock() + span._attributes = None + + # Should not raise + _transform_span_attributes_for_braintrust(span) + + def test_handles_span_without_attributes_attr(self): + """Test that function handles span without _attributes.""" + span = object() # No _attributes attribute + + # Should not raise + _transform_span_attributes_for_braintrust(span) + + +class TestBraintrustTelemetryExporterConfig: + """Test BraintrustTelemetryExporter configuration.""" + + def test_default_endpoint(self): + """Test that default endpoint is set correctly.""" + config = BraintrustTelemetryExporter(project="test-project") + assert config.endpoint == "https://api.braintrust.dev/otel/v1/traces" + + def test_custom_endpoint(self): + """Test that custom endpoint can be set.""" + config = BraintrustTelemetryExporter( + project="test-project", + 
endpoint="https://custom.endpoint.com/traces" + ) + assert config.endpoint == "https://custom.endpoint.com/traces" + + def test_project_required(self): + """Test that project field is set.""" + config = BraintrustTelemetryExporter(project="my-project") + assert config.project == "my-project" + + def test_api_key_can_be_set(self): + """Test that API key can be provided in config.""" + config = BraintrustTelemetryExporter( + project="test-project", + api_key="test-api-key" + ) + assert config.api_key.get_secret_value() == "test-api-key" + + def test_resource_attributes_default_empty(self): + """Test that resource_attributes defaults to empty dict.""" + config = BraintrustTelemetryExporter(project="test-project") + assert config.resource_attributes == {} + + def test_resource_attributes_can_be_set(self): + """Test that resource_attributes can be set.""" + config = BraintrustTelemetryExporter( + project="test-project", + resource_attributes={"service.name": "my-service"} + ) + assert config.resource_attributes == {"service.name": "my-service"} + + +class TestBraintrustTelemetryExporterFactory: + """Test the braintrust_telemetry_exporter async factory function.""" + + @pytest.fixture + def mock_builder(self): + """Create a mock Builder.""" + return Mock() + + @pytest.fixture + def config_with_key(self): + """Config with API key set directly.""" + return BraintrustTelemetryExporter( + project="test-project", + api_key="test-api-key-123", + ) + + @pytest.fixture + def config_without_key(self): + """Config without API key.""" + return BraintrustTelemetryExporter( + project="test-project", + ) + + @patch("nat.plugins.opentelemetry.mixin.otlp_span_exporter_mixin.OTLPSpanExporterHTTP") + async def test_factory_creates_exporter_with_config_api_key( + self, mock_otlp_http, config_with_key, mock_builder + ): + """Test that the factory creates an exporter using the config API key.""" + async with braintrust_telemetry_exporter(config_with_key, mock_builder) as exporter: + pass + + 
call_kwargs = mock_otlp_http.call_args[1] + assert call_kwargs["headers"]["Authorization"] == "Bearer test-api-key-123" + + @patch("nat.plugins.opentelemetry.mixin.otlp_span_exporter_mixin.OTLPSpanExporterHTTP") + async def test_factory_creates_exporter_with_correct_project_header( + self, mock_otlp_http, config_with_key, mock_builder + ): + """Test that the factory sets the x-bt-parent header with project name.""" + async with braintrust_telemetry_exporter(config_with_key, mock_builder) as exporter: + pass + + call_kwargs = mock_otlp_http.call_args[1] + assert call_kwargs["headers"]["x-bt-parent"] == "project_name:test-project" + + @patch.dict(os.environ, {"BRAINTRUST_API_KEY": "env-api-key-456"}) + @patch("nat.plugins.opentelemetry.mixin.otlp_span_exporter_mixin.OTLPSpanExporterHTTP") + async def test_factory_falls_back_to_env_var( + self, mock_otlp_http, config_without_key, mock_builder + ): + """Test that the factory falls back to BRAINTRUST_API_KEY env var.""" + async with braintrust_telemetry_exporter(config_without_key, mock_builder) as exporter: + pass + + call_kwargs = mock_otlp_http.call_args[1] + assert call_kwargs["headers"]["Authorization"] == "Bearer env-api-key-456" + + @patch.dict(os.environ, {}, clear=True) + async def test_factory_raises_without_api_key(self, config_without_key, mock_builder): + """Test that the factory raises ValueError when no API key is available.""" + with pytest.raises(ValueError, match="API key is required for Braintrust"): + async with braintrust_telemetry_exporter(config_without_key, mock_builder) as exporter: + pass + + @patch("nat.plugins.opentelemetry.mixin.otlp_span_exporter_mixin.OTLPSpanExporterHTTP") + async def test_factory_passes_batch_config(self, mock_otlp_http, mock_builder): + """Test that batch configuration is passed through to the exporter.""" + config = BraintrustTelemetryExporter( + project="test-project", + api_key="test-key", + batch_size=100, + flush_interval=10.0, + max_queue_size=500, + ) + + async 
with braintrust_telemetry_exporter(config, mock_builder) as exporter: + assert exporter is not None + assert type(exporter).__name__ == "BraintrustOTLPSpanAdapterExporter" + + @patch("nat.plugins.opentelemetry.mixin.otlp_span_exporter_mixin.OTLPSpanExporterHTTP") + async def test_factory_passes_endpoint(self, mock_otlp_http, mock_builder): + """Test that custom endpoint is passed through to the exporter.""" + config = BraintrustTelemetryExporter( + project="test-project", + api_key="test-key", + endpoint="https://custom.braintrust.dev/otel/v1/traces", + ) + + async with braintrust_telemetry_exporter(config, mock_builder) as exporter: + pass + + call_kwargs = mock_otlp_http.call_args[1] + assert call_kwargs["endpoint"] == "https://custom.braintrust.dev/otel/v1/traces" + + @patch("nat.plugins.opentelemetry.mixin.otlp_span_exporter_mixin.OTLPSpanExporterHTTP") + async def test_factory_yields_braintrust_subclass(self, mock_otlp_http, config_with_key, mock_builder): + """Test that the factory yields a BraintrustOTLPSpanAdapterExporter subclass.""" + async with braintrust_telemetry_exporter(config_with_key, mock_builder) as exporter: + assert type(exporter).__name__ == "BraintrustOTLPSpanAdapterExporter" + + @patch("nat.plugins.opentelemetry.mixin.otlp_span_exporter_mixin.OTLPSpanExporterHTTP") + async def test_subclass_transforms_spans_before_export(self, mock_otlp_http, config_with_key, mock_builder): + """Test that the subclass transforms span attributes before calling super().export_otel_spans().""" + async with braintrust_telemetry_exporter(config_with_key, mock_builder) as exporter: + span = Mock() + span._name = "" + span._attributes = { + "input.value": "Hello", + "openinference.span.kind": "LLM", + "nat.event_type": "WORKFLOW_START", + } + span.set_resource = Mock() + + await exporter.export_otel_spans([span]) + + assert span._attributes.get("gen_ai.prompt") == "Hello" + assert "input.value" not in span._attributes
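The span-type classification exercised by the tests above can also be sketched in isolation (constants abridged from `OPENINFERENCE_TO_BRAINTRUST_TYPE` in `register.py`; a plain dict stands in for a span's attributes):

```python
import json

# Abridged copy of OPENINFERENCE_TO_BRAINTRUST_TYPE from register.py.
SPAN_TYPE_MAP = {"LLM": "llm", "TOOL": "tool", "CHAIN": "task", "EVALUATOR": "score"}

def classify(attrs: dict) -> dict:
    """Attach braintrust.span_attributes based on the OpenInference span kind,
    defaulting to "task" for unrecognized kinds, as the exporter does."""
    kind = attrs.get("openinference.span.kind")
    if kind:
        bt_type = SPAN_TYPE_MAP.get(kind, "task")
        attrs["braintrust.span_attributes"] = json.dumps({"type": bt_type})
    return attrs

print(classify({"openinference.span.kind": "LLM"})["braintrust.span_attributes"])
# {"type": "llm"}
```

Unrecognized kinds fall back to `"task"`, matching `test_unknown_span_kind_defaults_to_task`; spans with no OpenInference kind are left untouched.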