1 change: 1 addition & 0 deletions ci/vale/styles/config/vocabularies/nat/accept.txt
Original file line number Diff line number Diff line change
@@ -20,6 +20,7 @@ Authlib
[Bb]ackpressure
[Bb]atcher
[Bb]oolean
Braintrust
[Bb]rev
[Cc]allable(s?)
# Documentation for ccache only capitalizes the name at the start of a sentence https://ccache.dev/
Binary file added docs/source/_static/braintrust-trace.png
126 changes: 126 additions & 0 deletions docs/source/run-workflows/observe/observe-workflow-with-braintrust.md
@@ -0,0 +1,126 @@
<!--
SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Observing a Workflow with Braintrust

This guide provides a step-by-step process to enable observability in a NeMo Agent Toolkit workflow using Braintrust for tracing. By the end of this guide, you will have:

- Configured telemetry in your workflow.
- The ability to view traces in the Braintrust platform.

## Step 1: Create a Braintrust Account

1. Visit [https://www.braintrust.dev](https://www.braintrust.dev) and sign up for an account.
2. Once logged in, navigate to your organization settings to generate an API key.

## Step 2: Create a Project

Create a new project in Braintrust to organize your traces:

1. Navigate to the Braintrust dashboard.
2. Click on **Projects** in the sidebar.
3. Click **+ New Project**.
4. Name your project (e.g., `nat-calculator`).
5. Note down the **Project Name** for configuration.

## Step 3: Configure Your Environment

Set the following environment variables in your terminal:

```bash
export BRAINTRUST_API_KEY=<your_api_key>
```

Alternatively, you can provide the API key directly in your workflow configuration.

## Step 4: Install the NeMo Agent Toolkit OpenTelemetry Subpackages

```bash
# From the root of the NeMo Agent Toolkit repository, install the
# telemetry extras required for Braintrust
uv pip install -e '.[opentelemetry]'
```

## Step 5: Modify NeMo Agent Toolkit Workflow Configuration

Update your workflow configuration file to include the telemetry settings.

Example configuration:
```yaml
general:
  telemetry:
    tracing:
      braintrust:
        _type: braintrust
        project: nat-calculator
        # Optional: Override the default endpoint for self-hosted deployments
        # endpoint: https://api.braintrust.dev/otel/v1/traces
```

You can also specify the API key directly in the configuration:
```yaml
general:
  telemetry:
    tracing:
      braintrust:
        _type: braintrust
        project: nat-calculator
        api_key: ${BRAINTRUST_API_KEY}
```
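Under the hood, the exporter authenticates each OTLP request with the API key and routes spans to the named project through HTTP headers. A minimal sketch of that header construction, mirroring the header names used by the exporter added in this PR (the `build_braintrust_headers` helper itself is illustrative):

```python
def build_braintrust_headers(api_key: str, project: str) -> dict[str, str]:
    # The Bearer token authenticates the OTLP request; the x-bt-parent
    # header routes incoming spans into the named Braintrust project.
    return {
        "Authorization": f"Bearer {api_key}",
        "x-bt-parent": f"project_name:{project}",
    }

headers = build_braintrust_headers("sk-example", "nat-calculator")
# headers["x-bt-parent"] is "project_name:nat-calculator"
```

This is why only an API key and a project name are required in the configuration: everything else needed to route traces is derived from those two values.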

## Step 6: Run the Workflow

From the root directory of the NeMo Agent Toolkit library, install dependencies and run the pre-configured `simple_calculator_observability` example.

**Example:**

```bash
# Install the workflow and plugins
uv pip install -e examples/observability/simple_calculator_observability/

# Run the workflow with Braintrust telemetry settings
nat run --config_file examples/observability/simple_calculator_observability/configs/config-braintrust.yml --input "What is 1*2?"
```

As the workflow runs, telemetry data will start showing up in Braintrust.
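Before export, the exporter added in this PR also remaps OpenInference span attributes to Braintrust's GenAI semantic conventions so that prompts, completions, and model names land in the expected fields. A simplified sketch of that remapping (a subset of the real mapping table; the `remap` helper is illustrative, and the actual exporter mutates spans in place):

```python
# Subset of the OpenInference -> GenAI attribute mapping used by the exporter.
MAPPINGS = {
    "input.value": "gen_ai.prompt",
    "output.value": "gen_ai.completion",
    "llm.model_name": "gen_ai.request.model",
}

def remap(attrs: dict) -> dict:
    """Copy each mapped attribute to its GenAI key, dropping the original."""
    out = dict(attrs)
    for old_key, new_key in MAPPINGS.items():
        if old_key in out:
            out[new_key] = out.pop(old_key)
    return out

remap({"input.value": "What is 1*2?", "llm.model_name": "gpt-4o-mini"})
# {'gen_ai.prompt': 'What is 1*2?', 'gen_ai.request.model': 'gpt-4o-mini'}
```

This remapping is what makes inputs, outputs, and token counts appear as first-class fields in the Braintrust trace view rather than as opaque metadata.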

## Step 7: Analyze Trace Data in Braintrust

Analyze the traces in Braintrust:

1. Navigate to [https://www.braintrust.dev](https://www.braintrust.dev) and log in.
2. Go to **Projects** > **nat-calculator** (or your project name).
3. Click on **Logs** to view your traces.
4. Select any trace to view detailed span information, inputs, outputs, and timing data.

```{figure} /_static/braintrust-trace.png
:alt: Braintrust Trace View
:align: center

Example trace view in Braintrust showing workflow spans, inputs, outputs, and timing data.
```

## Additional Features

Braintrust provides additional observability features beyond basic tracing:

- **Evaluation**: Run automated evaluations on your AI outputs with built-in and custom scorers.
- **Experiments**: Compare different model configurations and prompt variations.
- **Datasets**: Curate golden datasets from your production traces.
- **Prompt Management**: Version and deploy prompts with A/B testing capabilities.
- **Human Review**: Set up review queues for team-based quality analysis.

For additional help, see the [Braintrust documentation](https://www.braintrust.dev/docs).
8 changes: 8 additions & 0 deletions docs/source/run-workflows/observe/observe.md
@@ -67,6 +67,7 @@ The following table lists each exporter with its supported features and configur

| Provider | Integration Documentation | Supported Features |
| -------- | ------------------------- | ------------------ |
| [Braintrust](https://www.braintrust.dev/) | [Observing with Braintrust](?provider=Braintrust#provider-integration-guides){.external} | Logging, Tracing, Evaluation, Human Review |
| [Catalyst](https://catalyst.raga.ai/) | [Observing with Catalyst](?provider=Catalyst#provider-integration-guides){.external} | Logging, Tracing |
| [NVIDIA Data Flywheel Blueprint](https://build.nvidia.com/nvidia/build-an-enterprise-data-flywheel) | [Observing with Data Flywheel](?provider=Data-Flywheel#provider-integration-guides){.external} | Logging, Tracing |
| [DBNL](https://distributional.com/) | [Observing with DBNL](?provider=DBNL#provider-integration-guides){.external} | Logging, Tracing |
@@ -189,6 +190,13 @@ For complete information about developing and integrating custom telemetry expor
::::{tab-set}
:sync-group: provider

:::{tab-item} Braintrust
:sync: Braintrust

:::{include} ./observe-workflow-with-braintrust.md

:::

:::{tab-item} Catalyst
:sync: Catalyst

@@ -0,0 +1,47 @@
# SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


general:
  telemetry:
    logging:
      console:
        _type: console
        level: WARN
    tracing:
      braintrust:
        _type: braintrust
        project: nat-calculator

function_groups:
  calculator:
    _type: calculator

functions:
  current_datetime:
    _type: current_datetime

llms:
  openai_llm:
    _type: openai
    model_name: gpt-4o-mini
    temperature: 0.0

workflow:
  _type: react_agent
  tool_names: [calculator, current_datetime]
  llm_name: openai_llm
  verbose: true
  parse_agent_response_max_retries: 3
@@ -13,6 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

import json
import logging
import os

@@ -242,3 +243,173 @@ async def dbnl_telemetry_exporter(config: DBNLTelemetryExporter, builder: Builde
drop_on_overflow=config.drop_on_overflow,
shutdown_timeout=config.shutdown_timeout,
)


class BraintrustTelemetryExporter(BatchConfigMixin, CollectorConfigMixin, TelemetryExporterBaseConfig, name="braintrust"):
    """A telemetry exporter to transmit traces to Braintrust for AI observability and evaluation."""

    endpoint: str = Field(
        description="The Braintrust OTEL endpoint",
        default="https://api.braintrust.dev/otel/v1/traces",
    )
    api_key: SerializableSecretStr = Field(description="The Braintrust API key",
                                           default_factory=lambda: SerializableSecretStr(""))
    resource_attributes: dict[str, str] = Field(default_factory=dict,
                                                description="The resource attributes to add to the span")


# Attribute mappings from OpenInference to Braintrust GenAI semantic conventions
BRAINTRUST_ATTRIBUTE_MAPPINGS = {
    "input.value": "gen_ai.prompt",
    "output.value": "gen_ai.completion",
    "llm.token_count.prompt": "gen_ai.usage.prompt_tokens",
    "llm.token_count.completion": "gen_ai.usage.completion_tokens",
    "llm.token_count.total": "gen_ai.usage.total_tokens",
    "llm.model_name": "gen_ai.request.model",
}

# Attributes to remove after mapping (redundant - captured elsewhere in Braintrust schema)
# These are removed to reduce metadata clutter while preserving all information:
# - input.value/output.value -> extracted to Braintrust input/output fields via gen_ai.*
# - MIME types -> not needed for Braintrust display
# - nat.event_timestamp -> captured in metrics.start/metrics.end
# - nat.span.kind/openinference.span.kind -> mapped to span_attributes.type
# - llm.token_count.* -> mapped to gen_ai.usage.*
# - nat.metadata.mime_type -> not needed if nat.metadata exists
BRAINTRUST_REDUNDANT_ATTRIBUTES = {
    "input.value",
    "output.value",
    "input.mime_type",
    "output.mime_type",
    "nat.event_timestamp",
    "nat.span.kind",
    "openinference.span.kind",
    "llm.token_count.prompt",
    "llm.token_count.completion",
    "llm.token_count.total",
    "llm.model_name",
    "nat.metadata.mime_type",
}

# Map OpenInference span kinds to Braintrust span types
# See: https://www.braintrust.dev/docs/reference/span-types
OPENINFERENCE_TO_BRAINTRUST_TYPE = {
    "LLM": "llm",
    "CHAIN": "task",
    "TOOL": "tool",
    "AGENT": "task",
    "EMBEDDING": "task",
    "RETRIEVER": "task",
    "RERANKER": "task",
    "GUARDRAIL": "task",
    "EVALUATOR": "score",
    "UNKNOWN": "task",
}


def _get_improved_span_name(span_name: str, attrs: dict) -> str:
    """Generate an improved span name for better display in the Braintrust UI.

    Args:
        span_name: The original span name.
        attrs: The span attributes dictionary.

    Returns:
        An improved span name for display.
    """
    # Handle the generic <workflow> name
    if span_name == "<workflow>":
        # Try to get a more descriptive name from attributes
        function_name = attrs.get("nat.function.name")
        if function_name and function_name != "<workflow>" and function_name != "root":
            return function_name

        # Use the event type to create a descriptive name
        event_type = attrs.get("nat.event_type", "")
        if "WORKFLOW" in event_type:
            return "Workflow"
        elif "FUNCTION" in event_type:
            return "Function"
        elif "AGENT" in event_type:
            return "Agent"

        # Fall back to the span kind if available
        span_kind = attrs.get("nat.span.kind") or attrs.get("openinference.span.kind")
        if span_kind:
            return span_kind.replace("_", " ").title()

        return "Workflow"

    return span_name


def _transform_span_attributes_for_braintrust(span) -> None:
    """Transform span attributes from OpenInference to Braintrust GenAI conventions.

    This modifies the span's attributes in-place to map OpenInference semantic
    conventions to Braintrust's expected GenAI semantic conventions, including
    proper span type classification and improved span naming.

    Args:
        span: The OtelSpan to transform.
    """
    if not hasattr(span, '_attributes') or span._attributes is None:
        return

    attrs = span._attributes

    # Improve the span name for better display in the Braintrust UI
    if hasattr(span, '_name') and span._name:
        span._name = _get_improved_span_name(span._name, attrs)

    # Map OpenInference attribute names to Braintrust GenAI conventions
    for old_key, new_key in BRAINTRUST_ATTRIBUTE_MAPPINGS.items():
        if old_key in attrs:
            attrs[new_key] = attrs[old_key]

    # Map the OpenInference span kind to a Braintrust span type
    # This ensures proper categorization of spans (llm, tool, task, etc.)
    openinference_kind = attrs.get("openinference.span.kind")
    if openinference_kind:
        bt_type = OPENINFERENCE_TO_BRAINTRUST_TYPE.get(openinference_kind, "task")
        attrs["braintrust.span_attributes"] = json.dumps({"type": bt_type})

    # Remove redundant attributes to reduce metadata clutter
    # These are captured elsewhere in the Braintrust schema (input/output fields, metrics, span_attributes)
    for key in BRAINTRUST_REDUNDANT_ATTRIBUTES:
        attrs.pop(key, None)


@register_telemetry_exporter(config_type=BraintrustTelemetryExporter)
async def braintrust_telemetry_exporter(config: BraintrustTelemetryExporter, builder: Builder):
    """Create a Braintrust telemetry exporter."""

    from nat.plugins.opentelemetry import OTLPSpanAdapterExporter
    from nat.plugins.opentelemetry.otel_span import OtelSpan

    api_key = get_secret_value(config.api_key) if config.api_key else os.environ.get("BRAINTRUST_API_KEY")
    if not api_key:
        raise ValueError("API key is required for Braintrust")

    headers = {
        "Authorization": f"Bearer {api_key}",
        "x-bt-parent": f"project_name:{config.project}",
    }

    class BraintrustOTLPSpanAdapterExporter(OTLPSpanAdapterExporter):

        async def export_otel_spans(self, spans: list[OtelSpan]) -> None:
            for span in spans:
                _transform_span_attributes_for_braintrust(span)
            await super().export_otel_spans(spans)

    yield BraintrustOTLPSpanAdapterExporter(
        endpoint=config.endpoint,
        headers=headers,
        resource_attributes=config.resource_attributes,
        batch_size=config.batch_size,
        flush_interval=config.flush_interval,
        max_queue_size=config.max_queue_size,
        drop_on_overflow=config.drop_on_overflow,
        shutdown_timeout=config.shutdown_timeout,
    )