Commit e378841

docs: add tracemetrics dataset guidance and validate aggregate format (#636)
## Summary

- Add comprehensive documentation for the `tracemetrics` dataset to prevent future agents from using the wrong query format
- Add a `validateAggregateNames()` guard that rejects span-style aggregates when `--dataset tracemetrics` is used
- Document common dashboard widget mistakes in agent guidance

## Context

While building the CLI Performance dashboard, several mistakes were made with custom metrics widgets that could have been prevented with better docs and validation:

1. **Used `--dataset metrics` with MRI format** (`d:custom/name@unit`) instead of `--dataset tracemetrics` with comma-separated format (`aggregation(value,name,type,unit)`). SDK v10+ emits `trace_metric` items, not standalone custom metrics.
2. **Wrong MRI unit suffixes** (e.g., `@millisecond` when the SDK emits with no unit → `@none`). This caused "Internal Error" on every metrics widget.
3. **Missing `--limit` on grouped widgets.** The API rejects grouped queries without a limit.
4. **`--sort` referencing fields not in `--query`.** This caused 400 errors (e.g., sorting by `count()` after removing it from the aggregates).
5. **Tried to aggregate span attributes** (`avg:dsn.files_collected`). Span attributes are metadata, not aggregatable measurements.

## Changes

| File | Change |
|------|--------|
| `docs/src/content/docs/agent-guidance.md` | Dataset selection guide, tracemetrics format table with examples, common dashboard widget mistakes |
| `src/types/dashboard.ts` | `isTracemetricsAggregate()` validator + tracemetrics branch in `validateAggregateNames()` |
| `AGENTS.md` | Lore entry documenting tracemetrics gotchas |
| `plugins/sentry-cli/skills/sentry-cli/SKILL.md` | Auto-regenerated |

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent 5ea70f6 commit e378841
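
The `isTracemetricsAggregate()` validator named in the Changes section lives in `src/types/dashboard.ts`, which this view does not show. As a rough illustration only, a check for the `aggregation(value,metric_name,metric_type,unit)` shape might look like the sketch below. The function name `parseTracemetricsAggregate`, the `TracemetricsAggregate` interface, and the decision to require a literal `value` first argument are assumptions made for this example; the list of metric types is taken from the widget help text added in this commit.

```typescript
// Hypothetical sketch: NOT the validator from src/types/dashboard.ts.
interface TracemetricsAggregate {
  fn: string; // aggregation function, e.g. "p50"
  metricName: string; // e.g. "completion.duration_ms"
  metricType: string; // distribution | gauge | counter | set
  unit: string; // e.g. "none", "byte", "second"
}

// Metric types listed in the widget help text added by this commit.
const METRIC_TYPES = new Set(["distribution", "gauge", "counter", "set"]);

/** Parse `fn(value,name,type,unit)`; return null for any other shape. */
function parseTracemetricsAggregate(
  agg: string
): TracemetricsAggregate | null {
  const open = agg.indexOf("(");
  if (open <= 0 || !agg.endsWith(")")) {
    return null;
  }
  const fn = agg.slice(0, open);
  const args = agg
    .slice(open + 1, -1)
    .split(",")
    .map((a) => a.trim());
  // Assumption: exactly four arguments, the first being the literal "value".
  if (args.length !== 4 || args[0] !== "value") {
    return null;
  }
  const metricName = args[1] ?? "";
  const metricType = args[2] ?? "";
  const unit = args[3] ?? "";
  if (!metricName || !unit || !METRIC_TYPES.has(metricType)) {
    return null;
  }
  return { fn, metricName, metricType, unit };
}
```

Under these assumptions, `p50(value,completion.duration_ms,distribution,none)` parses, while a span-style `p95(span.duration)` or an MRI string such as `d:custom/name@millisecond` returns `null`.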

9 files changed, +259 -40 lines changed

AGENTS.md

Lines changed: 2 additions & 0 deletions
@@ -952,6 +952,8 @@ mock.module("./some-module", () => ({
 <!-- lore:019d3e8a-a4bb-7271-98cf-4cf418f2f581 -->
 * **CLI telemetry command tags use sentry. prefix with dots not bare names**: The \`buildCommand\` wrapper sets the \`command\` telemetry tag using the full Stricli command prefix joined with dots: \`sentry.issue.explain\`, \`sentry.issue.list\`, \`sentry.api\`, etc. — NOT bare names like \`issue.explain\`. When querying Sentry Discover or building dashboard widgets, always use the \`sentry.\` prefix. Verify actual tag values with a Discover query (\`field:command, count()\`, grouped by \`command\`) before assuming the format.

+* **Dashboard tracemetrics dataset uses comma-separated aggregate format**: SDK v10+ custom metrics (`Sentry.metrics.distribution()`, `.gauge()`, `.count()`) emit `trace_metric` envelope items. Dashboard widgets for these MUST use `--dataset tracemetrics` with aggregate format `aggregation(value,metric_name,metric_type,unit)` — e.g., `p50(value,completion.duration_ms,distribution,none)`. The `unit` parameter must match the SDK emission exactly: `none` if no unit specified, `byte` for memory metrics, `second` for uptime. `tracemetrics` only supports `line`, `area`, `bar`, `big_number`, `categorical_bar` display types — no `table` or `stacked_area`. Widgets with `--group-by` always require `--limit`. Sort expressions must reference aggregates present in `--query`.
+
 <!-- lore:019d0846-17bd-7ff3-a6d7-09b59b69a8fe -->
 * **Use toMatchObject not toEqual when testing resolution results with optional fields**: When \`resolveProjectBySlug()\` or \`resolveOrgProjectTarget()\` adds optional fields (like \`projectData\`) to the return type, tests using \`expect(result).toEqual({ org, project })\` fail because \`toEqual\` requires exact match. Use \`toMatchObject({ org, project })\` instead — it checks the specified subset without failing on extra properties. This affects tests across \`event/view\`, \`log/view\`, \`trace/view\`, and \`trace/list\` test files.

docs/src/content/docs/agent-guidance.md

Lines changed: 1 addition & 3 deletions
@@ -128,9 +128,7 @@ Display types with default sizes:

 Use **common** types for general dashboards. Use **specialized** only when specifically requested. Avoid **internal** types unless the user explicitly asks.

-Available datasets: `spans` (default, covers most use cases), `discover`, `issue`, `error-events`, `transaction-like`, `metrics`, `logs`, `tracemetrics`, `preprod-app-size`.
-
-Run `sentry dashboard widget --help` for the full list including aggregate functions.
+Available datasets: `spans` (default), `tracemetrics`, `discover`, `issue`, `error-events`, `logs`. Run `sentry dashboard widget --help` for dataset descriptions, query formats, and examples.

 **Row-filling examples:**

plugins/sentry-cli/skills/sentry-cli/SKILL.md

Lines changed: 1 addition & 3 deletions
@@ -138,9 +138,7 @@ Display types with default sizes:

 Use **common** types for general dashboards. Use **specialized** only when specifically requested. Avoid **internal** types unless the user explicitly asks.

-Available datasets: `spans` (default, covers most use cases), `discover`, `issue`, `error-events`, `transaction-like`, `metrics`, `logs`, `tracemetrics`, `preprod-app-size`.
-
-Run `sentry dashboard widget --help` for the full list including aggregate functions.
+Available datasets: `spans` (default), `tracemetrics`, `discover`, `issue`, `error-events`, `logs`. Run `sentry dashboard widget --help` for dataset descriptions, query formats, and examples.

 **Row-filling examples:**

src/commands/dashboard/resolve.ts

Lines changed: 134 additions & 0 deletions
@@ -18,6 +18,7 @@ import {
   ValidationError,
 } from "../../lib/errors.js";
 import { fuzzyMatch } from "../../lib/fuzzy.js";
+import { logger } from "../../lib/logger.js";
 import { resolveEffectiveOrg } from "../../lib/region.js";
 import { resolveOrg } from "../../lib/resolve-target.js";
 import { setOrgProjectContext } from "../../lib/telemetry.js";
@@ -315,6 +316,129 @@ export function resolveWidgetIndex(
   return matchIndex;
 }

+/**
+ * Validate that a sort expression references an aggregate present in the query.
+ * The Sentry API returns 400 when the sort field isn't in the widget's aggregates.
+ *
+ * @param orderby - Parsed sort expression (e.g., "-count()", "p90(span.duration)")
+ * @param aggregates - Parsed aggregate expressions from the query
+ */
+export function validateSortReferencesAggregate(
+  orderby: string,
+  aggregates: string[]
+): void {
+  // Strip leading "-" for descending sorts
+  const sortAgg = orderby.startsWith("-") ? orderby.slice(1) : orderby;
+  if (!aggregates.includes(sortAgg)) {
+    throw new ValidationError(
+      `Sort expression "${orderby}" references "${sortAgg}" which is not in the query.\n\n` +
+        "The --sort field must be one of the aggregate expressions in --query.\n" +
+        `Current aggregates: ${aggregates.join(", ")}\n\n` +
+        `Either add "${sortAgg}" to --query or sort by an existing aggregate.`,
+      "sort"
+    );
+  }
+}
+
+/**
+ * Validate that grouped widgets (those with columns/group-by) include a limit.
+ * The Sentry API rejects grouped widgets without a limit.
+ *
+ * @param columns - Group-by columns
+ * @param limit - Widget limit (undefined if not set)
+ */
+export function validateGroupByRequiresLimit(
+  columns: string[],
+  limit: number | undefined
+): void {
+  if (columns.length > 0 && limit === undefined) {
+    throw new ValidationError(
+      "Widgets with --group-by require --limit. " +
+        "Add --limit <n> to specify the maximum number of groups to display.",
+      "limit"
+    );
+  }
+}
+
+const log = logger.withTag("dashboard");
+
+/**
+ * Known aggregatable fields for the spans dataset.
+ *
+ * Span attributes (e.g., dsn.files_collected, resolve.method) are key-value
+ * metadata and cannot be used as aggregate fields — only in --where or --group-by.
+ * This set covers built-in numeric fields that support aggregation.
+ * Measurements (http.*, cache.*, etc.) are project-specific and may not be
+ * exhaustive — we warn instead of error for unknown fields.
+ */
+const KNOWN_SPAN_AGGREGATE_FIELDS = new Set([
+  "span.duration",
+  "span.self_time",
+  "http.response_content_length",
+  "http.decoded_response_content_length",
+  "http.response_transfer_size",
+  "cache.item_size",
+]);
+
+/**
+ * Aggregate functions that require numeric measurement fields.
+ * Functions like count_unique, any, count accept non-numeric columns
+ * (e.g., transaction, span.op) and should not trigger the warning.
+ */
+const NUMERIC_AGGREGATE_FUNCTIONS = new Set([
+  "avg",
+  "sum",
+  "min",
+  "max",
+  "p50",
+  "p75",
+  "p90",
+  "p95",
+  "p99",
+  "p100",
+  "percentile",
+]);
+
+/**
+ * Warn when a numeric aggregate function (avg, sum, p50, etc.) is applied
+ * to a field that isn't a known aggregatable span measurement. Functions
+ * like count_unique(transaction) or any(span.op) accept non-numeric
+ * columns and are not checked.
+ *
+ * Only checks for the spans dataset.
+ */
+function warnUnknownAggregateFields(
+  aggregates: string[],
+  dataset: string | undefined
+): void {
+  if (dataset && dataset !== "spans") {
+    return;
+  }
+  for (const agg of aggregates) {
+    const parenIdx = agg.indexOf("(");
+    if (parenIdx < 0) {
+      continue;
+    }
+    const fn = agg.slice(0, parenIdx);
+    // Only check numeric aggregate functions — count_unique, any, etc. accept any column
+    if (!NUMERIC_AGGREGATE_FUNCTIONS.has(fn)) {
+      continue;
+    }
+    const inner = agg.slice(parenIdx + 1, -1);
+    if (!inner) {
+      continue;
+    }
+    if (!KNOWN_SPAN_AGGREGATE_FIELDS.has(inner)) {
+      log.warn(
+        `Aggregate field "${inner}" in "${agg}" is not a known aggregatable span field. ` +
+          "Span attributes (custom tags) cannot be used with numeric aggregates — " +
+          "use them in --where or --group-by instead. " +
+          `Known numeric fields: ${[...KNOWN_SPAN_AGGREGATE_FIELDS].join(", ")}`
+      );
+    }
+  }
+}
+
 /**
  * Build a widget from user-provided flag values.
  *
@@ -336,6 +460,7 @@ export function buildWidgetFromFlags(opts: {
 }): DashboardWidget {
   const aggregates = (opts.query ?? ["count"]).map(parseAggregate);
   validateAggregateNames(aggregates, opts.dataset);
+  warnUnknownAggregateFields(aggregates, opts.dataset);

   // Issue table widgets need at least one column or the Sentry UI shows "Columns: None".
   // Default to ["issue"] for table display only — timeseries (line/area/bar) don't use columns.
@@ -350,6 +475,15 @@ export function buildWidgetFromFlags(opts: {
     orderby = `-${aggregates[0]}`;
   }

+  // Only validate when user explicitly passes --group-by, not for auto-defaulted columns
+  // (e.g., issue dataset auto-defaults columns to ["issue"] for table display)
+  if (opts.groupBy) {
+    validateGroupByRequiresLimit(columns, opts.limit);
+  }
+  if (orderby) {
+    validateSortReferencesAggregate(orderby, aggregates);
+  }
+
   const raw = {
     title: opts.title,
     displayType: opts.display,

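The field check in `warnUnknownAggregateFields` logs rather than returns, which makes it awkward to try in isolation. The sketch below restates the same idea as a pure function that returns the offending field names. It is not the committed code: the name `findUnknownNumericFields` is hypothetical, and unlike the diff it compares only the first comma-separated argument, on the assumption that a two-argument call such as `percentile(span.duration,0.9)` should be matched against `span.duration`.

```typescript
// Field and function lists copied from the diff above.
const KNOWN_FIELDS = new Set([
  "span.duration",
  "span.self_time",
  "http.response_content_length",
  "http.decoded_response_content_length",
  "http.response_transfer_size",
  "cache.item_size",
]);
const NUMERIC_FNS = new Set([
  "avg", "sum", "min", "max",
  "p50", "p75", "p90", "p95", "p99", "p100", "percentile",
]);

/**
 * Hypothetical helper: return fields used with numeric aggregates that are
 * not known span measurements. Only the spans dataset (or no dataset) is checked.
 */
function findUnknownNumericFields(
  aggregates: string[],
  dataset?: string
): string[] {
  if (dataset && dataset !== "spans") {
    return [];
  }
  const unknown: string[] = [];
  for (const agg of aggregates) {
    const open = agg.indexOf("(");
    if (open < 0 || !agg.endsWith(")")) {
      continue; // not in fn(...) form
    }
    if (!NUMERIC_FNS.has(agg.slice(0, open))) {
      continue; // count, count_unique, any, etc. accept any column
    }
    // Assumption: the field is the first comma-separated argument.
    const field = (agg.slice(open + 1, -1).split(",")[0] ?? "").trim();
    if (field && !KNOWN_FIELDS.has(field)) {
      unknown.push(field);
    }
  }
  return unknown;
}
```

With this sketch, `avg(dsn.files_collected)` is reported, while `p95(span.duration)`, `count_unique(transaction)`, and `count()` are not, and any dataset other than `spans` is skipped entirely.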
src/commands/dashboard/widget/edit.ts

Lines changed: 58 additions & 18 deletions
@@ -32,6 +32,8 @@ import {
   resolveDashboardId,
   resolveOrgFromTarget,
   resolveWidgetIndex,
+  validateGroupByRequiresLimit,
+  validateSortReferencesAggregate,
   validateWidgetEnums,
   type WidgetQueryFlags,
 } from "../resolve.js";
@@ -101,38 +103,76 @@ function mergeLayout(
   };
 }

-/** Build the replacement widget object by merging flags over existing */
-function buildReplacement(
+/**
+ * Validate enum and aggregate constraints on the effective (merged) widget state.
+ * Extracted from buildReplacement to stay under Biome's complexity limit.
+ */
+function validateEnumsAndAggregates(
   flags: EditFlags,
-  existing: DashboardWidget
-): DashboardWidget {
-  const mergedQueries = mergeQueries(flags, existing.queries?.[0]);
-
-  // Validate aggregates when query or dataset changes — prevents broken widgets
-  // (e.g. switching --dataset from discover to spans with discover-only aggregates)
+  existing: DashboardWidget,
+  mergedQueries: DashboardWidgetQuery[] | undefined
+): void {
   const newDataset = flags.dataset ?? existing.widgetType;
   const aggregatesToValidate =
     mergedQueries?.[0]?.aggregates ?? existing.queries?.[0]?.aggregates;
   if ((flags.query || flags.dataset) && aggregatesToValidate) {
     validateAggregateNames(aggregatesToValidate, newDataset);
   }

-  const limit = flags.limit !== undefined ? flags.limit : existing.limit;
-
-  const effectiveDisplay = flags.display ?? existing.displayType;
-  const effectiveDataset = flags.dataset ?? existing.widgetType;
-
-  // Re-validate after merging with existing values. validateWidgetEnums only
-  // checks the cross-constraint when both args are provided, so it misses
-  // e.g. `--dataset preprod-app-size` on a widget that's already `table`.
-  // validateWidgetEnums itself skips untracked display types (text, wheel, etc.).
   if (flags.display || flags.dataset) {
+    const effectiveDisplay = flags.display ?? existing.displayType;
+    const effectiveDataset = flags.dataset ?? existing.widgetType;
     validateWidgetEnums(effectiveDisplay, effectiveDataset);
   }
+}
+
+/**
+ * Validate group-by+limit and sort constraints on the effective (merged) widget state.
+ * Only runs when the user changes query, group-by, or sort — not when preserving
+ * existing widget state which may predate these validations.
+ */
+function validateQueryConstraints(
+  flags: EditFlags,
+  existing: DashboardWidget,
+  mergedQueries: DashboardWidgetQuery[] | undefined,
+  limit: number | null | undefined
+): void {
+  // Only validate when user explicitly passes --group-by, not when merely
+  // changing --query on an existing grouped widget (which may have auto-defaulted
+  // columns like ["issue"] with no limit)
+  if (flags["group-by"]) {
+    const columns =
+      mergedQueries?.[0]?.columns ?? existing.queries?.[0]?.columns ?? [];
+    validateGroupByRequiresLimit(columns, limit ?? undefined);
+  }
+
+  // Only validate sort when user explicitly passes --sort, not when merely
+  // changing --query (which may leave the existing auto-defaulted sort stale)
+  if (flags.sort) {
+    const orderby =
+      mergedQueries?.[0]?.orderby ?? existing.queries?.[0]?.orderby;
+    const aggregates =
+      mergedQueries?.[0]?.aggregates ?? existing.queries?.[0]?.aggregates ?? [];
+    if (orderby && aggregates.length > 0) {
+      validateSortReferencesAggregate(orderby, aggregates);
+    }
+  }
+}
+
+/** Build the replacement widget object by merging flags over existing */
+function buildReplacement(
+  flags: EditFlags,
+  existing: DashboardWidget
+): DashboardWidget {
+  const mergedQueries = mergeQueries(flags, existing.queries?.[0]);
+  const limit = flags.limit !== undefined ? flags.limit : existing.limit;
+
+  validateEnumsAndAggregates(flags, existing, mergedQueries);
+  validateQueryConstraints(flags, existing, mergedQueries, limit);

   const raw: Record<string, unknown> = {
     title: flags["new-title"] ?? existing.title,
-    displayType: effectiveDisplay,
+    displayType: flags.display ?? existing.displayType,
     queries: mergedQueries ?? existing.queries,
     layout: mergeLayout(flags, existing),
   };

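The edit path validates the merged widget state, but only for the constraints whose flags the user actually passed. A minimal, self-contained sketch of that pattern follows. The `EditFlagsSketch` and `ExistingSketch` shapes and the `checkEdit` function are hypothetical simplifications of `EditFlags` and `DashboardWidget`; aggregates are assumed to be already parsed (for example `count()` rather than `count`), and problems are collected in an array instead of being thrown as `ValidationError`.

```typescript
// Hypothetical, simplified stand-ins for EditFlags and DashboardWidget.
interface EditFlagsSketch {
  groupBy?: string[];
  sort?: string;
  limit?: number;
  query?: string[]; // assumed already parsed, e.g. "count()"
}
interface ExistingSketch {
  columns: string[];
  aggregates: string[];
  orderby?: string;
  limit?: number | null;
}

/** Merge flags over the existing widget, then check only what the user touched. */
function checkEdit(flags: EditFlagsSketch, existing: ExistingSketch): string[] {
  const problems: string[] = [];
  // Effective (merged) state: a flag wins when present, otherwise keep existing.
  const limit = flags.limit !== undefined ? flags.limit : existing.limit;
  const columns = flags.groupBy ?? existing.columns;
  const aggregates = flags.query ?? existing.aggregates;
  const orderby = flags.sort ?? existing.orderby;

  // Checked only when --group-by was passed explicitly.
  if (flags.groupBy && columns.length > 0 && (limit ?? undefined) === undefined) {
    problems.push("Widgets with --group-by require --limit.");
  }
  // Checked only when --sort was passed explicitly.
  if (flags.sort && orderby && aggregates.length > 0) {
    const sortAgg = orderby.startsWith("-") ? orderby.slice(1) : orderby;
    if (!aggregates.includes(sortAgg)) {
      problems.push(`Sort "${orderby}" is not one of the query aggregates.`);
    }
  }
  return problems;
}
```

The design choice this illustrates: changing only `--query` on an older widget does not trip the new checks, even if its stored sort or limit would fail them, whereas passing `--group-by` or `--sort` opts the edit into validation against the merged state.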
src/commands/dashboard/widget/index.ts

Lines changed: 21 additions & 2 deletions
@@ -20,15 +20,29 @@ export const widgetRoute = buildRouteMap({
       " specialized: stacked_area (3×2), top_n (3×2), categorical_bar (3×2), text (3×2)\n" +
       " internal: details (3×2), wheel (3×2), rage_and_dead_clicks (3×2),\n" +
       " server_tree (3×2), agents_traces_table (3×2)\n\n" +
-      "Datasets: spans (default), discover, issue, error-events, transaction-like,\n" +
-      " metrics, logs, tracemetrics, preprod-app-size\n\n" +
+      "Datasets:\n" +
+      " spans (default) Span-based queries: span.duration, span.op, transaction,\n" +
+      " span attributes, cache.hit, etc. Covers most use cases.\n" +
+      " tracemetrics Custom metrics from Sentry.metrics.distribution/gauge/count.\n" +
+      " Query format: aggregation(value,metric_name,metric_type,unit)\n" +
+      " Example: p50(value,completion.duration_ms,distribution,none)\n" +
+      " Supported displays: line, area, bar, big_number, categorical_bar\n" +
+      " discover Legacy discover queries (adds failure_rate, apdex, etc.)\n" +
+      " issue Issue-based queries\n" +
+      " error-events Error event queries\n" +
+      " logs Log queries\n\n" +
       "Aggregates (spans): count, count_unique, sum, avg, percentile, p50, p75,\n" +
       " p90, p95, p99, p100, eps, epm, any, min, max\n" +
       "Aggregates (discover adds): failure_count, failure_rate, apdex,\n" +
       " count_miserable, user_misery, count_web_vitals, count_if, count_at_least,\n" +
       " last_seen, latest_event, var, stddev, cov, corr, performance_score,\n" +
       " opportunity_score, count_scores\n" +
       "Aliases: spm → epm, sps → eps, tpm → epm, tps → eps\n\n" +
+      "tracemetrics query format:\n" +
+      " aggregation(value,metric_name,metric_type,unit)\n" +
+      " - metric_name: name passed to Sentry.metrics.distribution/gauge/count\n" +
+      " - metric_type: distribution, gauge, counter, set\n" +
+      " - unit: none (if unspecified), byte, second, millisecond, ratio, etc.\n\n" +
       "Row-filling examples:\n" +
       " # 3 KPIs (2+2+2 = 6)\n" +
       ' sentry dashboard widget add <d> "Error Count" --display big_number --query count\n' +
@@ -41,6 +55,11 @@ export const widgetRoute = buildRouteMap({
       ' sentry dashboard widget add <d> "Top Endpoints" --display table \\\n' +
       " --query count --query p95:span.duration --group-by transaction \\\n" +
       " --sort -count --limit 10\n\n" +
+      " # Custom metrics (tracemetrics dataset)\n" +
+      ' sentry dashboard widget add <d> "Latency" --display line \\\n' +
+      " --dataset tracemetrics \\\n" +
+      ' --query "p50(value,completion.duration_ms,distribution,none)" \\\n' +
+      ' --query "p90(value,completion.duration_ms,distribution,none)"\n\n' +
       "Commands:\n" +
       " add Add a widget to a dashboard\n" +
       " edit Edit a widget in a dashboard\n" +

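The `--query` strings in the new help example are easy to mistype by hand. As an illustration, a small builder could assemble them from their parts. Everything named here is hypothetical (`formatTracemetricsAggregate`, `latencyWidgetArgs`, the `MetricType` alias) and is not part of the CLI; the unit defaults to `none` because the help text says to use `none` when the SDK call specified no unit.

```typescript
// Metric types as listed in the help text above.
type MetricType = "distribution" | "gauge" | "counter" | "set";

/** Hypothetical helper: build `fn(value,name,type,unit)` for --dataset tracemetrics. */
function formatTracemetricsAggregate(
  fn: string,
  metricName: string,
  metricType: MetricType,
  unit = "none" // "none" when the SDK emitted the metric without a unit
): string {
  return `${fn}(value,${metricName},${metricType},${unit})`;
}

/** Hypothetical helper: argv for the "Latency" example in the help text. */
function latencyWidgetArgs(dashboard: string): string[] {
  const args = [
    "dashboard", "widget", "add", dashboard, "Latency",
    "--display", "line",
    "--dataset", "tracemetrics",
  ];
  for (const fn of ["p50", "p90"]) {
    args.push(
      "--query",
      formatTracemetricsAggregate(fn, "completion.duration_ms", "distribution")
    );
  }
  return args;
}
```

Passing the result of `latencyWidgetArgs("<d>")` to a process spawner for the `sentry` binary would reproduce the two-query command from the help example without shell quoting concerns; how the CLI is actually invoked is left to the caller.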