Commit fc2ac72

betegon, claude, github-actions[bot], and BYK authored
feat(telemetry): add agent detection tag for AI coding tools (#687)
## Summary

Adds an `agent` tag to telemetry spans when the CLI is invoked by a known AI coding tool. Detection uses two strategies:

1. **Environment variables** (sync, instant): agents inject these into child processes. Adapted from [Vercel's `@vercel/detect-agent`](https://github.com/vercel/vercel/tree/main/packages/detect-agent) (Apache-2.0).
2. **Process tree walking** (async, non-blocking): scans parent/grandparent process names for known agent executables. Fires in the background so it never delays CLI startup.

## Supported agents (env vars)

| Agent | Env var(s) |
|-------|-----------|
| Generic override | `AI_AGENT` |
| Cursor | `CURSOR_TRACE_ID`, `CURSOR_AGENT` |
| Gemini | `GEMINI_CLI` |
| Codex | `CODEX_SANDBOX`, `CODEX_CI`, `CODEX_THREAD_ID` |
| Antigravity | `ANTIGRAVITY_AGENT` |
| Augment | `AUGMENT_AGENT` |
| OpenCode | `OPENCODE_CLIENT` |
| Claude Code | `CLAUDE_CODE`, `CLAUDECODE` |
| Cowork | `CLAUDE_CODE` + `CLAUDE_CODE_IS_COWORK` |
| GitHub Copilot | `COPILOT_MODEL`, `COPILOT_ALLOW_ALL` |
| Goose | `GOOSE_TERMINAL` |
| Amp | `AMP_THREAD_ID` |
| Generic fallback | `AGENT` |

**Intentionally excluded:** `REPL_ID` (set in all Replit workspaces, not just AI agent sessions) and `COPILOT_GITHUB_TOKEN` (auth credential users may export persistently).

New agents can be added with a single `["ENV_VAR", "agent-name"]` line in the `ENV_VAR_AGENTS` map.

## Process tree detection (fallback)

When no env var matches, the CLI asynchronously walks the parent process tree (up to 5 levels) looking for known agent executables: `cursor`, `claude`, `goose`, `windsurf`, `amp`, `codex`, `augment`, `opencode`, `gemini`

- **Linux**: reads `/proc/<pid>/status` (in-memory filesystem, fast)
- **macOS**: uses `ps(1)` with a 500ms timeout, child process unreffed to never block exit
- **Windows**: not supported (env var detection still works)

## Detection priority

1. `AI_AGENT` env var (explicit override)
2. Agent-specific env vars (`ENV_VAR_AGENTS` map)
3. Claude Code / Cowork (conditional logic)
4. `AGENT` env var (generic fallback; an explicit signal beats a heuristic)
5. Process tree walking (async, best-effort; may miss very fast commands)

## Known limitations

- **Fast commands**: Process tree detection runs asynchronously. For commands completing in <50ms where no env vars are set, the tag may not be applied before the transaction ends. This is an intentional trade-off: env vars cover the vast majority of agent invocations instantly.
- **Windows**: No process tree detection (no `/proc/` or `ps`). Env var detection still works.

## Test plan

- [x] 47 unit tests covering env var detection, process tree walking, depth limits, case-insensitive matching, priority ordering, map structure validation, and real `/proc/` reads
- [ ] Manual: `CLAUDE_CODE=1 bunx sentry-cli auth status` and verify span has `agent: claude`

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Burak Yigit Kaya <byk@sentry.io>
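
The priority order described in the commit message can be illustrated with a small pure function. This is a simplified sketch, not the commit's code: `pickAgent`, `Env`, and `SAMPLE_AGENTS` are hypothetical names, the map holds only a subset of the supported variables, and the environment is passed in as an argument instead of being read via `getEnv()`.

```typescript
// Simplified sketch of the documented env var priority order.
// `pickAgent`, `Env`, and `SAMPLE_AGENTS` are hypothetical names for illustration.
type Env = Record<string, string | undefined>;

// Subset of the real table; Map iteration follows insertion order.
const SAMPLE_AGENTS = new Map<string, string>([
  ["CURSOR_TRACE_ID", "cursor"],
  ["GEMINI_CLI", "gemini"],
  ["GOOSE_TERMINAL", "goose"],
]);

function pickAgent(env: Env): string | undefined {
  // 1. Explicit override wins over everything else.
  const override = env.AI_AGENT?.trim();
  if (override) {
    return override;
  }
  // 2. Agent-specific variables, first match wins.
  for (const [name, agent] of SAMPLE_AGENTS) {
    if (env[name]) {
      return agent;
    }
  }
  // 3. Claude Code, with the Cowork variant checked conditionally.
  if (env.CLAUDECODE || env.CLAUDE_CODE) {
    return env.CLAUDE_CODE_IS_COWORK ? "cowork" : "claude";
  }
  // 4. Generic fallback; an empty or whitespace-only value counts as unset.
  return env.AGENT?.trim() || undefined;
}

console.log(pickAgent({ AI_AGENT: "my-bot", CURSOR_TRACE_ID: "abc" })); // "my-bot"
console.log(pickAgent({ GOOSE_TERMINAL: "1", AGENT: "goose-generic" })); // "goose"
console.log(pickAgent({ CLAUDE_CODE: "1", CLAUDE_CODE_IS_COWORK: "1" })); // "cowork"
console.log(pickAgent({ AGENT: "  custom  " })); // "custom"
console.log(pickAgent({})); // undefined
```

Because the checks are plain truthiness tests on strings, any non-empty value (including `"0"` or `"false"`) counts as a match; only an unset or empty variable is skipped.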
1 parent a758710 commit fc2ac72

3 files changed, +617 -1 lines changed

src/lib/detect-agent.ts

Lines changed: 242 additions & 0 deletions
@@ -0,0 +1,242 @@
+/**
+ * AI agent detection — determines whether the CLI is being driven by
+ * a specific AI coding agent.
+ *
+ * Detection uses two strategies:
+ * 1. **Environment variables** (sync) — agents inject these into child
+ *    processes. Adapted from Vercel's @vercel/detect-agent (Apache-2.0).
+ * 2. **Process tree walking** (async) — scan parent/grandparent process
+ *    names for known agent executables. Runs as a non-blocking background
+ *    task so it never delays CLI startup.
+ *
+ * To add a new agent, add entries to {@link ENV_VAR_AGENTS} and/or
+ * {@link PROCESS_NAME_AGENTS}.
+ */
+
+import { execFile } from "node:child_process";
+import { readFile } from "node:fs/promises";
+import { basename } from "node:path";
+
+import { getEnv } from "./env.js";
+
+/**
+ * Env var → agent name. Checked in insertion order — first match wins.
+ * Each env var maps directly to the agent that sets it.
+ */
+export const ENV_VAR_AGENTS = new Map<string, string>([
+  // Cursor
+  ["CURSOR_TRACE_ID", "cursor"],
+  ["CURSOR_AGENT", "cursor"],
+  // Gemini CLI
+  ["GEMINI_CLI", "gemini"],
+  // OpenAI Codex
+  ["CODEX_SANDBOX", "codex"],
+  ["CODEX_CI", "codex"],
+  ["CODEX_THREAD_ID", "codex"],
+  // Antigravity
+  ["ANTIGRAVITY_AGENT", "antigravity"],
+  // Augment
+  ["AUGMENT_AGENT", "augment"],
+  // OpenCode
+  ["OPENCODE_CLIENT", "opencode"],
+  // Replit — REPL_ID intentionally excluded because it's set in ALL Replit
+  // workspaces, not just when the AI agent is driving the CLI
+  // GitHub Copilot — COPILOT_GITHUB_TOKEN intentionally excluded because
+  // users may export it persistently for auth, causing false positives
+  ["COPILOT_MODEL", "github-copilot"],
+  ["COPILOT_ALLOW_ALL", "github-copilot"],
+  // Goose
+  ["GOOSE_TERMINAL", "goose"],
+  // Amp
+  ["AMP_THREAD_ID", "amp"],
+]);
+
+/**
+ * Process executable basename (lowercase) → agent name.
+ * Used when scanning the parent process tree as a fallback.
+ */
+export const PROCESS_NAME_AGENTS = new Map<string, string>([
+  ["cursor", "cursor"],
+  ["claude", "claude"],
+  ["goose", "goose"],
+  ["windsurf", "windsurf"],
+  ["amp", "amp"],
+  ["codex", "codex"],
+  ["augment", "augment"],
+  ["opencode", "opencode"],
+  ["gemini", "gemini"],
+]);
+
+/** Max levels to walk up the process tree before giving up. */
+const MAX_ANCESTOR_DEPTH = 5;
+
+/** Pattern to extract `Name:` from `/proc/<pid>/status`. */
+const PROC_STATUS_NAME_RE = /^Name:\s+(.+)$/m;
+
+/** Pattern to extract `PPid:` from `/proc/<pid>/status`. */
+const PROC_STATUS_PPID_RE = /^PPid:\s+(\d+)$/m;
+
+/** Pattern to parse `ps -o ppid=,comm=` output: " <ppid> <comm>". */
+const PS_PPID_COMM_RE = /^(\d+)\s+(.+)$/;
+
+/** Name + parent PID of a process. */
+type ProcessInfo = {
+  /** Basename of the executable (e.g. "cursor", "bash"). */
+  name: string;
+  /** Parent process ID, or 0 if unavailable. */
+  ppid: number;
+};
+
+/**
+ * Async process info provider signature. Default reads from `/proc/` or `ps(1)`.
+ * Override via {@link setProcessInfoProvider} for testing.
+ */
+type ProcessInfoProvider = (pid: number) => Promise<ProcessInfo | undefined>;
+
+let _getProcessInfo: ProcessInfoProvider = getProcessInfoFromOS;
+
+/**
+ * Override the process info provider. Follows the same pattern as
+ * {@link setEnv} — call with a mock in tests, reset in `afterEach`.
+ *
+ * Pass `getProcessInfoFromOS` to restore the real implementation.
+ */
+export function setProcessInfoProvider(provider: ProcessInfoProvider): void {
+  _getProcessInfo = provider;
+}
+
+/**
+ * Detect agent from environment variables only (synchronous, no I/O).
+ *
+ * Priority:
+ * 1. `AI_AGENT` env var — explicit override, any agent can self-identify
+ * 2. Agent-specific env vars from {@link ENV_VAR_AGENTS}
+ * 3. Claude Code with Cowork variant (conditional, can't be in the map)
+ * 4. `AGENT` env var — generic fallback set by Goose, Amp, and others
+ *
+ * Returns the agent name string, or `undefined` if no agent is detected.
+ * For process tree fallback, use {@link detectAgentFromProcessTree} separately.
+ */
+export function detectAgent(): string | undefined {
+  const env = getEnv();
+
+  // 1. Highest priority: explicit override — any agent can self-identify
+  const aiAgent = env.AI_AGENT?.trim();
+  if (aiAgent) {
+    return aiAgent;
+  }
+
+  // 2. Table-driven env var check (Map iteration preserves insertion order)
+  for (const [envVar, agent] of ENV_VAR_AGENTS) {
+    if (env[envVar]) {
+      return agent;
+    }
+  }
+
+  // 3. Claude Code / Cowork — requires branching logic, so not in the map
+  if (env.CLAUDECODE || env.CLAUDE_CODE) {
+    return env.CLAUDE_CODE_IS_COWORK ? "cowork" : "claude";
+  }
+
+  // 4. Lowest priority: generic AGENT fallback
+  return env.AGENT?.trim() || undefined;
+}
+
+/**
+ * Walk the ancestor process tree looking for known agent executables.
+ *
+ * Fully async — never blocks CLI startup. Starts at the direct parent
+ * (`process.ppid`) and walks up to {@link MAX_ANCESTOR_DEPTH} levels.
+ * Stops at PID 1 (init/launchd) or on any read error.
+ *
+ * - **Linux**: reads `/proc/<pid>/status` (in-memory filesystem, fast).
+ * - **macOS**: uses `ps(1)` with a 500ms timeout per invocation.
+ * - **Windows**: not supported (env var detection still works).
+ */
+export async function detectAgentFromProcessTree(): Promise<
+  string | undefined
+> {
+  let pid = process.ppid;
+
+  for (let depth = 0; depth < MAX_ANCESTOR_DEPTH && pid > 1; depth++) {
+    const info = await _getProcessInfo(pid);
+    if (!info) {
+      break;
+    }
+
+    const agent = PROCESS_NAME_AGENTS.get(info.name.toLowerCase());
+    if (agent) {
+      return agent;
+    }
+
+    pid = info.ppid;
+  }
+
+  return;
+}
+
+/**
+ * Read process name and parent PID for a given PID.
+ *
+ * Tries `/proc/<pid>/status` first (Linux, no subprocess overhead),
+ * falls back to `ps(1)` (macOS and other Unix systems).
+ * Windows is unsupported — returns `undefined`.
+ */
+export async function getProcessInfoFromOS(
+  pid: number
+): Promise<ProcessInfo | undefined> {
+  // Linux: /proc is an in-memory filesystem — fast even though async
+  try {
+    const status = await readFile(`/proc/${pid}/status`, "utf-8");
+    const nameMatch = status.match(PROC_STATUS_NAME_RE);
+    const ppidMatch = status.match(PROC_STATUS_PPID_RE);
+    if (nameMatch?.[1] && ppidMatch?.[1]) {
+      return { name: nameMatch[1].trim(), ppid: Number(ppidMatch[1]) };
+    }
+  } catch {
+    // Not Linux or process is gone — fall through to ps
+  }
+
+  // macOS / other Unix: use ps(1) asynchronously
+  if (process.platform !== "win32") {
+    try {
+      const result = await execFileUnreffed(
+        "ps",
+        ["-p", String(pid), "-o", "ppid=,comm="],
+        { timeout: 500 }
+      );
+      const match = result.trim().match(PS_PPID_COMM_RE);
+      if (match?.[1] && match?.[2]) {
+        return { name: basename(match[2].trim()), ppid: Number(match[1]) };
+      }
+    } catch {
+      // Process gone, ps not available, or timeout
+    }
+  }
+}
+
+/**
+ * Spawn `execFile` with the child process unreffed so it never
+ * prevents the CLI from exiting. Resolves with stdout on success.
+ */
+function execFileUnreffed(
+  cmd: string,
+  args: readonly string[],
+  opts: { timeout?: number }
+): Promise<string> {
+  return new Promise((resolve, reject) => {
+    const child = execFile(
+      cmd,
+      args,
+      { encoding: "utf-8", ...opts },
+      (err, stdout) => {
+        if (err) {
+          reject(err);
+        } else {
+          resolve(stdout);
+        }
+      }
+    );
+    child.unref();
+  });
+}
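
The injectable provider is what makes the tree walk testable without touching `/proc` or `ps`. The sketch below shows the idea against a fake process table. It is self-contained and does not import the module above: `walkAncestors`, `Provider`, `KNOWN`, `MAX_DEPTH`, and the PIDs are hypothetical stand-ins that mirror, but are not, the commit's `detectAgentFromProcessTree` and `setProcessInfoProvider`.

```typescript
// Self-contained sketch of walking a (fake) ancestor chain with an injected provider.
// All names and PIDs here are hypothetical; the loop mirrors the one in the diff.
type ProcessInfo = { name: string; ppid: number };
type Provider = (pid: number) => Promise<ProcessInfo | undefined>;

const KNOWN = new Map<string, string>([
  ["cursor", "cursor"],
  ["claude", "claude"],
]);
const MAX_DEPTH = 5;

async function walkAncestors(
  startPid: number,
  getInfo: Provider
): Promise<string | undefined> {
  let pid = startPid;
  for (let depth = 0; depth < MAX_DEPTH && pid > 1; depth++) {
    const info = await getInfo(pid);
    if (!info) {
      break; // unreadable process: give up rather than guess
    }
    // Lowercasing makes "Cursor" and "cursor" match the same entry.
    const agent = KNOWN.get(info.name.toLowerCase());
    if (agent) {
      return agent;
    }
    pid = info.ppid;
  }
  return undefined;
}

// Fake process table: 400 (bash) -> 300 (node) -> 200 (Cursor) -> 1
const table = new Map<number, ProcessInfo>([
  [400, { name: "bash", ppid: 300 }],
  [300, { name: "node", ppid: 200 }],
  [200, { name: "Cursor", ppid: 1 }],
]);
const fakeProvider: Provider = async (pid) => table.get(pid);

walkAncestors(400, fakeProvider).then((agent) => console.log(agent)); // "cursor"
```

With the real module, a test would presumably pass a provider like `fakeProvider` to `setProcessInfoProvider` and restore `getProcessInfoFromOS` afterwards, as the doc comment suggests.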

src/lib/telemetry.ts

Lines changed: 21 additions & 1 deletion
@@ -19,9 +19,10 @@ import {
   SENTRY_CLI_DSN,
 } from "./constants.js";
 import { isReadonlyError, tryRepairAndRetry } from "./db/schema.js";
+import { detectAgent, detectAgentFromProcessTree } from "./detect-agent.js";
 import { getEnv } from "./env.js";
 import { ApiError, AuthError, OutputError } from "./errors.js";
-import { attachSentryReporter } from "./logger.js";
+import { attachSentryReporter, logger } from "./logger.js";
 import { getSentryBaseUrl, isSentrySaasUrl } from "./sentry-urls.js";
 import { getRealUsername } from "./utils.js";

@@ -522,6 +523,25 @@ export function initSentry(
   // Tag whether running in an interactive terminal or agent/CI environment
   Sentry.setTag("is_tty", !!process.stdout.isTTY);

+  // Tag which AI agent (if any) is driving the CLI.
+  // Env var detection is sync (instant). If no env var matches, fire off
+  // async process tree detection in the background — it sets the tag
+  // before the transaction finishes without blocking CLI startup.
+  const agent = detectAgent();
+  if (agent) {
+    Sentry.setTag("agent", agent);
+  } else {
+    detectAgentFromProcessTree()
+      .then((processAgent) => {
+        if (processAgent) {
+          Sentry.setTag("agent", processAgent);
+        }
+      })
+      .catch((error) => {
+        logger.withTag("agent").warn("Process tree detection failed:", error);
+      });
+  }
+
   // Wire up consola → Sentry log forwarding now that the client is active
   attachSentryReporter();

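
The `else` branch in this hunk starts process tree detection without awaiting it, which is why the commit message warns that very fast commands may finish before the tag is set. A minimal sketch of that timing, using an in-memory object in place of Sentry: `tags`, `setTag`, `slowDetect`, and `tagAgent` are hypothetical names, and the 20ms delay is an arbitrary stand-in for the real lookup.

```typescript
// Sketch of the fire-and-forget tagging pattern. `tags` and `setTag` stand in
// for Sentry's tag store; every name here is hypothetical.
const tags: Record<string, string> = {};
const setTag = (key: string, value: string): void => {
  tags[key] = value;
};

// Stand-in for async process tree detection, resolving after `delayMs`.
function slowDetect(delayMs: number): Promise<string | undefined> {
  return new Promise((resolve) => setTimeout(() => resolve("cursor"), delayMs));
}

function tagAgent(syncResult: string | undefined, delayMs: number): Promise<void> {
  if (syncResult) {
    // Env var hit: the tag is available immediately.
    setTag("agent", syncResult);
    return Promise.resolve();
  }
  // No env var hit: detection continues in the background.
  return slowDetect(delayMs)
    .then((agent) => {
      if (agent) {
        setTag("agent", agent);
      }
    })
    .catch((error) => {
      console.warn("detection failed:", error);
    });
}

const pending = tagAgent(undefined, 20);
console.log(tags.agent); // undefined: the tag is not set yet
pending.then(() => console.log(tags.agent)); // "cursor"
```

Anything that reads the tag before `pending` settles sees no `agent` value, which corresponds to the "fast commands" limitation. The real code discards the promise instead of returning it; the sketch returns it only so the example can observe completion.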
