Skip to content

Commit 7063ac9

Browse files
committed
fix(telemetry): instrument delta upgrade with spans and error capture
Delta upgrade failures were completely invisible in Sentry: - No spans for the delta attempt (only DB queries visible in traces) - Errors caught and logged at debug level (invisible without --verbose) - No captureException — errors never reported to Sentry This made it impossible to diagnose the ETXTBSY/SIGKILL issues (PRs #339, #340, #343) from telemetry alone — they were found through code analysis and local reproduction. Changes: - Wrap attemptDeltaUpgrade in withTracingSpan for a 'upgrade.delta' span - Record delta.from_version, delta.to_version, delta.channel as attributes - On success: record patch_bytes and sha256 prefix - On unavailable (no patch): record delta.result='unavailable' - On error: captureException with warning level + delta context tags, record delta.result='error' and delta.error message on span - Upgrade log.debug to log.warn for failure messages so users see them Now delta failures will appear as: 1. A span in the upgrade trace (with error status + attributes) 2. A warning-level exception in Sentry Issues (with delta context) 3. A visible stderr message to the user
1 parent 946ecd3 commit 7063ac9

File tree

2 files changed

+134
-29
lines changed

2 files changed

+134
-29
lines changed

src/lib/bspatch.ts

Lines changed: 69 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,8 @@
66
* minimal memory usage during CLI self-upgrades:
77
*
88
* - Old binary: copy-then-mmap for 0 JS heap (CoW on btrfs/xfs/APFS),
9-
* falling back to `arrayBuffer()` (~100 MB heap) if mmap fails
9+
* with child-process probe for mmap safety (macOS AMFI sends uncatchable
10+
* SIGKILL on signed Mach-O), falling back to `arrayBuffer()` if unsafe
1011
* - Diff/extra blocks: streamed via `DecompressionStream('zstd')`
1112
* - Output: written incrementally to disk via `Bun.file().writer()`
1213
* - Integrity: SHA-256 computed inline via `Bun.CryptoHasher`
@@ -227,34 +228,88 @@ type OldFileHandle = {
227228
cleanup: () => void;
228229
};
229230

231+
/**
232+
* Probe whether `Bun.mmap()` works on a file by spawning a child process.
233+
*
234+
* macOS AMFI sends uncatchable SIGKILL when a process creates a writable
235+
* memory mapping of a code-signed Mach-O binary — even a copy. Since
236+
* SIGKILL terminates the process inside the syscall (before any catch
237+
* block runs), we cannot detect the failure in-process.
238+
*
239+
* This probe spawns a minimal child that attempts `Bun.mmap()` on the
240+
* target file. If the child exits 0, mmap is safe. If it's killed
241+
* (SIGKILL) or exits non-zero, we fall back to `arrayBuffer()`.
242+
*
243+
* @param filePath - Path to the file to probe (should be the temp copy)
244+
* @returns true if mmap is safe on this file, false otherwise
245+
*/
246+
async function probeMmapSafe(filePath: string): Promise<boolean> {
247+
try {
248+
const child = Bun.spawn(
249+
[
250+
process.execPath,
251+
"-e",
252+
`Bun.mmap(${JSON.stringify(filePath)}, { shared: false })`,
253+
],
254+
{ stdout: "ignore", stderr: "ignore" }
255+
);
256+
const exitCode = await child.exited;
257+
return exitCode === 0;
258+
} catch {
259+
return false;
260+
}
261+
}
262+
230263
/**
231264
* Load the old binary for read access during patching.
232265
*
233-
* Strategy: copy to temp file, then mmap the copy. This avoids `Bun.mmap()`
234-
* on the running binary (SIGKILL on macOS, ETXTBSY on Linux) while keeping
235-
* zero JS heap — the kernel manages the mapped pages. On CoW filesystems
266+
* Strategy: copy to temp file, probe mmap safety via child process,
267+
* then mmap the copy if safe. This avoids `Bun.mmap()` on files that
268+
* would trigger uncatchable SIGKILL (macOS AMFI on signed Mach-O) while
269+
* keeping zero JS heap on platforms where mmap works. On CoW filesystems
236270
* (btrfs, xfs, APFS) the copy is a metadata-only reflink (near-instant).
237271
*
238-
* Falls back to `Bun.file().arrayBuffer()` (~100 MB heap) if the copy or
239-
* mmap fails for any reason (permissions, disk space, unsupported FS).
272+
* Falls back to `Bun.file().arrayBuffer()` (~100 MB heap) if the probe
273+
* fails or if copy/mmap errors for any reason.
240274
*/
241275
async function loadOldBinary(oldPath: string): Promise<OldFileHandle> {
242276
const tempCopy = join(tmpdir(), `sentry-patch-old-${process.pid}`);
243277
try {
244278
copyFileSync(oldPath, tempCopy);
245-
const data = Bun.mmap(tempCopy, { shared: false });
279+
280+
// Probe mmap safety in a child process — macOS AMFI sends uncatchable
281+
// SIGKILL on writable mappings of signed Mach-O, even for copies.
282+
const mmapSafe = await probeMmapSafe(tempCopy);
283+
284+
if (mmapSafe) {
285+
const data = Bun.mmap(tempCopy, { shared: false });
286+
return {
287+
data,
288+
cleanup: () => {
289+
try {
290+
unlinkSync(tempCopy);
291+
} catch {
292+
// Best-effort cleanup — OS will reclaim on reboot
293+
}
294+
},
295+
};
296+
}
297+
298+
// mmap probe failed — read copy into JS heap, then clean up temp file
299+
const data = new Uint8Array(await Bun.file(tempCopy).arrayBuffer());
300+
try {
301+
unlinkSync(tempCopy);
302+
} catch {
303+
// Best-effort cleanup
304+
}
246305
return {
247306
data,
248307
cleanup: () => {
249-
try {
250-
unlinkSync(tempCopy);
251-
} catch {
252-
// Best-effort cleanup — OS will reclaim on reboot
253-
}
308+
// Data is in JS heap — no temp file to clean up
254309
},
255310
};
256311
} catch {
257-
// Copy or mmap failed — fall back to reading into JS heap
312+
// Copy failed — fall back to reading original directly into JS heap
258313
try {
259314
unlinkSync(tempCopy);
260315
} catch {
@@ -263,7 +318,7 @@ async function loadOldBinary(oldPath: string): Promise<OldFileHandle> {
263318
return {
264319
data: new Uint8Array(await Bun.file(oldPath).arrayBuffer()),
265320
cleanup: () => {
266-
// No temp file to clean up — data is in JS heap
321+
// Data is in JS heap — no temp file to clean up
267322
},
268323
};
269324
}

src/lib/delta-upgrade.ts

Lines changed: 65 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,8 @@
1919

2020
import { unlinkSync } from "node:fs";
2121

22+
import { captureException } from "@sentry/bun";
23+
2224
import {
2325
GITHUB_RELEASES_URL,
2426
getPlatformBinaryName,
@@ -34,6 +36,7 @@ import {
3436
type OciManifest,
3537
} from "./ghcr.js";
3638
import { logger } from "./logger.js";
39+
import { withTracingSpan } from "./telemetry.js";
3740

3841
/** Scoped logger for delta upgrade operations */
3942
const log = logger.withTag("delta-upgrade");
@@ -563,29 +566,76 @@ export async function resolveNightlyChain(
563566
* @param destPath - Path to write the patched binary
564567
* @returns Delta result with SHA-256 and size info, or null if delta is unavailable
565568
*/
566-
export async function attemptDeltaUpgrade(
569+
export function attemptDeltaUpgrade(
567570
targetVersion: string,
568571
oldBinaryPath: string,
569572
destPath: string
570573
): Promise<DeltaResult | null> {
571574
if (!canAttemptDelta(targetVersion)) {
572-
return null;
575+
return Promise.resolve(null);
573576
}
574577

575-
log.debug(`Attempting delta upgrade from ${CLI_VERSION} to ${targetVersion}`);
578+
const channel = isNightlyVersion(targetVersion) ? "nightly" : "stable";
576579

577-
try {
578-
if (isNightlyVersion(targetVersion)) {
579-
return await resolveNightlyDelta(targetVersion, oldBinaryPath, destPath);
580-
}
581-
return await resolveStableDelta(targetVersion, oldBinaryPath, destPath);
582-
} catch (error) {
583-
// Any error during delta upgrade → fall back to full download.
584-
// Log at debug so --verbose reveals the root cause (ETXTBSY, network, etc.)
585-
const msg = error instanceof Error ? error.message : String(error);
586-
log.debug(`Delta upgrade failed (${msg}), falling back to full download`);
587-
return null;
588-
}
580+
return withTracingSpan(
581+
"delta-upgrade",
582+
"upgrade.delta",
583+
async (span) => {
584+
span.setAttribute("delta.from_version", CLI_VERSION);
585+
span.setAttribute("delta.to_version", targetVersion);
586+
span.setAttribute("delta.channel", channel);
587+
588+
log.debug(
589+
`Attempting delta upgrade from ${CLI_VERSION} to ${targetVersion}`
590+
);
591+
592+
try {
593+
const result =
594+
channel === "nightly"
595+
? await resolveNightlyDelta(targetVersion, oldBinaryPath, destPath)
596+
: await resolveStableDelta(targetVersion, oldBinaryPath, destPath);
597+
598+
if (result) {
599+
span.setAttribute("delta.patch_bytes", result.patchBytes);
600+
span.setAttribute("delta.sha256", result.sha256.slice(0, 12));
601+
span.setStatus({ code: 1 }); // OK
602+
} else {
603+
// No patch available — not an error, just unavailable
604+
span.setAttribute("delta.result", "unavailable");
605+
span.setStatus({ code: 1 }); // OK — graceful fallback
606+
}
607+
return result;
608+
} catch (error) {
609+
// Record the error in Sentry so we can see delta failures in telemetry.
610+
// Marked non-fatal: the upgrade continues via full download.
611+
captureException(error, {
612+
level: "warning",
613+
tags: {
614+
"delta.from_version": CLI_VERSION,
615+
"delta.to_version": targetVersion,
616+
"delta.channel": channel,
617+
},
618+
contexts: {
619+
delta_upgrade: {
620+
from_version: CLI_VERSION,
621+
to_version: targetVersion,
622+
channel,
623+
old_binary_path: oldBinaryPath,
624+
},
625+
},
626+
});
627+
628+
const msg = error instanceof Error ? error.message : String(error);
629+
log.warn(
630+
`Delta upgrade failed (${msg}), falling back to full download`
631+
);
632+
span.setAttribute("delta.result", "error");
633+
span.setAttribute("delta.error", msg);
634+
return null;
635+
}
636+
},
637+
{ "delta.channel": channel }
638+
);
589639
}
590640

591641
/**

0 commit comments

Comments
 (0)