Skip to content

Commit d3791ab

Browse files
committed
perf(upgrade): copy-then-mmap with child probe + delta telemetry
Two improvements to the delta upgrade system: 1. **Copy-then-mmap with child process probe for mmap safety** Instead of reading the old binary into a Uint8Array via arrayBuffer() (~100 MB JS heap spike), copies the file to a temp location and probes mmap safety via a child process before attempting it: - Copy: copyFileSync (CoW reflinks on btrfs/xfs/APFS for near-instant) - Probe: spawn child to test Bun.mmap() on the copy — if macOS AMFI sends SIGKILL, only the child dies, parent survives - Mmap: if probe succeeds, mmap the copy (zero JS heap, kernel-managed) - Fallback: if probe fails, read copy via arrayBuffer() (~100 MB heap) 2. **Instrument delta upgrade with Sentry spans and error capture** Delta failures were invisible in Sentry (no spans, no captureException, errors logged at debug level). Now: - withTracingSpan('upgrade.delta') wraps the delta attempt - captureException on failure with warning level + delta context tags - log.warn() instead of log.debug() so users see failures
1 parent 0da9cc9 commit d3791ab

File tree

2 files changed

+197
-30
lines changed

2 files changed

+197
-30
lines changed

src/lib/bspatch.ts

Lines changed: 132 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -5,15 +5,21 @@
55
* TRDIFF10 format (produced by zig-bsdiff with `--use-zstd`). Designed for
66
* minimal memory usage during CLI self-upgrades:
77
*
8-
* - Old binary: loaded via `Bun.file().arrayBuffer()` (~100 MB heap)
8+
* - Old binary: copy-then-mmap for 0 JS heap (CoW on btrfs/xfs/APFS),
9+
* with child-process probe for mmap safety (macOS AMFI sends uncatchable
10+
* SIGKILL on signed Mach-O), falling back to `arrayBuffer()` if unsafe
911
* - Diff/extra blocks: streamed via `DecompressionStream('zstd')`
1012
* - Output: written incrementally to disk via `Bun.file().writer()`
1113
* - Integrity: SHA-256 computed inline via `Bun.CryptoHasher`
1214
*
13-
* Total heap usage: ~100 MB for old file + ~1-2 MB for streaming buffers.
14-
* `Bun.mmap()` is NOT usable here because the old file is the running binary:
15-
* - macOS: AMFI sends uncatchable SIGKILL (PROT_WRITE on signed Mach-O)
16-
* - Linux: ETXTBSY from `open()` with write flags on a running executable
15+
* `Bun.mmap()` cannot target the running binary directly because it opens
16+
* with PROT_WRITE/O_RDWR:
17+
* - macOS: AMFI sends uncatchable SIGKILL (writable mapping on signed Mach-O)
18+
* - Linux: ETXTBSY from `open()` (kernel blocks write-open on running ELF)
19+
*
20+
* The copy-then-mmap strategy sidesteps both: the copy is a regular file
21+
* with no running process, so mmap succeeds. On CoW-capable filesystems
22+
* (btrfs, xfs, APFS) the copy is near-instant with zero extra disk I/O.
1723
*
1824
* TRDIFF10 format (from zig-bsdiff):
1925
* ```
@@ -25,6 +31,10 @@
2531
* ```
2632
*/
2733

34+
import { copyFileSync, unlinkSync } from "node:fs";
35+
import { tmpdir } from "node:os";
36+
import { join } from "node:path";
37+
2838
/** TRDIFF10 header magic bytes */
2939
const TRDIFF10_MAGIC = "TRDIFF10";
3040

@@ -210,12 +220,117 @@ function createZstdStreamReader(compressed: Uint8Array): BufferedStreamReader {
210220
);
211221
}
212222

223+
/** Result of loading the old binary for patching */
224+
type OldFileHandle = {
225+
/** Memory-mapped or in-memory view of the old binary */
226+
data: Uint8Array;
227+
/** Cleanup function to call after patching (removes temp copy, if any) */
228+
cleanup: () => void;
229+
};
230+
231+
/**
232+
* Probe whether `Bun.mmap()` works on a file by spawning a child process.
233+
*
234+
* macOS AMFI sends uncatchable SIGKILL when a process creates a writable
235+
* memory mapping of a code-signed Mach-O binary — even a copy. Since
236+
* SIGKILL terminates the process inside the syscall (before any catch
237+
* block runs), we cannot detect the failure in-process.
238+
*
239+
* This probe spawns a minimal child that attempts `Bun.mmap()` on the
240+
* target file. If the child exits 0, mmap is safe. If it's killed
241+
* (SIGKILL) or exits non-zero, we fall back to `arrayBuffer()`.
242+
*
243+
* @param filePath - Path to the file to probe (should be the temp copy)
244+
* @returns true if mmap is safe on this file, false otherwise
245+
*/
246+
async function probeMmapSafe(filePath: string): Promise<boolean> {
247+
try {
248+
const child = Bun.spawn(
249+
[
250+
process.execPath,
251+
"-e",
252+
`Bun.mmap(${JSON.stringify(filePath)}, { shared: false })`,
253+
],
254+
{ stdout: "ignore", stderr: "ignore" }
255+
);
256+
const exitCode = await child.exited;
257+
return exitCode === 0;
258+
} catch {
259+
return false;
260+
}
261+
}
262+
263+
/**
264+
* Load the old binary for read access during patching.
265+
*
266+
* Strategy: copy to temp file, probe mmap safety via child process,
267+
* then mmap the copy if safe. This avoids `Bun.mmap()` on files that
268+
* would trigger uncatchable SIGKILL (macOS AMFI on signed Mach-O) while
269+
* keeping zero JS heap on platforms where mmap works. On CoW filesystems
270+
* (btrfs, xfs, APFS) the copy is a metadata-only reflink (near-instant).
271+
*
272+
* Falls back to `Bun.file().arrayBuffer()` (~100 MB heap) if the probe
273+
* fails or if copy/mmap errors for any reason.
274+
*/
275+
async function loadOldBinary(oldPath: string): Promise<OldFileHandle> {
276+
const tempCopy = join(tmpdir(), `sentry-patch-old-${process.pid}`);
277+
try {
278+
copyFileSync(oldPath, tempCopy);
279+
280+
// Probe mmap safety in a child process — macOS AMFI sends uncatchable
281+
// SIGKILL on writable mappings of signed Mach-O, even for copies.
282+
const mmapSafe = await probeMmapSafe(tempCopy);
283+
284+
if (mmapSafe) {
285+
const data = Bun.mmap(tempCopy, { shared: false });
286+
return {
287+
data,
288+
cleanup: () => {
289+
try {
290+
unlinkSync(tempCopy);
291+
} catch {
292+
// Best-effort cleanup — OS will reclaim on reboot
293+
}
294+
},
295+
};
296+
}
297+
298+
// mmap probe failed — read copy into JS heap, then clean up temp file
299+
const data = new Uint8Array(await Bun.file(tempCopy).arrayBuffer());
300+
try {
301+
unlinkSync(tempCopy);
302+
} catch {
303+
// Best-effort cleanup
304+
}
305+
return {
306+
data,
307+
cleanup: () => {
308+
// Data is in JS heap — no temp file to clean up
309+
},
310+
};
311+
} catch {
312+
// Copy failed — fall back to reading original directly into JS heap
313+
try {
314+
unlinkSync(tempCopy);
315+
} catch {
316+
// May not exist if copyFileSync failed
317+
}
318+
return {
319+
data: new Uint8Array(await Bun.file(oldPath).arrayBuffer()),
320+
cleanup: () => {
321+
// Data is in JS heap — no temp file to clean up
322+
},
323+
};
324+
}
325+
}
326+
213327
/**
214328
* Apply a TRDIFF10 binary patch with streaming I/O for minimal memory usage.
215329
*
216-
* Reads the old file into memory via `Bun.file().arrayBuffer()`, then streams
217-
* diff/extra blocks (~16 KB buffers) via `DecompressionStream('zstd')`,
218-
* writes output via `Bun.file().writer()`, and computes SHA-256 inline.
330+
* Copies the old file to a temp path and mmaps the copy (0 JS heap), falling
331+
* back to `arrayBuffer()` if mmap fails. Streams diff/extra blocks via
332+
* `DecompressionStream('zstd')`, writes output via `Bun.file().writer()`,
333+
* and computes SHA-256 inline.
219334
*
220335
* @param oldPath - Path to the existing (old) binary file
221336
* @param patchData - Complete TRDIFF10 patch file contents
@@ -246,12 +361,10 @@ export async function applyPatch(
246361
);
247362
const extraReader = createZstdStreamReader(patchData.subarray(extraStart));
248363

249-
// Bun.mmap() is NOT usable for the old file during self-upgrades because
250-
// it always opens with PROT_WRITE, and the old file is the running binary:
251-
// - macOS: AMFI sends uncatchable SIGKILL on writable mapping of signed Mach-O
252-
// - Linux: open() returns ETXTBSY when opening a running executable for write
253-
// Reading into memory costs ~100 MB heap but avoids both platform restrictions.
254-
const oldFile = new Uint8Array(await Bun.file(oldPath).arrayBuffer());
364+
// Load old binary via copy-then-mmap (0 JS heap) or arrayBuffer fallback.
365+
// See loadOldBinary() for why direct mmap of the running binary is impossible.
366+
const { data: oldFile, cleanup: cleanupOldFile } =
367+
await loadOldBinary(oldPath);
255368

256369
// Streaming output: write directly to disk, no output buffer in memory
257370
const writer = Bun.file(destPath).writer();
@@ -300,7 +413,11 @@ export async function applyPatch(
300413
oldpos += seekBy;
301414
}
302415
} finally {
303-
await writer.end();
416+
try {
417+
await writer.end();
418+
} finally {
419+
cleanupOldFile();
420+
}
304421
}
305422

306423
// Validate output size matches header

src/lib/delta-upgrade.ts

Lines changed: 65 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,8 @@
1919

2020
import { unlinkSync } from "node:fs";
2121

22+
import { captureException } from "@sentry/bun";
23+
2224
import {
2325
GITHUB_RELEASES_URL,
2426
getPlatformBinaryName,
@@ -34,6 +36,7 @@ import {
3436
type OciManifest,
3537
} from "./ghcr.js";
3638
import { logger } from "./logger.js";
39+
import { withTracingSpan } from "./telemetry.js";
3740

3841
/** Scoped logger for delta upgrade operations */
3942
const log = logger.withTag("delta-upgrade");
@@ -563,29 +566,76 @@ export async function resolveNightlyChain(
563566
* @param destPath - Path to write the patched binary
564567
* @returns Delta result with SHA-256 and size info, or null if delta is unavailable
565568
*/
566-
export async function attemptDeltaUpgrade(
569+
export function attemptDeltaUpgrade(
567570
targetVersion: string,
568571
oldBinaryPath: string,
569572
destPath: string
570573
): Promise<DeltaResult | null> {
571574
if (!canAttemptDelta(targetVersion)) {
572-
return null;
575+
return Promise.resolve(null);
573576
}
574577

575-
log.debug(`Attempting delta upgrade from ${CLI_VERSION} to ${targetVersion}`);
578+
const channel = isNightlyVersion(targetVersion) ? "nightly" : "stable";
576579

577-
try {
578-
if (isNightlyVersion(targetVersion)) {
579-
return await resolveNightlyDelta(targetVersion, oldBinaryPath, destPath);
580-
}
581-
return await resolveStableDelta(targetVersion, oldBinaryPath, destPath);
582-
} catch (error) {
583-
// Any error during delta upgrade → fall back to full download.
584-
// Log at debug so --verbose reveals the root cause (ETXTBSY, network, etc.)
585-
const msg = error instanceof Error ? error.message : String(error);
586-
log.debug(`Delta upgrade failed (${msg}), falling back to full download`);
587-
return null;
588-
}
580+
return withTracingSpan(
581+
"delta-upgrade",
582+
"upgrade.delta",
583+
async (span) => {
584+
span.setAttribute("delta.from_version", CLI_VERSION);
585+
span.setAttribute("delta.to_version", targetVersion);
586+
span.setAttribute("delta.channel", channel);
587+
588+
log.debug(
589+
`Attempting delta upgrade from ${CLI_VERSION} to ${targetVersion}`
590+
);
591+
592+
try {
593+
const result =
594+
channel === "nightly"
595+
? await resolveNightlyDelta(targetVersion, oldBinaryPath, destPath)
596+
: await resolveStableDelta(targetVersion, oldBinaryPath, destPath);
597+
598+
if (result) {
599+
span.setAttribute("delta.patch_bytes", result.patchBytes);
600+
span.setAttribute("delta.sha256", result.sha256.slice(0, 12));
601+
span.setStatus({ code: 1 }); // OK
602+
} else {
603+
// No patch available — not an error, just unavailable
604+
span.setAttribute("delta.result", "unavailable");
605+
span.setStatus({ code: 1 }); // OK — graceful fallback
606+
}
607+
return result;
608+
} catch (error) {
609+
// Record the error in Sentry so we can see delta failures in telemetry.
610+
// Marked non-fatal: the upgrade continues via full download.
611+
captureException(error, {
612+
level: "warning",
613+
tags: {
614+
"delta.from_version": CLI_VERSION,
615+
"delta.to_version": targetVersion,
616+
"delta.channel": channel,
617+
},
618+
contexts: {
619+
delta_upgrade: {
620+
from_version: CLI_VERSION,
621+
to_version: targetVersion,
622+
channel,
623+
old_binary_path: oldBinaryPath,
624+
},
625+
},
626+
});
627+
628+
const msg = error instanceof Error ? error.message : String(error);
629+
log.warn(
630+
`Delta upgrade failed (${msg}), falling back to full download`
631+
);
632+
span.setAttribute("delta.result", "error");
633+
span.setAttribute("delta.error", msg);
634+
return null;
635+
}
636+
},
637+
{ "delta.channel": channel }
638+
);
589639
}
590640

591641
/**

0 commit comments

Comments
 (0)