Commit 946ecd3

perf(upgrade): use copy-then-mmap for zero JS heap during delta patching
The old file (the running binary) cannot be mmap'd directly: Bun always opens with O_RDWR, which fails on running executables (macOS: SIGKILL, Linux: ETXTBSY). PR #343 worked around this with arrayBuffer(), at the cost of ~100 MB of JS heap.

This change copies the old binary to a temp file first, then mmaps the copy. The copy is a regular file that no process is executing, so mmap succeeds and the ~100 MB is kernel-managed (no GC pressure). On CoW-capable filesystems (btrfs, xfs, APFS) the copy is a metadata-only reflink: near-instant with zero extra disk I/O.

Falls back to arrayBuffer() if the copy or mmap fails for any reason.
1 parent 0da9cc9 commit 946ecd3
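
The commit message relies on the copy being a reflink on CoW-capable filesystems. The diff calls `copyFileSync` without a mode flag, so whether a reflink is attempted is left to the runtime. As a minimal sketch, assuming a Node-compatible `node:fs` implementation, the reflink can be requested explicitly with `constants.COPYFILE_FICLONE`, which is documented to fall back to an ordinary copy when the platform does not support copy-on-write. The helper names `copyPreferReflink` and `demoCopy` are hypothetical and not part of the commit.

```typescript
import {
  constants,
  copyFileSync,
  mkdtempSync,
  readFileSync,
  rmSync,
  writeFileSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/**
 * Hypothetical helper (not part of the commit): copy `src` to `dest`,
 * asking the runtime to attempt a copy-on-write reflink first.
 * COPYFILE_FICLONE falls back to a regular copy when reflinks are
 * unsupported, so the call is expected to succeed either way.
 */
function copyPreferReflink(src: string, dest: string): void {
  copyFileSync(src, dest, constants.COPYFILE_FICLONE);
}

/** Demo: copy a small file inside a private temp directory and read it back. */
function demoCopy(): string {
  const dir = mkdtempSync(join(tmpdir(), "reflink-demo-"));
  try {
    const src = join(dir, "old.bin");
    const dest = join(dir, "old-copy.bin");
    writeFileSync(src, "old binary bytes");
    copyPreferReflink(src, dest);
    return readFileSync(dest, "utf8");
  } finally {
    // Remove the temp directory and everything in it.
    rmSync(dir, { recursive: true, force: true });
  }
}
```

The demo also uses `mkdtempSync` to get a uniquely named private directory, which is one way to avoid a predictable file name in a shared temp directory; whether that matters for this upgrade path is a judgment call for the author.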

1 file changed, +77 -15 lines changed

src/lib/bspatch.ts

Lines changed: 77 additions & 15 deletions
@@ -5,15 +5,20 @@
  * TRDIFF10 format (produced by zig-bsdiff with `--use-zstd`). Designed for
  * minimal memory usage during CLI self-upgrades:
  *
- * - Old binary: loaded via `Bun.file().arrayBuffer()` (~100 MB heap)
+ * - Old binary: copy-then-mmap for 0 JS heap (CoW on btrfs/xfs/APFS),
+ *   falling back to `arrayBuffer()` (~100 MB heap) if mmap fails
  * - Diff/extra blocks: streamed via `DecompressionStream('zstd')`
  * - Output: written incrementally to disk via `Bun.file().writer()`
  * - Integrity: SHA-256 computed inline via `Bun.CryptoHasher`
  *
- * Total heap usage: ~100 MB for old file + ~1-2 MB for streaming buffers.
- * `Bun.mmap()` is NOT usable here because the old file is the running binary:
- * - macOS: AMFI sends uncatchable SIGKILL (PROT_WRITE on signed Mach-O)
- * - Linux: ETXTBSY from `open()` with write flags on a running executable
+ * `Bun.mmap()` cannot target the running binary directly because it opens
+ * with PROT_WRITE/O_RDWR:
+ * - macOS: AMFI sends uncatchable SIGKILL (writable mapping on signed Mach-O)
+ * - Linux: ETXTBSY from `open()` (kernel blocks write-open on running ELF)
+ *
+ * The copy-then-mmap strategy sidesteps both: the copy is a regular file
+ * with no running process, so mmap succeeds. On CoW-capable filesystems
+ * (btrfs, xfs, APFS) the copy is near-instant with zero extra disk I/O.
  *
  * TRDIFF10 format (from zig-bsdiff):
  * ```
@@ -25,6 +30,10 @@
  * ```
  */
 
+import { copyFileSync, unlinkSync } from "node:fs";
+import { tmpdir } from "node:os";
+import { join } from "node:path";
+
 /** TRDIFF10 header magic bytes */
 const TRDIFF10_MAGIC = "TRDIFF10";
 
@@ -210,12 +219,63 @@ function createZstdStreamReader(compressed: Uint8Array): BufferedStreamReader {
   );
 }
 
+/** Result of loading the old binary for patching */
+type OldFileHandle = {
+  /** Memory-mapped or in-memory view of the old binary */
+  data: Uint8Array;
+  /** Cleanup function to call after patching (removes temp copy, if any) */
+  cleanup: () => void;
+};
+
+/**
+ * Load the old binary for read access during patching.
+ *
+ * Strategy: copy to temp file, then mmap the copy. This avoids `Bun.mmap()`
+ * on the running binary (SIGKILL on macOS, ETXTBSY on Linux) while keeping
+ * zero JS heap — the kernel manages the mapped pages. On CoW filesystems
+ * (btrfs, xfs, APFS) the copy is a metadata-only reflink (near-instant).
+ *
+ * Falls back to `Bun.file().arrayBuffer()` (~100 MB heap) if the copy or
+ * mmap fails for any reason (permissions, disk space, unsupported FS).
+ */
+async function loadOldBinary(oldPath: string): Promise<OldFileHandle> {
+  const tempCopy = join(tmpdir(), `sentry-patch-old-${process.pid}`);
+  try {
+    copyFileSync(oldPath, tempCopy);
+    const data = Bun.mmap(tempCopy, { shared: false });
+    return {
+      data,
+      cleanup: () => {
+        try {
+          unlinkSync(tempCopy);
+        } catch {
+          // Best-effort cleanup — OS will reclaim on reboot
+        }
+      },
+    };
+  } catch {
+    // Copy or mmap failed — fall back to reading into JS heap
+    try {
+      unlinkSync(tempCopy);
+    } catch {
+      // May not exist if copyFileSync failed
+    }
+    return {
+      data: new Uint8Array(await Bun.file(oldPath).arrayBuffer()),
+      cleanup: () => {
+        // No temp file to clean up — data is in JS heap
+      },
+    };
+  }
+}
+
 /**
  * Apply a TRDIFF10 binary patch with streaming I/O for minimal memory usage.
  *
- * Reads the old file into memory via `Bun.file().arrayBuffer()`, then streams
- * diff/extra blocks (~16 KB buffers) via `DecompressionStream('zstd')`,
- * writes output via `Bun.file().writer()`, and computes SHA-256 inline.
+ * Copies the old file to a temp path and mmaps the copy (0 JS heap), falling
+ * back to `arrayBuffer()` if mmap fails. Streams diff/extra blocks via
+ * `DecompressionStream('zstd')`, writes output via `Bun.file().writer()`,
+ * and computes SHA-256 inline.
  *
  * @param oldPath - Path to the existing (old) binary file
  * @param patchData - Complete TRDIFF10 patch file contents
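
The fallback path in `loadOldBinary()` is hard to exercise on a machine where the copy and mmap both succeed. As a rough sketch of the same load-with-fallback pattern, the synchronous variant below takes the mapping step as a parameter, so a test can pass a function that throws. It uses only `node:fs`; `loadWithFallback`, `demoFallback`, and the `mapFile` parameter are hypothetical names standing in for the commit's `Bun.mmap()` call, not part of the change.

```typescript
import {
  copyFileSync,
  existsSync,
  mkdtempSync,
  readFileSync,
  rmSync,
  writeFileSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

type Handle = { data: Uint8Array; cleanup: () => void };

/**
 * Hypothetical synchronous variant of the load-with-fallback pattern.
 * `mapFile` stands in for `Bun.mmap()`; pass a function that throws to
 * exercise the fallback branch.
 */
function loadWithFallback(
  srcPath: string,
  tempCopy: string,
  mapFile: (path: string) => Uint8Array,
): Handle {
  try {
    copyFileSync(srcPath, tempCopy);
    const data = mapFile(tempCopy);
    // force: true ignores a missing file, but other errors still throw.
    return { data, cleanup: () => rmSync(tempCopy, { force: true }) };
  } catch {
    // Copy or mapping failed: drop any partial copy and read directly.
    rmSync(tempCopy, { force: true });
    return { data: new Uint8Array(readFileSync(srcPath)), cleanup: () => {} };
  }
}

/** Demo: force the fallback by passing a mapper that always throws. */
function demoFallback(): { text: string; tempLeftBehind: boolean } {
  const dir = mkdtempSync(join(tmpdir(), "fallback-demo-"));
  try {
    const src = join(dir, "old.bin");
    const copy = join(dir, "old-copy.bin");
    writeFileSync(src, "old");
    const handle = loadWithFallback(src, copy, () => {
      throw new Error("simulated mmap failure");
    });
    const text = new TextDecoder().decode(handle.data);
    handle.cleanup();
    return { text, tempLeftBehind: existsSync(copy) };
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}
```

Injecting the mapper keeps the sketch runnable outside Bun and makes both branches testable; the commit itself calls `Bun.mmap()` directly.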
@@ -246,12 +306,10 @@ export async function applyPatch(
   );
   const extraReader = createZstdStreamReader(patchData.subarray(extraStart));
 
-  // Bun.mmap() is NOT usable for the old file during self-upgrades because
-  // it always opens with PROT_WRITE, and the old file is the running binary:
-  // - macOS: AMFI sends uncatchable SIGKILL on writable mapping of signed Mach-O
-  // - Linux: open() returns ETXTBSY when opening a running executable for write
-  // Reading into memory costs ~100 MB heap but avoids both platform restrictions.
-  const oldFile = new Uint8Array(await Bun.file(oldPath).arrayBuffer());
+  // Load old binary via copy-then-mmap (0 JS heap) or arrayBuffer fallback.
+  // See loadOldBinary() for why direct mmap of the running binary is impossible.
+  const { data: oldFile, cleanup: cleanupOldFile } =
+    await loadOldBinary(oldPath);
 
   // Streaming output: write directly to disk, no output buffer in memory
   const writer = Bun.file(destPath).writer();
@@ -300,7 +358,11 @@ export async function applyPatch(
       oldpos += seekBy;
     }
   } finally {
-    await writer.end();
+    try {
+      await writer.end();
+    } finally {
+      cleanupOldFile();
+    }
   }
 
   // Validate output size matches header
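
The last hunk wraps `writer.end()` in its own `try`/`finally` so that the temp copy is removed even when flushing the output fails. A small sketch of that ordering, with hypothetical stand-ins (`teardown`, `demoTeardown`) for the writer and the cleanup callback:

```typescript
/**
 * Hypothetical stand-in for the patch teardown: `end` plays the role of
 * `writer.end()`, `cleanup` the role of `cleanupOldFile()`.
 */
async function teardown(
  end: () => Promise<void>,
  cleanup: () => void,
): Promise<void> {
  try {
    await end();
  } finally {
    // Runs whether `end` resolved or rejected.
    cleanup();
  }
}

/** Demo: record the order of events when `end` rejects. */
async function demoTeardown(): Promise<string[]> {
  const events: string[] = [];
  try {
    await teardown(
      async () => {
        events.push("end");
        throw new Error("flush failed");
      },
      () => {
        events.push("cleanup");
      },
    );
  } catch (err) {
    // The original error still propagates after cleanup has run.
    events.push(`caught: ${(err as Error).message}`);
  }
  return events;
}
```

Here `demoTeardown()` resolves to `["end", "cleanup", "caught: flush failed"]`: cleanup runs before the error reaches the caller, and the error is not swallowed.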