I don't question the decrease in memory usage by this MR, I just don't really understand how each part affects it in isolation. It also remains a question for me how aggressively we need to reclaim memory, and how the memory usage profile changes between before and after this change when under memory pressure.
noted
I'll provide traces before/after each commit so we can make informed decisions.
I'd argue we can be pretty aggressive in reclaiming memory as long as the impact on speed is limited.
Do you have ideas on how we can test that? Any tools that could simulate memory pressure to see how unblob behaves?
When we first experimented with mmap, I don't recall if I tested it only in a memory-constrained docker container or in a VM. I was looking at two things: that scanning through a file bigger than the available memory works, and that memory gets reclaimed from unblob when a process external to unblob puts additional pressure on the kernel.
When a memory area is madvised with DONTNEED, the RSS usage drops. In the case of mmap'ed memory backed by a file, the kernel can "reclaim" the memory under memory pressure anyway, so this is not what causes OOM errors. In this particular case, however, the process is running in a docker container with a memory limit. When a docker container (cgroup) memory limit is reached, the kernel tries to reclaim and only goes into OOM if it still fails after the reclaim. So in normal cases DONTNEED is just a hint (advice) and should have no actual impact: visibly the RSS drops, but from a kernel standpoint it does not matter. On the other hand, the RSS might matter in the docker memory limit calculation, although the reclaim should handle it.
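The RSS drop described above is easy to observe. Here is a minimal, self-contained sketch (Linux only, Python >= 3.8, not unblob's code) that maps a file, faults its pages in, then issues `MADV_DONTNEED` and reads `VmRSS` from `/proc` at each step:

```python
import mmap
import tempfile

def rss_kib() -> int:
    """Resident set size of the current process in KiB (Linux only)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found in /proc/self/status")

with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"x" * (64 * 1024 * 1024))  # 64 MiB of real data
    tmp.flush()
    mm = mmap.mmap(tmp.fileno(), 0, access=mmap.ACCESS_READ)

    before = rss_kib()
    for off in range(0, len(mm), mmap.PAGESIZE):
        mm[off]  # touch one byte per page to fault it in
    faulted = rss_kib()

    # DONTNEED drops the pages from our RSS immediately; under memory
    # pressure the kernel could have reclaimed these file-backed pages
    # on its own anyway.
    mm.madvise(mmap.MADV_DONTNEED)
    after = rss_kib()
    mm.close()

print(f"RSS before={before} faulted={faulted} after DONTNEED={after} (KiB)")
```

Faulting in the 64 MiB mapping should grow RSS by roughly that amount, and the `MADV_DONTNEED` call should give most of it back.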
On the other hand, calling madvise too many times could be slow. I would look at what exactly happens in a docker container + memory limit scenario with mmap'ed files.
Would be nice to double-check that the files are indeed mapped read-only and shared
yes, the time impact will be measured. When I talked with @vlaci I explained that the regained syscalls by removing
that's the plan, the objective is to see if we can actually reproduce OOM
I have been misled by memray that does not distinguish between

I'll keep:
I've been using memray to perform memory profiling, let's keep it in our dependencies as a custom profiling dependency.
By default, an mmap'ed file uses the MADV_NORMAL policy, which applies moderate read-ahead but is conservative about reclaiming pages. This may cause issues in memory-constrained environments. With MADV_SEQUENTIAL [0], the kernel:
- reads ahead more aggressively, reducing page-fault stalls
- reclaims already-scanned pages sooner under memory pressure

[0] https://man7.org/linux/man-pages/man2/madvise.2.html
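As a rough illustration, applying that advice from Python looks like the sketch below (a hypothetical helper, not unblob's actual `File` implementation; requires Linux and Python >= 3.8):

```python
import mmap

def open_sequential(path: str) -> mmap.mmap:
    """Map a file read-only and hint sequential access to the kernel.

    MADV_SEQUENTIAL makes the kernel read ahead more aggressively and
    lets it free already-read pages sooner under memory pressure.
    """
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed:
        # CPython duplicates the descriptor internally.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    mm.madvise(mmap.MADV_SEQUENTIAL)
    return mm
```

The advice is purely a hint: reads through the returned mapping behave exactly as with `MADV_NORMAL`, only the kernel's read-ahead and reclaim heuristics change.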
We never append to the backing file, so returning `len(self)` (the size of the mapping) is always correct. If we were appending to the file, we couldn't access the out-of-mapping range anyway. The main hot path is `stream_scan_chunks`, which calls `file.size()` on every loop iteration, once per `DEFAULT_BUFSIZE` (64 KiB) slice. For a 7 GiB file that is ~114 000 avoided `fstat()` syscalls per scan.
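The idea can be sketched as follows (hypothetical class name, unblob's real `File` differs):

```python
import mmap

class MappedFile:
    """Hypothetical mmap-backed, read-only file object."""

    def __init__(self, path: str):
        with open(path, "rb") as f:
            self._mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    def size(self) -> int:
        # Before: os.fstat(fd).st_size, i.e. one syscall per call, and
        # the scan loop calls size() once per 64 KiB slice
        # (7 GiB / 64 KiB = 114,688 calls for a single large file).
        # After: the mapping length is fixed at construction time,
        # so no syscall is needed at all.
        return len(self._mm)
```

Since the mapping cannot grow, the cached length and the `fstat` result can only disagree if someone else appends to the file, and those extra bytes would be outside the mapping anyway.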
This branch contains multiple optimizations around unblob's memory consumption, specifically RSS usage when used on Linux.
unblob makes heavy use of mmap'ed files through the `File` class but always defaults to the `MADV_NORMAL` policy. This policy applies moderate read-ahead and is conservative about reclaiming pages, which can cause RSS to grow to the full size of large input files. While this is generally okay on analysts' machines, it can become problematic in constrained environments where parallel extractions compete for memory, at the risk of being killed by the OOM killer.
We address this in two ways:
- `MADV_SEQUENTIAL` when scanning files with HyperScan. Since we expect pages to be read in sequential order, pages in the given range can be aggressively read ahead, and may be freed soon after they are accessed.
- `MADV_DONTNEED` on pages that were just read through `File.read()`. It's a generic approach, so every unblob component reading from a `File` can take advantage of it.

On top of that, we noticed a syscall optimization around `File.size()`, avoiding millions of `fstat` syscalls per run on complex firmware.

On a 7GB gzip multi-chunk stream containing a 14GB extfs filesystem, the initial consumption observed by memray follows this trend:
With the modifications in this branch, we end up with this trend instead:
Since memray was used to perform memory profiling, I added it as a `profiling` dev dependency.

I would also like to point out that these optimizations were already spotted years ago. At the time we chose to move to streaming mode but never implemented the `madvise` calls. See #477 (comment)
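For reference, the generic "drop pages behind the read cursor" pattern mentioned above can be sketched like this (a hypothetical wrapper, not the actual `File.read()` implementation; Linux, Python >= 3.8):

```python
import mmap

class DroppingReader:
    """Hypothetical reader that discards already-read pages.

    After each read, pages fully behind the cursor are madvised with
    MADV_DONTNEED so they stop counting towards our RSS. Because the
    mapping is file-backed, re-reading them would simply fault the
    data back in, so correctness is unaffected.
    """

    def __init__(self, mm: mmap.mmap):
        self._mm = mm
        self._pos = 0

    def read(self, n: int) -> bytes:
        data = self._mm[self._pos : self._pos + n]
        self._pos += len(data)
        # madvise operates on whole pages; round down to a page boundary.
        boundary = self._pos - (self._pos % mmap.PAGESIZE)
        if boundary > 0:
            self._mm.madvise(mmap.MADV_DONTNEED, 0, boundary)
        return data
```

Issuing one `madvise` per read is the simple version; as discussed earlier in the thread, too many `madvise` calls could be slow, so batching the advice over larger windows is a plausible refinement.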