Memory usage was reduced significantly in c09579a.
The per-thread `ArenaAllocator`s were accumulating memory throughout execution, since arena allocations are never individually freed. Adding `arena.reset(.retain_capacity)` after processing each directory reclaims temporary allocations (paths, gitignore state) while keeping the backing pages to avoid syscall overhead.
| State | Peak Memory | vs Baseline | vs ripgrep |
|---|---|---|---|
| Baseline (before) | ~252 MB | - | 16x more |
| After arena reset | ~40-48 MB | ~5x reduction | ~3x more |
| ripgrep | ~16 MB | - | - |
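For reference, the per-directory reset pattern described above looks roughly like the sketch below. It is not the code from c09579a: `processDirectory` and `walk` are hypothetical names, and the path join only stands in for whatever temporary data the real walker allocates.

```zig
const std = @import("std");

/// Hypothetical stand-in for per-directory work: allocates scratch data
/// (here, a joined path) from the arena and never frees it individually.
fn processDirectory(scratch: std.mem.Allocator, dir: []const u8) !usize {
    const path = try std.fs.path.join(scratch, &.{ dir, "example.txt" });
    return path.len;
}

/// Processes `dirs` with a single arena, resetting it after each directory.
/// Returns the total length of the paths built, just to have a visible result.
fn walk(backing: std.mem.Allocator, dirs: []const []const u8) !usize {
    var arena = std.heap.ArenaAllocator.init(backing);
    defer arena.deinit();

    var total: usize = 0;
    for (dirs) |dir| {
        total += try processDirectory(arena.allocator(), dir);
        // Discard everything allocated for this directory, but keep the
        // arena's backing memory so the next iteration can reuse it.
        _ = arena.reset(.retain_capacity);
    }
    return total;
}
```

Without the `reset` call, each iteration's scratch data would stay alive until `arena.deinit()`, which matches the growth seen in the baseline.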
Some things that were explored that didn't help:
- GPA for WorkItems - Added lock contention, similar memory, slightly slower
- Inline path buffer - Still rounds up to page size, no improvement
- Lower mmap threshold - Already optimized in 1434a83; lowering it further would probably hurt performance (the optimal thread count could still be investigated in more cases)
Some other things to explore based on the remaining ~3x gap vs ripgrep:
- WorkItem allocations: Using page_allocator (4KB minimum per allocation) for ~6000 directories
- Work-stealing deque buffers: Pre-allocated per thread.
These could be optimized with a custom memory pool, but the complexity wasn't worth it for a first pass. This and other options still need further investigation.
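As a starting point for the memory pool idea, here is a minimal sketch using `std.heap.MemoryPool`, assuming the API as it exists in Zig 0.11 through 0.14. The `WorkItem` fields and the `churn` function are hypothetical, since the real struct is not shown in this issue. The pool is not thread-safe, so this assumes one pool per worker thread, which would also sidestep the lock contention seen with the GPA.

```zig
const std = @import("std");

/// Hypothetical WorkItem layout; the real struct is not shown in this issue.
const WorkItem = struct {
    depth: usize,
    path_len: usize,
};

const WorkItemPool = std.heap.MemoryPool(WorkItem);

/// Creates and releases `count` work items through a pool.
/// Destroyed items go onto the pool's free list and are handed out again
/// by later `create` calls, instead of each item costing a separate
/// page-sized allocation from the backing allocator.
fn churn(backing: std.mem.Allocator, count: usize) !usize {
    var pool = WorkItemPool.init(backing);
    defer pool.deinit();

    var processed: usize = 0;
    var i: usize = 0;
    while (i < count) : (i += 1) {
        const item = try pool.create();
        item.* = .{ .depth = i, .path_len = 0 };
        processed += 1;
        pool.destroy(item);
    }
    return processed;
}
```

Whether this closes any of the remaining gap depends on how many work items are live at once, so it would need measuring before being adopted.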