Description
Digging through the code turned up some spots worth investigating further:
1. SIMD Binary Detection (High Impact, Low Effort)
Files: simd.zig, parallel_walker.zig

Problem (parallel_walker.zig:446-450):

    for (data[0..check_len]) |byte| {
        if (byte == 0) return; // Byte-by-byte for 8KB!
    }

Solution: Add containsNul() using SIMD vectors to check 16-32 bytes at a time.

Expected gain: 10-15%
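A minimal sketch of what containsNul() could look like, assuming a fixed 16-byte vector width and a plain scalar loop for the tail. The signature and the lane count are assumptions for illustration, not what simd.zig currently exposes:

```zig
const std = @import("std");

/// Hypothetical helper: returns true if `data` contains at least one NUL byte.
pub fn containsNul(data: []const u8) bool {
    const lanes = 16; // assumed width; a wider vector may suit AVX2 targets
    const Vec = @Vector(lanes, u8);
    const zeros: Vec = @splat(0);

    var i: usize = 0;
    // Compare 16 bytes per iteration against zero.
    while (i + lanes <= data.len) : (i += lanes) {
        const chunk: Vec = data[i..][0..lanes].*;
        if (@reduce(.Or, chunk == zeros)) return true;
    }
    // Scalar fallback for the last (len % 16) bytes.
    for (data[i..]) |byte| {
        if (byte == 0) return true;
    }
    return false;
}
```

The call site would then become something like `if (containsNul(data[0..check_len])) return;`.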
2. Eliminate Path Allocations (High Impact, Medium Effort)
File: parallel_walker.zig

Problem (line 393):

    const full_path = std.fs.path.join(alloc, &.{ work.path, entry.name }) catch continue;

Every file/directory entry causes a heap allocation.

Solution: Use a stack-allocated path buffer with FixedBufferAllocator:

    var path_buf: [std.fs.max_path_bytes]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&path_buf);
    // Use fba.allocator() for path joins, reset between iterations

Expected gain: 10-15%
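A rough sketch of the stack-buffer join. `joinInto` and `totalJoinedLen` are hypothetical names used only for illustration. Re-initializing the FixedBufferAllocator per entry has the same effect as calling reset() between iterations:

```zig
const std = @import("std");

/// Hypothetical helper: joins `dir` and `name` inside `buf` instead of on the heap.
/// The returned slice points into `buf` and is only valid until `buf` is reused.
fn joinInto(buf: []u8, dir: []const u8, name: []const u8) ![]u8 {
    var fba = std.heap.FixedBufferAllocator.init(buf);
    return std.fs.path.join(fba.allocator(), &.{ dir, name });
}

/// Demo loop: one stack buffer is reused for every entry, so nothing is heap-allocated.
fn totalJoinedLen(dir: []const u8, names: []const []const u8) usize {
    var path_buf: [std.fs.max_path_bytes]u8 = undefined;
    var total: usize = 0;
    for (names) |name| {
        // Paths that do not fit are skipped, mirroring the original `catch continue`.
        const full_path = joinInto(&path_buf, dir, name) catch continue;
        total += full_path.len;
    }
    return total;
}
```

One caveat: a path that outlives the iteration (for example a subdirectory pushed onto a work deque) would still need its own copy, so this probably only removes the allocation for entries that are processed immediately.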
3. Gitignore Pattern Caching (High Impact, Medium Effort)
Files: gitignore.zig, parallel_walker.zig

Problem (parallel_walker.zig:374-376, 235-299): loadParentGitignores() walks up the directory tree and re-parses .gitignore files for EVERY directory processed.

Solution: Create a thread-safe GitignoreCache:

- Cache parsed gitignore patterns per directory
- Store parent inheritance chain
- Mutex-protected lookup with lock-free reads for cached entries

    pub const GitignoreCache = struct {
        mutex: std.Thread.Mutex,
        cache: std.StringHashMapUnmanaged(CachedIgnoreState),

        pub fn getOrCreate(self: *GitignoreCache, dir_path: []const u8) !*CachedIgnoreState {
            // Fast path: check cache
            // Slow path: lock, load parents, parse .gitignore, cache
        }
    };

Expected gain: 15-20%
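To make the cache idea more concrete, here is a simplified sketch. It differs from the outline above in two ways, both assumptions on my part. First, it takes the mutex on every lookup, because StringHashMapUnmanaged is not safe to read while another thread inserts, so a lock-free fast path would need a different structure. Second, it stores heap-allocated entries, so the returned pointer stays valid when the map grows. The `CachedIgnoreState` fields are placeholders, and the actual .gitignore parsing is omitted:

```zig
const std = @import("std");

/// Placeholder for the parsed state; the real fields would come from gitignore.zig.
pub const CachedIgnoreState = struct {
    patterns: []const []const u8 = &.{},
    parent: ?*CachedIgnoreState = null,
};

pub const GitignoreCache = struct {
    allocator: std.mem.Allocator,
    mutex: std.Thread.Mutex = .{},
    // Values are pointers so entries do not move when the map rehashes.
    cache: std.StringHashMapUnmanaged(*CachedIgnoreState) = .{},

    pub fn getOrCreate(self: *GitignoreCache, dir_path: []const u8) !*CachedIgnoreState {
        self.mutex.lock();
        defer self.mutex.unlock();

        // Already parsed for this directory: reuse it.
        if (self.cache.get(dir_path)) |state| return state;

        // First visit: the cache owns a copy of the key and the new entry.
        const key = try self.allocator.dupe(u8, dir_path);
        errdefer self.allocator.free(key);
        const state = try self.allocator.create(CachedIgnoreState);
        errdefer self.allocator.destroy(state);
        state.* = .{}; // real code would parse dir_path/.gitignore and link the parent here
        try self.cache.put(self.allocator, key, state);
        return state;
    }

    pub fn deinit(self: *GitignoreCache) void {
        var it = self.cache.iterator();
        while (it.next()) |entry| {
            self.allocator.destroy(entry.value_ptr.*);
            self.allocator.free(entry.key_ptr.*);
        }
        self.cache.deinit(self.allocator);
    }
};
```

Holding the lock while parsing is the simplest correct option. If contention shows up in profiles, parsing outside the lock and re-checking before insert would be one thing to try.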
4. Better Initial Work Distribution (Medium Impact, Low Effort)
File: parallel_walker.zig

Problem (lines 136-154): Single root path means only thread 0 gets initial work; others must steal.

Solution: Pre-scan root directory to distribute subdirectories across all threads:

    if (path_idx == 1 and self.num_threads > 1) {
        // Open root dir, distribute subdirs to all thread deques round-robin
    }

Expected gain: 5-10%
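The round-robin part is small. A sketch of just the assignment step, assuming the root's subdirectories have already been listed; `assignRoundRobin` is a hypothetical name, and pushing onto the per-thread deques is left out:

```zig
const std = @import("std");

/// Hypothetical helper: `owners[i]` receives the worker index for the i-th
/// root subdirectory, cycling through workers 0..num_threads-1.
fn assignRoundRobin(owners: []usize, num_threads: usize) void {
    std.debug.assert(num_threads > 0);
    for (owners, 0..) |*owner, i| {
        owner.* = i % num_threads;
    }
}
```

With 5 subdirectories and 3 threads this yields owners 0, 1, 2, 0, 1. Subtrees are rarely the same size, so work stealing would still be needed to even things out afterwards.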
5. SIMD Case-Insensitive Search (Medium Impact, High Effort)
Files: simd.zig, matcher.zig

Problem (matcher.zig:128-147): std.ascii.toLower() called per byte.

Solution: SIMD case-folding using bit manipulation:

    // ASCII: uppercase = lowercase ^ 0x20 (for letters only)
    const case_bit: Vec = @splat(0x20);
    // Check if in A-Z range, then OR with case_bit

Expected gain: 10-20% for -i searches (not applicable to current benchmark)
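A sketch of the case-folding step on its own, assuming 16-byte vectors. `toLowerVec` and `lowerInPlace` are hypothetical names. The range check uses a wrapping subtract so that a single comparison covers A-Z:

```zig
const std = @import("std");

const lanes = 16; // assumed vector width
const Vec = @Vector(lanes, u8);

/// Lowercases ASCII A-Z in a 16-byte chunk; every other byte passes through.
fn toLowerVec(chunk: Vec) Vec {
    const upper_a: Vec = @splat('A');
    const letter_count: Vec = @splat(26);
    const case_bit: Vec = @splat(0x20);
    // (byte -% 'A') < 26 holds only for 'A'...'Z'; other bytes wrap out of range.
    const is_upper = (chunk -% upper_a) < letter_count;
    return @select(u8, is_upper, chunk | case_bit, chunk);
}

/// Hypothetical helper: folds a whole buffer, 16 bytes at a time.
pub fn lowerInPlace(buf: []u8) void {
    var i: usize = 0;
    while (i + lanes <= buf.len) : (i += lanes) {
        const chunk: Vec = buf[i..][0..lanes].*;
        buf[i..][0..lanes].* = toLowerVec(chunk);
    }
    // Scalar fallback for the remaining bytes.
    for (buf[i..]) |*b| b.* = std.ascii.toLower(b.*);
}
```

For the matcher it would probably be better to fold each chunk in registers during the comparison rather than rewriting the haystack, but the folding logic would be the same.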
| Step | Optimization | Files | Est. Gain |
|---|---|---|---|
| 1 | SIMD binary detection | simd.zig, parallel_walker.zig | 10-15% |
| 2 | Stack-allocated path buffers | parallel_walker.zig | 10-15% |
| 3 | Gitignore caching | gitignore.zig, parallel_walker.zig | 15-20% |
| 4 | Initial work distribution | parallel_walker.zig | 5-10% |
| 5 | SIMD case-insensitive search | simd.zig, matcher.zig | 10-20% |
Conservative total estimate: 35-45% improvement

Note: Cross-platform implementation only (no Linux-specific syscall optimizations)
| File | Changes |
|---|---|
| src/simd.zig | Add containsNul(), optionally findSubstringIgnoreCase() |
| src/parallel_walker.zig | Path buffer optimization, gitignore cache integration, work distribution, use SIMD binary detection |
| src/gitignore.zig | Add GitignoreCache struct |
| src/matcher.zig | Integrate SIMD case-insensitive (if implementing) |
A second investigation found similar issues, plus a few others:
Gitignore Pattern Caching (Medium Impact)

Problem: loadParentGitignores() in parallel_walker.zig:235-299 walks up the directory tree for every work item.

Fix:

- Use a thread-safe cache keyed by directory path
- Only load each .gitignore once
Optimize ** Pattern Matching (Medium Impact)

Problem: globMatch() in gitignore.zig:72-162 uses recursion for ** patterns (line 92).

Fix:

- Convert to iterative approach
- Cache compiled pattern state
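One possible shape for the iterative version: keep two backtrack points, one for the last `*` and one for the last `**/`, and rewind to them on a mismatch instead of recursing. This is a simplified sketch, not a drop-in replacement for globMatch(). It only handles literals, `*` (which never crosses `/`), `**/` and a trailing `**`, and it leaves out `?`, character classes, escapes and basename matching. `globMatchIter` is a hypothetical name:

```zig
const std = @import("std");

/// Simplified iterative glob match (see the assumptions in the text above).
pub fn globMatchIter(pattern: []const u8, text: []const u8) bool {
    var p: usize = 0; // index into pattern
    var t: usize = 0; // index into text
    var star_p: ?usize = null; // pattern position after the last `*`
    var star_t: usize = 0; // text position where that `*` currently ends
    var dstar_p: ?usize = null; // pattern position after the last `**/`
    var dstar_t: usize = 0; // text position where that `**/` currently ends

    while (t < text.len) {
        if (p < pattern.len and pattern[p] == '*') {
            if (p + 1 < pattern.len and pattern[p + 1] == '*') {
                p += 2;
                if (p == pattern.len) return true; // trailing `**` matches the rest
                if (pattern[p] != '/') return false; // only `**/` is handled here
                p += 1;
                star_p = null; // a new `**/` supersedes any pending `*`
                dstar_p = p;
                dstar_t = t;
                continue;
            }
            p += 1;
            star_p = p;
            star_t = t;
            continue;
        }
        if (p < pattern.len and pattern[p] == text[t]) {
            p += 1;
            t += 1;
            continue;
        }
        // Mismatch: let the last `*` consume one more byte, but never a '/'.
        if (star_p) |sp| {
            if (text[star_t] != '/') {
                star_t += 1;
                t = star_t;
                p = sp;
                continue;
            }
        }
        // Otherwise let the last `**/` consume one more whole path component.
        if (dstar_p) |dp| {
            while (dstar_t < text.len and text[dstar_t] != '/') dstar_t += 1;
            if (dstar_t < text.len) {
                dstar_t += 1; // step past the '/'
                t = dstar_t;
                p = dp;
                star_p = null;
                continue;
            }
        }
        return false;
    }
    // Text is consumed: only trailing stars may remain in the pattern.
    while (p < pattern.len and pattern[p] == '*') p += 1;
    return p == pattern.len;
}
```

For example, `globMatchIter("src/**/*.zig", "src/a/b/walker.zig")` returns true, while `globMatchIter("*.c", "src/main.c")` returns false because `*` stops at the `/`.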
Reduce Memory Allocation (Low Impact)

Problem: _platform_memmove (2.7%) and allocation overhead visible in trace.

Fix:

- Pre-allocate FileBuffer with reasonable initial capacity
- Reuse buffers across files
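A sketch of buffer reuse, using a hypothetical `ReusableBuffer` as a stand-in for FileBuffer (whose real layout is not shown here). The storage only grows when a file is larger than anything seen so far. It is freed and re-allocated rather than realloc'd, since the old contents are not needed and copying them would add memmove work:

```zig
const std = @import("std");

/// Hypothetical stand-in for FileBuffer: one allocation reused across files.
const ReusableBuffer = struct {
    allocator: std.mem.Allocator,
    storage: []u8 = &[_]u8{},

    /// Returns a slice of exactly `len` bytes, growing the storage only if needed.
    fn acquire(self: *ReusableBuffer, len: usize) ![]u8 {
        if (len > self.storage.len) {
            // Old contents are not needed, so free first instead of copying.
            self.allocator.free(self.storage);
            self.storage = &[_]u8{}; // stay valid if the alloc below fails
            self.storage = try self.allocator.alloc(u8, len);
        }
        return self.storage[0..len];
    }

    fn deinit(self: *ReusableBuffer) void {
        self.allocator.free(self.storage);
    }
};
```

Each worker would own one of these, so no locking is involved. Calling `acquire` once with a reasonable initial size at worker startup would cover the pre-allocation bullet.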
| Priority | File | Changes |
|---|---|---|
| P0 | src/parallel_walker.zig | Add per-worker output buffers |
| P0 | src/output.zig | Add bulk flush API |
| P1 | src/gitignore.zig | Pattern caching, iterative ** |
| P2 | src/reader.zig | Buffer reuse |