miss optimization + validation: elastic hash beats abseil on every operation #6
Open
joshuaisaact wants to merge 7 commits into autoresearch/cross-language-bench from
Approaches to explore: branch-hinted early termination, conditional probing, per-bucket max probe depth, Bloom filters, tombstone-free deletion, and Robin Hood displacement tracking. Previous early-termination attempts regressed hits -- this program specifically guards against that.
Add early termination in tier-0 get() via a matchEmpty check with @branchHint(.cold). The branch predictor learns that hits never take the early exit, making the check nearly free on the hot path. Previous attempts without the cold hint regressed hits by 10-30%.

Before (50% load, 1M keys, unshuffled): hit=8,404 miss=12,343
After: hit=5,040 miss=2,688
Abseil: hit=9,027 miss=3,047

Misses went from 4x slower than abseil to 13% faster. Also adds per-bucket max_probe_depth tracking (unused by get() for now but available for future experiments) and an autobench-miss benchmark.
With the matchEmpty miss optimization, elastic hash now ranks #1 on hit lookup, miss lookup, insert, and delete at 50% load against abseil, Rust hashbrown+ahash, Go swiss.Map, and Go builtin map. Abseil only wins on misses at 75%+ load where tier 0 fills up.
At 50% load, churning through the entire table does not degrade hit or miss performance. Tombstones from deletes get reused by subsequent inserts, keeping the empty slot ratio stable. matchEmpty early termination remains effective under sustained mutation.
40% hit / 40% miss / 10% insert / 10% delete, 1M ops on pre-filled table. Elastic hash sustains 50M ops/sec vs abseil's 25M at 50% load. Advantage holds at 25% (2.2x) and 75% (1.7x).
…h longer keys

Memory: elastic hash uses 1.00x abseil's memory at all capacities. The tiered layout distributes slots across tiers, but the total slot count is the same.

Key lengths: elastic hash is tied at 16 bytes, 1.33x faster at 8 bytes, and scales to 2x faster at 256 bytes. Fingerprint pre-filtering avoids expensive key comparisons on false-positive hash matches.
Why
Elastic hash's only major weakness was miss lookups -- 2-6x slower than abseil. Five previous attempts to fix this were reverted because they regressed hits by 10-30%. Without competitive miss performance, the hash table couldn't be recommended for real workloads.
What
The fix: 3 lines

Added a matchEmpty check per probe in tier-0 get() with @branchHint(.cold). The cold hint tells the branch predictor this branch is almost never taken. For hits, the prediction is always correct (~0 cost). For misses, empty slots terminate the search early.

This is the same approach tried and reverted 5 times. The only difference is the compiler hint.
Validation suite
Ran four tests to verify the fix holds under realistic conditions:
1. Tombstone churn -- 500K delete/insert cycles at 50% load. No degradation. Tombstones get recycled by inserts.
2. Mixed workload -- 40% hit, 40% miss, 10% insert, 10% delete. Elastic hash: 50M ops/sec vs abseil's 25M. 2x faster.
3. Memory overhead -- Identical to abseil (1.00x) at every capacity.
4. Variable key lengths -- Advantage grows from 1.3x at 8 bytes to 2x at 256 bytes.
Results (M4, 1M elements, 50% load)
Elastic hash is now #1 on every operation at 50% load against abseil, Rust hashbrown+ahash, Go swiss.Map, and Go builtin map.
Where abseil still wins

Misses at 75%+ load, where tier 0 fills up.
References
results-m4-final.md
miss-optimization-summary.md
program-v6-miss-optimization.md