elastic hash: generic library with full API, 1.7x faster than abseil #7
Open
joshuaisaact wants to merge 220 commits into main from
Conversation
…for Go. Rust+ahash is now the closest competitor (1.46x at 50%, 1.09x at 99%). Go swiss.Map improved from 6.27x to 2.28x with pre-allocated strings. Elastic hash is still fastest at every load factor against every competitor.
Disproves the cache-density hypothesis. On M4's 16MB L2, where both fingerprint arrays fit, elastic hash is *faster* relative to abseil than on x86. The win is cache lines touched per probe, not L2 vs L3 spill. Fixed Linux-only timers in the Zig benchmarks for macOS/ARM.
…mpetitors. Size sweep at 50% load with shuffled access on Apple M4. Elastic hash beats abseil (1.3-2.4x), Rust hashbrown+ahash (1.2-3x), and Go swiss.Map (1.8-3.5x) at every tested size. The x86 finding that small tables favored abseil does not hold on M4. Added a proper build step for the shuffled verification benchmark.
7 approaches to explore: branch-hinted early termination, conditional probing, per-bucket max probe depth, bloom filters, tombstone-free deletion, Robin Hood displacement tracking. Previous early termination attempts regressed hits -- this program specifically guards against that.
Add early termination in tier-0 get() via matchEmpty check with @branchHint(.cold). The branch predictor learns that hits never take the early exit, making the check nearly free on the hot path. Previous attempts without the cold hint regressed hits by 10-30%. Before (50% load, 1M, unshuffled): hit=8,404 miss=12,343 After: hit=5,040 miss=2,688 Abseil: hit=9,027 miss=3,047 Misses went from 4x slower than abseil to 13% faster. Also adds per-bucket max_probe_depth tracking (unused by get() for now but available for future experiments) and autobench-miss benchmark.
With the matchEmpty miss optimization, elastic hash now ranks #1 on hit lookup, miss lookup, insert, and delete at 50% load against abseil, Rust hashbrown+ahash, Go swiss.Map, and Go builtin map. Abseil only wins on misses at 75%+ load where tier 0 fills up.
At 50% load, churning through the entire table does not degrade hit or miss performance. Tombstones from deletes get reused by subsequent inserts, keeping the empty slot ratio stable. matchEmpty early termination remains effective under sustained mutation.
40% hit / 40% miss / 10% insert / 10% delete, 1M ops on pre-filled table. Elastic hash sustains 50M ops/sec vs abseil's 25M at 50% load. Advantage holds at 25% (2.2x) and 75% (1.7x).
…h longer keys. Memory: elastic hash uses 1.00x abseil's memory at all capacities. The tiered layout distributes slots across tiers, but the total count is the same. Key lengths: elastic hash is tied at 16 bytes, 1.33x faster at 8 bytes, and scales to 2x faster at 256 bytes. Fingerprint pre-filtering avoids expensive key comparisons on false-positive hash matches.
…a structure. The same elastic hash algorithm compiled with g++ (same as abseil) shows hit lookups roughly matching abseil (9,679 vs 8,609 at 50% load). The Zig version is 2x faster than both (4,431). The insert/delete advantage IS structural: C++ elastic hash is still 3.9x faster on inserts and 2.4x faster on deletes vs abseil.
Same elastic hash algorithm in Zig, C++, and Rust at 50% load:
- Hit lookup: Zig 4,431 / C++ 9,679 / Rust 12,888 / Abseil 8,609
- Insert: Zig 3,564 / C++ 2,644 / Rust 2,945 / Abseil 10,233
- Delete: Zig 1,584 / C++ 2,483 / Rust 4,189 / Abseil 5,875
Hit lookup advantage is clearly Zig/LLVM codegen (comptime unrolling). Insert advantage is clearly data structure (3-4x in all languages).
Fixed both ports to match the Zig implementation exactly:
- Full batch insertion logic (tryInsertWithLimit, probeLimit, etc.)
- Raw pointers for key storage (no bounds-checked slices)
- Explicit NEON SIMD intrinsics on ARM
- Prefetch on probe 0
Results at 50% load (1M elements, unshuffled):
- Hit: Zig 5,016 / Rust 10,130 / C++ 9,959 / Abseil 8,712
- Miss: Zig 2,535 / Rust 6,073 / C++ 5,991 / Abseil 2,955
- Insert: Zig 3,630 / Rust 2,167 / C++ 3,830 / Abseil 10,315
- Delete: Zig 1,613 / Rust 2,344 / C++ 2,940 / Abseil 5,888
Confirms: the insert/delete advantage is data structure (3-5x in all languages). The hit lookup advantage is Zig-specific codegen.
4 implementations (elastic+linear, elastic+triangular, flat+linear, flat+triangular) compiled with identical g++ flags, same hash, same SIMD, same benchmark harness. Key findings at 1M, 50% load:
- Hit lookups: elastic ~10% faster than flat. Probing doesn't matter.
- Inserts: flat is 2x faster (batch logic is overhead, not advantage)
- Miss/delete: roughly tied across all variants.
The 2-3x hit lookup advantage seen in Zig is compiler codegen, not data structure. The insert advantage vs abseil was abseil's overhead (growth policy, rehash checks), not elastic's structural win.
… remove. Fixes:
- Flat implementations no longer get 2x capacity (which was halving effective load and massively biasing results)
- Elastic remove is now tier-0-only (matching the Zig implementation)
Corrected results (1M, 50% load, 3 runs):
- Hit: flat ~9K, elastic ~10K, Zig ~5.1K
- Miss: flat ~5.8K, elastic ~6K, Zig ~2.8K
- Insert: flat ~1.8K, elastic ~3.9K, Zig ~3.8K
- Delete: all C++ ~5-6K, Zig ~1.7K
The tiered layout provides no measurable advantage over flat in C++. The Zig advantage (1.8-2x on lookups, 3.5x on deletes) is compiler codegen, not data structure.
Adding abseil-style needsResize() check to every insert (even when resize never triggers) adds 37% overhead at 50% load. Lookups and deletes are unaffected. This partially explains the 2.6x insert advantage vs abseil — simpler insert path matters.
Insert now checks for an existing key before inserting — updates the value if found, inserts a new entry if not. Resize uses insertNew (skips the duplicate check) to avoid a double resize during rehash. New tests:
- duplicate key updates value (single key, 3 updates)
- duplicate keys at scale (500 keys, re-insert all with new values)
- resize triggers and preserves data (16 -> 64 elements)
- resize with duplicates
- resize then delete then re-insert
New API methods on StringElasticHashGrowth:
- contains(key) -> bool
- len() -> usize
- clear() - resets table without freeing memory
- getOrPut(key, default) -> { value_ptr, found_existing }
- iterator() -> yields all live key-value pairs, skips tombstones
19 tests covering: basic ops, duplicates, resize, clear, getOrPut (new/existing/modify-via-pointer/at-scale), iterator (empty/full/tombstones/after-clear).
Instead of two passes (search for existing key, then search for empty slot), the insert now does both in one pass. Tracks the first empty slot while scanning for the key. If key found, update. If not, insert into the tracked slot. Insert at 50% load: 7,408 -> 5,609 (24% faster). Now 1.83x faster than abseil with full duplicate handling and resize support.
The fast-path insert wasn't updating max_probe_depth. Fixed. Also tested max_probe_depth for miss early termination — doesn't help at high load because probe depths are near MAX_PROBES anyway. The high-load miss weakness is fundamental: without a Bloom filter (boost's approach), there's no cheap way to terminate misses when buckets are full.
Added per-bucket overflow bloom filter (2 bits from hash bits 40-47). Set on insert when element is displaced from home bucket. Checked in get() to terminate misses early at high load. Result: 54% hit regression at 50% load for marginal miss improvement at 75%. The extra memory load + AND + compare on probe 0 costs more than the miss savings. 25% false positive rate with 2 bits isn't selective enough. Bloom infrastructure remains in the struct (set on insert, cleared on resize/clear) but get() doesn't check it. High-load miss weakness is accepted — the table is optimized for 10-50% load where matchEmpty handles misses effectively.
ElasticHash(K, V, Context) — comptime-parameterized hash table. Context provides hash() and eql(), matching Zig's std.HashMap pattern. AutoContext(K) auto-generates hash/eql for integers, slices, arrays. AutoElasticHash(K, V) is the convenience alias. Full API: init, deinit, insert (with dedup), get, remove, contains, len, clear, getOrPut, iterator. All ported from string_hybrid_growth with the same SIMD, tiered layout, batch insertion, and cold-hinted early termination. 10 tests covering u64 keys, []const u8 keys, [16]u8 keys, duplicates, resize, getOrPut, iterator, and tombstone handling.
Added unshuffled C++ elastic hash benchmark at all sizes + abseil at 50% load across sizes. Abseil wins at small tables (<64K) where L1 fits everything and tier overhead dominates. Elastic advantage starts at 256K (2.9x) and holds through 4M (1.2x). Previous FINDINGS.md claimed "slightly ahead at 16K" based on Zig data — the C++ comparison shows abseil is actually 3.8x faster there.
The previous 16K data (abseil 3.8x faster) came from concurrent runs with machine contention. Clean sequential runs show the two tied at 16K. Elastic C++ wins on hits at every size (16K-4M) and every load (10-90%). Peak advantage: 4.4x at 256K, 4.1x at 10% load. Only weakness: misses at 75%+ load (abseil 1.6-3.9x faster).
Why
The elastic hash started as a benchmark experiment against Google's abseil
flat_hash_map. This PR turns it into a usable generic library with a production-ready API, validated across architectures and competitors.

What
Generic library (src/elastic_hash.zig)
- ElasticHash(K, V, Context) — works with any key/value type
- AutoElasticHash(K, V) — convenience alias with auto-generated hash/eql

Miss optimization (@branchHint(.cold))
- matchEmpty check per probe with @branchHint(.cold) in get()

Validation
Results (C++ elastic vs abseil, same g++ compiler, unshuffled)
Load factor sweep (1M elements)
Elastic hash wins on hits at every load factor up to 90%.
Size sweep (50% load)
Elastic hash is faster or tied at every size from 16K to 4M. Peak advantage: 4.4x at 256K.
vs boost::unordered_flat_map (Zig elastic, unshuffled, 1M, 50%)
Cross-architecture
Why it's faster
Separated dense fingerprint arrays. Fingerprints (1 byte/slot) in a contiguous array, separate from entries (24 bytes/slot). One cache line covers 64 fingerprints. Fewer cache line fetches per probe.
Simpler insert/delete. No growth policy checks, no hashtablez sampling. Tombstone marking vs find-then-erase.
Cold-hinted matchEmpty. Terminates misses early at low-mid load without regressing hits.
Why it's slower on misses at 75%+ load
Buckets are full. matchEmpty can't find empty slots. Tested bloom filters (54% hit regression) and max probe depth tracking (no improvement) — the separated layout makes extra checks expensive. This is architectural.

What we got wrong
Known limitations
References
FINDINGS.md