abseil-inspired fast string hash: 1.87x at 50% load (up from 1.74x) #5
Open
joshuaisaact wants to merge 192 commits into main
…oad)" This reverts commit b725080.
…load)" This reverts commit 65aee5b.
This reverts commit 93a99b8.
This reverts commit e5abd48.
…lot)" This reverts commit 32dd07a.
- src/string_hybrid.zig: StringElasticHash with wyhash, same tiered architecture
- src/autobench-strings.zig: Zig benchmark with 16-byte hex keys
- bench-abseil-strings.cpp: abseil benchmark with string_view keys
- bench-strings.sh: comparison runner
- build.zig: added string test and benchmark steps

… access

Shuffled hit lookup gaps at 1M (string_view keys, 16-byte hex):
- 10%: 1.97x
- 25%: 1.86x
- 50%: 1.74x
- 75%: 1.61x
- 90%: 1.50x
- 99%: 1.36x

The advantage is LARGER than u64 keys because:
- Fingerprint filtering saves expensive 16-byte memcmp comparisons
- Both sides pay more for wyhash, but our table structure saves more per probe
…, sizes, despite slower hash
Why
Our wyhash was 33% slower than abseil's hash for 16-byte strings. This PR ports abseil's fast path for 9-16 byte keys: two overlapping u64 reads + a single 128-bit multiply.
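A minimal C++ sketch of the idea described above, for illustration only. It is not the code in this PR and not abseil's actual implementation: the constants `kMulA` and `kMulB`, the helper names, and the way the length is mixed in are all assumptions. It also relies on the `unsigned __int128` extension available in GCC and Clang.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Placeholder odd constants, chosen for illustration.
// These are NOT abseil's values and NOT the values used in this PR.
constexpr std::uint64_t kMulA = 0x9E3779B97F4A7C15ull;
constexpr std::uint64_t kMulB = 0xC2B2AE3D27D4EB4Full;

// Unaligned 8-byte load in native byte order.
// memcpy keeps this free of alignment and aliasing problems.
inline std::uint64_t load_u64(const char* p) {
    std::uint64_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// 64x64 -> 128-bit multiply, folded to 64 bits by XORing the two halves.
inline std::uint64_t mix128(std::uint64_t a, std::uint64_t b) {
    const unsigned __int128 m = static_cast<unsigned __int128>(a) * b;
    return static_cast<std::uint64_t>(m) ^ static_cast<std::uint64_t>(m >> 64);
}

// Hypothetical fast path. The caller must guarantee 9 <= key.size() <= 16.
// The first read covers bytes [0, 8) and the second covers the last 8 bytes,
// so the two reads overlap whenever the key is shorter than 16 bytes.
inline std::uint64_t hash_9_to_16(std::string_view key) {
    const std::uint64_t lo = load_u64(key.data());
    const std::uint64_t hi = load_u64(key.data() + key.size() - 8);
    return mix128(lo ^ kMulA, hi ^ kMulB ^ key.size());
}
```

Because the two reads together touch every byte of a 9-16 byte key, no loop or tail handling is needed, which is where the saving over a general-purpose hash comes from for this key size. A caller would check the length first and route any other size to its general hash.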
What
Replaced `Wyhash.hash(0, key)` with a custom `fastStringHash` in `string_hybrid.zig` that uses abseil's approach for 9-16 byte keys (the common case for our 16-byte hex keys). It falls back to wyhash for other lengths.

Results (shuffled hit lookup, 1M, 16-byte string keys)
As predicted, the hash accounts for only ~4-7% of the total advantage (for example, the 50% load ratio moves from 1.74x to 1.87x). The remaining 80-87% is structural (tiered metadata density).
References
`/usr/include/absl/hash/internal/hash.h`, lines 960-1064