abseil-inspired fast string hash: 1.87x at 50% load (up from 1.74x) #5

Open
joshuaisaact wants to merge 192 commits into main from autoresearch/fast-hash

@joshuaisaact
Owner

Why

Our wyhash was 33% slower than abseil's hash for 16-byte strings. This PR ports abseil's fast path for 9-16 byte keys: two overlapping u64 reads + a single 128-bit multiply.

What

Replaced Wyhash.hash(0, key) with a custom fastStringHash in string_hybrid.zig that uses abseil's approach for 9-16 byte keys (the common case for our 16-byte hex keys). Falls back to wyhash for other lengths.
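The length dispatch described above could look roughly like the following. This is a minimal sketch, not the code in string_hybrid.zig: the multiplier `k_mul`, the helper `mix`, and the way the length is folded in are assumptions for illustration, and it assumes a Zig standard library with `std.mem.readInt(T, ptr, .little)` (0.12 or later).

```zig
const std = @import("std");

// Hypothetical multiplier for this sketch; the constants used in the PR are not shown here.
const k_mul: u64 = 0x9E3779B97F4A7C15;

/// Multiply two u64 values into a 128-bit product and fold the two halves together.
fn mix(a: u64, b: u64) u64 {
    const product = std.math.mulWide(u64, a, b); // u128 result
    const lo: u64 = @truncate(product);
    const hi: u64 = @truncate(product >> 64);
    return lo ^ hi;
}

/// Sketch of a length-dispatched string hash (name from the PR, body assumed).
fn fastStringHash(key: []const u8) u64 {
    if (key.len >= 9 and key.len <= 16) {
        // Read the first 8 and the last 8 bytes. The two reads overlap when
        // len < 16 and are exactly adjacent when len == 16.
        const first = std.mem.readInt(u64, key[0..8], .little);
        const last = std.mem.readInt(u64, key[key.len - 8 ..][0..8], .little);
        return mix(first ^ k_mul, last ^ @as(u64, key.len));
    }
    // Every other length keeps the previous behaviour.
    return std.hash.Wyhash.hash(0, key);
}
```

Every byte of a 9-16 byte key lands in at least one of the two reads, so the fast path needs no loop and no tail handling. That is why a single multiply is enough for this length range.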

Results (shuffled hit lookup, 1M, 16-byte string keys)

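| Load | Before (wyhash) | After (fast hash) | Improvement |
|------|-----------------|-------------------|-------------|
| 10%  | 1.97x           | 2.19x             | +11%        |
| 50%  | 1.74x           | 1.87x             | +7%         |
| 99%  | 1.36x           | 1.41x             | +4%         |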

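As predicted, the hash is a small part of the total advantage. Swapping it lifts the speedup by 4-11% depending on load, which is roughly 12-18% of the margin over 1.0x (for example, 0.13 of 0.87 at 50% load). The remaining 80-87% or so is structural (tiered metadata density).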
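The percentages can be rechecked from the table with a few lines of arithmetic. The sketch below is only a sanity check. `improvementPct` is the relative change in the speedup ratio, and `hashSharePct` is one possible reading of "share of the total advantage" (the gain divided by the margin over 1.0x). Both helper names are made up for this example.

```zig
const std = @import("std");

const Row = struct { load_pct: u8, before: f64, after: f64 };

// Speedup ratios copied from the results table.
const rows = [_]Row{
    .{ .load_pct = 10, .before = 1.97, .after = 2.19 },
    .{ .load_pct = 50, .before = 1.74, .after = 1.87 },
    .{ .load_pct = 99, .before = 1.36, .after = 1.41 },
};

/// Relative change in the speedup ratio, in percent.
fn improvementPct(before: f64, after: f64) f64 {
    return (after / before - 1.0) * 100.0;
}

/// Share of the margin over 1.0x contributed by the hash swap, in percent.
/// This interpretation is an assumption, not a definition from the PR.
fn hashSharePct(before: f64, after: f64) f64 {
    return (after - before) / (after - 1.0) * 100.0;
}

pub fn main() void {
    for (rows) |r| {
        std.debug.print("{d}% load: +{d:.1}% speedup, hash share {d:.1}%\n", .{
            r.load_pct,
            improvementPct(r.before, r.after),
            hashSharePct(r.before, r.after),
        });
    }
}
```

Rounded to whole percent, the improvement column comes out as +11%, +7% and +4%, matching the table.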

References

- src/string_hybrid.zig: StringElasticHash with wyhash, same tiered architecture
- src/autobench-strings.zig: Zig benchmark with 16-byte hex keys
- bench-abseil-strings.cpp: abseil benchmark with string_view keys
- bench-strings.sh: comparison runner
- build.zig: added string test and benchmark steps
… access

Shuffled hit lookup gaps at 1M (string_view keys, 16-byte hex):
  10%: 1.97x  25%: 1.86x  50%: 1.74x  75%: 1.61x  90%: 1.50x  99%: 1.36x

The advantage is LARGER than with u64 keys because:
- Fingerprint filtering saves expensive 16-byte memcmp comparisons
- Both sides pay more for wyhash, but our table structure saves more per probe
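To illustrate the first point, a fingerprint check can reject most non-matching slots with a single byte comparison before any key bytes are touched. The sketch below is hypothetical: the `Slot` layout, the choice of the top 8 hash bits as fingerprint, and the `compared` flag are all assumptions for illustration, not the structure used in src/string_hybrid.zig.

```zig
const std = @import("std");

// Hypothetical slot layout for illustration only.
const Slot = struct {
    fingerprint: u8,
    key: []const u8,
};

/// Assumed fingerprint: the top 8 bits of the 64-bit hash.
fn fingerprintOf(hash: u64) u8 {
    return @truncate(hash >> 56);
}

/// Returns true when the slot holds `key`.
/// `compared.*` reports whether the full byte comparison had to run.
fn slotMatches(slot: Slot, key: []const u8, hash: u64, compared: *bool) bool {
    compared.* = false;
    // Cheap 1-byte reject: most non-matching slots stop here.
    if (slot.fingerprint != fingerprintOf(hash)) return false;
    // Only slots with a matching fingerprint pay for the full comparison.
    compared.* = true;
    return std.mem.eql(u8, slot.key, key);
}
```

With u64 keys the full comparison is a single integer compare, so skipping it saves little. With 16-byte strings each skipped comparison avoids a pointer chase and a byte-wise compare, which fits the larger gap reported above.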