yakhash


High-performance non-cryptographic hash library — Go implementation of the xxHash family.

  • Sum64: Classic 64-bit hash, fully compatible with C XXH64
  • Sum3: Modern 64/128-bit hash, ~56 GB/s on amd64 with AVX2, plus NEON assembly on arm64; fully compatible with C XXH3_64bits / XXH3_128bits

Hot paths perform zero heap allocations, and hash values are bit-for-bit identical to C xxHash 0.8.x.


Installation

go get github.com/uniyakcom/yakhash

Performance

Data source: bench_linux_6c12t.txt, generated by bench.sh. Reproduce: ./bench.sh 3s 3 (3-second runs per input size, isolated processes to avoid L3 cache pollution).

Environment: Linux 6.17.0-14-generic x86_64 · Intel Xeon E-2186G @ 3.80GHz · Go 1.25.7

XXH64

| Function | 4 B | 16 B | 100 B | 1 KB | 4 KB | 10 MB |
|---|---|---|---|---|---|---|
| Sum64 | 1.01 GB/s | 2.92 GB/s | 8.14 GB/s | 16.5 GB/s | 15.8 GB/s | 18.2 GB/s |
| Sum64Seed | 0.416 GB/s | 1.39 GB/s | 5.03 GB/s | 14.6 GB/s | 15.2 GB/s | 18.2 GB/s |
| Sum64String | 0.862 GB/s | 2.39 GB/s | 7.27 GB/s | 15.1 GB/s | 15.4 GB/s | 18.1 GB/s |
| Stream (streaming) | 0.497 GB/s | 1.60 GB/s | 5.51 GB/s | 15.0 GB/s | 15.4 GB/s | 18.3 GB/s |

XXH3-64

| Function | 4 B | 16 B | 100 B | 1 KB | 4 KB | 10 MB |
|---|---|---|---|---|---|---|
| Sum3_64 | 1.14 GB/s | 4.56 GB/s | 10.5 GB/s | 41.1 GB/s | 51.2 GB/s | 55.4 GB/s |
| Sum3_64Seed | 0.831 GB/s | 3.48 GB/s | 9.43 GB/s | 31.2 GB/s | 47.3 GB/s | 56.6 GB/s |
| Sum3_64String | 0.860 GB/s | 3.65 GB/s | 9.43 GB/s | 39.7 GB/s | 51.2 GB/s | 56.6 GB/s |
| Stream3 (streaming) | 0.315 GB/s | 1.16 GB/s | 4.74 GB/s | 18.8 GB/s | 35.3 GB/s | 56.5 GB/s |

XXH3-128

| Function | 4 B | 16 B | 100 B | 1 KB | 4 KB | 10 MB |
|---|---|---|---|---|---|---|
| Sum3_128 | 0.961 GB/s | 3.32 GB/s | 8.85 GB/s | 34.0 GB/s | 48.4 GB/s | 56.6 GB/s |
| Sum3_128Seed | 0.729 GB/s | 2.32 GB/s | 6.30 GB/s | 24.4 GB/s | 43.5 GB/s | 54.7 GB/s |
| Sum3_128String | 0.789 GB/s | 2.81 GB/s | 7.15 GB/s | 28.7 GB/s | 46.5 GB/s | 54.9 GB/s |
| Stream3 (streaming) | 0.284 GB/s | 1.01 GB/s | 4.09 GB/s | 16.4 GB/s | 33.6 GB/s | 54.0 GB/s |

XXH3 significantly outperforms XXH64 at ≥1 KB through AVX2 vectorization (accumulate512): 41.1 vs 16.5 GB/s at 1 KB and 55.4 vs 18.2 GB/s at 10 MB for the one-shot functions. For small inputs (≤100 B) the gap narrows to roughly 1.1x to 1.6x. Both are zero-allocation.

Go yakhash vs C xxHash (reference comparison)

C data source: _benchmarks/results/bench_c_linux_6c12t.txt, generated by _benchmarks/bench_c.sh; gcc -O3 -mavx2 -march=native, same machine, same input sizes.

XXH64

| Implementation | 4 B | 16 B | 100 B | 1 KB | 4 KB | 10 MB |
|---|---|---|---|---|---|---|
| Go Sum64 | 1.01 GB/s | 2.92 GB/s | 8.14 GB/s | 16.5 GB/s | 15.8 GB/s | 18.2 GB/s |
| C XXH64 | 0.93 GB/s | 3.05 GB/s | 8.13 GB/s | 15.8 GB/s | 16.3 GB/s | 16.9 GB/s |
| Go/C | +9% | −4% | ±0% | +4% | −3% | +8% |

XXH3-64

| Implementation | 4 B | 16 B | 100 B | 1 KB | 4 KB | 10 MB |
|---|---|---|---|---|---|---|
| Go Sum3_64 | 1.14 GB/s | 4.56 GB/s | 10.5 GB/s | 41.1 GB/s | 51.2 GB/s | 55.4 GB/s |
| C XXH3_64bits | 1.53 GB/s | 5.05 GB/s | 11.4 GB/s | 37.3 GB/s | 41.8 GB/s | 46.8 GB/s |
| Go/C | −25% | −10% | −8% | +10% | +22% | +18% |

XXH3-128

| Implementation | 4 B | 16 B | 100 B | 1 KB | 4 KB | 10 MB |
|---|---|---|---|---|---|---|
| Go Sum3_128 | 0.961 GB/s | 3.32 GB/s | 8.85 GB/s | 34.0 GB/s | 48.4 GB/s | 56.6 GB/s |
| C XXH3_128bits | 0.79 GB/s | 3.28 GB/s | 7.43 GB/s | 30.0 GB/s | 40.8 GB/s | 46.7 GB/s |
| Go/C | +22% | +1% | +19% | +13% | +19% | +21% |
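
The Go/C rows above appear to be the relative throughput difference, (Go / C − 1) × 100, rounded to the nearest percent. The sketch below recomputes three cells of the XXH3-64 comparison under that assumption; `relDiffPercent` is a hypothetical helper, not part of yakhash or its benchmark scripts.

```go
package main

import (
	"fmt"
	"math"
)

// relDiffPercent reports how much faster (positive) or slower (negative)
// the Go throughput is relative to the C throughput, as a whole percent.
func relDiffPercent(goGBs, cGBs float64) int {
	return int(math.Round((goGBs/cGBs - 1) * 100))
}

func main() {
	// Values copied from the XXH3-64 comparison: 4 B, 1 KB and 4 KB columns.
	fmt.Println(relDiffPercent(1.14, 1.53)) // -25
	fmt.Println(relDiffPercent(41.1, 37.3)) // 10
	fmt.Println(relDiffPercent(51.2, 41.8)) // 22
}
```

The recomputed values match the published row, which suggests the percentages are derived from the rounded throughput figures shown in the tables.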

API Reference

One-shot Hashing (most common)

| Function | Description |
|---|---|
| Sum64(b []byte) uint64 | Sum64, seed=0 |
| Sum64Seed(b []byte, seed uint64) uint64 | Sum64, custom seed |
| Sum64String(s string) uint64 | Sum64, zero-copy string, seed=0 |
| Sum64SeedString(s string, seed uint64) uint64 | Sum64, zero-copy string, custom seed |
| Sum3_64(b []byte) uint64 | Sum3_64, seed=0 |
| Sum3_64Seed(b []byte, seed uint64) uint64 | Sum3_64, custom seed |
| Sum3_64String(s string) uint64 | Sum3_64, zero-copy string, seed=0 |
| Sum3_64SeedString(s string, seed uint64) uint64 | Sum3_64, zero-copy string, custom seed |
| Sum3_64Secret(b []byte, secret []byte) (uint64, error) | Sum3_64, custom secret (strongest HashDoS protection) |
| Sum3_128(b []byte) Uint128 | Sum3_128, seed=0 |
| Sum3_128Seed(b []byte, seed uint64) Uint128 | Sum3_128, custom seed |
| Sum3_128String(s string) Uint128 | Sum3_128, zero-copy string, seed=0 |
| Sum3_128SeedString(s string, seed uint64) Uint128 | Sum3_128, zero-copy string, custom seed |
| Sum3_128Secret(b []byte, secret []byte) (Uint128, error) | Sum3_128, custom secret |

Secret Utilities

| Function | Description |
|---|---|
| GenSecret(seed uint64) [192]byte | Derives a 192-byte secret table from a seed |

Streaming Interface (implements hash.Hash64)

| Function | Description |
|---|---|
| New() *Stream | Sum64 streaming, seed=0 |
| NewSeed(seed uint64) *Stream | Sum64 streaming, custom seed |
| New3() *Stream3 | Sum3 streaming, seed=0 |
| New3Seed(seed uint64) *Stream3 | Sum3 streaming, custom seed |
| New3Secret(secret []byte) (*Stream3, error) | Sum3 streaming, custom secret |
| (*Stream).WriteString(s string) (int, error) | Zero-copy string write for Stream |
| (*Stream3).WriteString(s string) (int, error) | Zero-copy string write for Stream3 |
| (*Stream).ResetSeed(seed uint64) | Reset Stream in-place with a new seed |
| (*Stream3).ResetSeed(seed uint64) | Reset Stream3 in-place with a new seed |
| (*Stream3).Sum64() uint64 | Read 64-bit result without consuming state |
| (*Stream3).Sum128() Uint128 | Read 128-bit result without consuming state |
| (*Stream3).ResetSecret(secret []byte) error | Reset in-place and replace secret |
| (*Stream3).MarshalBinary() ([]byte, error) | Snapshot current state; returns ErrMarshalCustomSecret when a custom secret is in use |
| (*Stream3).UnmarshalBinary([]byte) error | Restore state snapshot; the secret is reconstructed from seed or kSecret, so no plaintext is stored |

Uint128 Methods

Methods on the Uint128 return type. All methods use big-endian byte order (most significant byte first), compatible with W3C TraceContext, UUID v7, and standard network representation.

| Method | Description |
|---|---|
| (Uint128).Bytes() [16]byte | Convert to [16]byte, big-endian, zero-alloc |
| (Uint128).AppendBytes(dst []byte) []byte | Append 16 big-endian bytes to dst |
| (Uint128).Hex() string | 32-char lowercase hex string, big-endian |
| (Uint128).AppendHex(dst []byte) []byte | Append 32 hex chars to dst, zero-alloc hot path |
| (Uint128).String() string | Implements fmt.Stringer; same as Hex() |

u := yakhash.Sum3_128Seed([]byte("trace"), uint64(time.Now().UnixNano()))

// [16]byte for TraceContext / binary protocols (zero-alloc, chain-friendly)
traceID := u.Bytes()

// Append 16 bytes to an existing buffer
payload = u.AppendBytes(payload)

// 32-char lowercase hex string — HTTP headers, logging, database storage
fmt.Println(u.Hex()) // e.g. "99aa06d3014798d86001c324468d497f"

// Append hex to an existing buffer (zero-alloc hot path)
logLine = u.AppendHex(logLine)

// fmt.Stringer: fmt.Sprintf("%s", u) returns the same hex string
fmt.Println(u)
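
To make the big-endian layout concrete, here is a minimal standard-library sketch of how a 128-bit value with Hi and Lo halves maps to 16 bytes and 32 hex characters. The `u128` type and its methods are hypothetical stand-ins written for illustration; they are not yakhash's implementation, only a model of the byte order described above.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// u128 is a hypothetical stand-in for a 128-bit hash value.
type u128 struct {
	Hi, Lo uint64
}

// bytesBE lays the value out most significant byte first: Hi, then Lo.
func (u u128) bytesBE() [16]byte {
	var b [16]byte
	binary.BigEndian.PutUint64(b[:8], u.Hi)
	binary.BigEndian.PutUint64(b[8:], u.Lo)
	return b
}

// hexBE renders the same 16 bytes as 32 lowercase hex characters.
func (u u128) hexBE() string {
	b := u.bytesBE()
	return hex.EncodeToString(b[:])
}

func main() {
	u := u128{Hi: 0x0102030405060708, Lo: 0x090a0b0c0d0e0f10}
	fmt.Println(u.hexBE()) // 0102030405060708090a0b0c0d0e0f10
}
```

Because the high half comes first, the hex string sorts the same way as the numeric value, which is the property that network and trace-ID formats usually rely on.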

Usage Scenarios

Scenario 1: map key / short string hashing

Profile: Input already in memory as string, no seed needed.

import "github.com/uniyakcom/yakhash"

// Sum64String reads the underlying string bytes via unsafe — zero-copy, zero-alloc.
func hashKey(key string) uint64 {
    return yakhash.Sum64String(key)
}

// Equivalent (with string→[]byte copy):
func hashKeyBytes(key string) uint64 {
    return yakhash.Sum64([]byte(key))
}

Recommended: Sum64String / Sum3_64String
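
For readers curious what "zero-copy" means here, the following sketch shows one common way to view a string's bytes without copying, using unsafe.StringData (Go 1.20+). This is an assumption about the general technique, not a description of yakhash's internals; `bytesView` and `sum64` are hypothetical names, and FNV-1a from the standard library stands in for the real hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"unsafe"
)

// bytesView returns a read-only []byte view over the string's memory without
// copying. The caller must never write to the returned slice.
func bytesView(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// sum64 uses FNV-1a purely as a stand-in hash so the sketch is self-contained.
func sum64(b []byte) uint64 {
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

func main() {
	key := "user:42"
	// Both paths hash the same bytes; only the first avoids the copy.
	fmt.Println(sum64(bytesView(key)) == sum64([]byte(key))) // true
}
```

The view is only safe because hashing reads the bytes and never mutates them; strings are immutable in Go, so writing through such a slice is undefined behavior.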


Scenario 2: Large file / buffer one-shot checksum

Profile: Data fully loaded in memory, maximise throughput (XXH3 outperforms XXH64 at ≥1 KB).

data, err := os.ReadFile("large.bin")
if err != nil {
    log.Fatal(err)
}

// XXH3-64 ~56 GB/s on amd64 AVX2
h := yakhash.Sum3_64(data)

// 128-bit for stronger deduplication (collision probability extremely low)
r := yakhash.Sum3_128(data)
fmt.Printf("%016x%016x\n", r.Hi, r.Lo)

Recommended: Sum3_64 / Sum3_128


Scenario 3: HashDoS protection (HTTP routing / hash-table keys)

Background: Attackers craft hash-collision inputs to degrade a hash table to O(n), causing CPU denial of service.

Option A: Seed-based (lightweight)

Suitable for light protection; seed is randomly generated at process startup.

import (
    "crypto/rand"
    "encoding/binary"
    "github.com/uniyakcom/yakhash"
)

var globalSeed uint64

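func init() {
    var b [8]byte
    // Fail fast if the OS cannot supply randomness; a predictable seed defeats the purpose.
    if _, err := rand.Read(b[:]); err != nil {
        panic(err)
    }
    globalSeed = binary.LittleEndian.Uint64(b[:])
}

func hashRoute(path string) uint64 {
    // The SeedString variant avoids the []byte(path) copy.
    return yakhash.Sum3_64SeedString(path, globalSeed)
}

Recommended: Sum3_64Seed / Sum3_64SeedString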

Option B: Custom secret (strongest protection)

Secret space is 192 bytes (1536 bits) — orders of magnitude harder to brute-force than a 64-bit seed.

import (
    "crypto/rand"
    "github.com/uniyakcom/yakhash"
)

var secret [192]byte

func init() {
    // Generate a 192-byte secret from cryptographic randomness; fail fast on error.
    if _, err := rand.Read(secret[:]); err != nil {
        panic(err)
    }
    // Or derive from a persistent seed for cross-process reproducibility:
    // secret = yakhash.GenSecret(loadSeedFromConfig())
}

func hashKey(key []byte) (uint64, error) {
    return yakhash.Sum3_64Secret(key, secret[:])
}

func hash128(key []byte) (yakhash.Uint128, error) {
    return yakhash.Sum3_128Secret(key, secret[:])
}

Recommended: Sum3_64Secret / Sum3_128Secret + GenSecret


Scenario 4: Streaming (chunked writes / io.Reader)

Profile: Data arrives in batches (network stream, file read in chunks) — cannot hash all at once.

import (
    "io"
    "os"
    "github.com/uniyakcom/yakhash"
)

// Compute XXH3-64 hash of a file
func hashFile(path string) (uint64, error) {
    f, err := os.Open(path)
    if err != nil {
        return 0, err
    }
    defer f.Close()

    d := yakhash.New3()
    if _, err := io.Copy(d, f); err != nil {
        return 0, err
    }
    return d.Sum64(), nil
}

// Get both 64-bit and 128-bit output from the same Stream3 (no duplicate computation)
func hashFileDual(path string) (uint64, yakhash.Uint128, error) {
    f, err := os.Open(path)
    if err != nil {
        return 0, yakhash.Uint128{}, err
    }
    defer f.Close()

    d := yakhash.New3()
    if _, err := io.Copy(d, f); err != nil {
        return 0, yakhash.Uint128{}, err
    }
    return d.Sum64(), d.Sum128(), nil
}

Recommended: New3() + Write + Sum64/Sum128


Scenario 5: Streaming + HashDoS protection

var secret = yakhash.GenSecret(loadSeedFromConfig())

func hashStream(r io.Reader) (uint64, error) {
    d, err := yakhash.New3Secret(secret[:])
    if err != nil {
        return 0, err
    }
    if _, err := io.Copy(d, r); err != nil {
        return 0, err
    }
    return d.Sum64(), nil
}

Recommended: New3Secret + GenSecret


Scenario 6: sync.Pool Stream reuse (high-concurrency, low-latency)

Profile: High-frequency short-data hashing (e.g. per HTTP request) — reuse objects via Pool to eliminate GC pressure.

import (
    "sync"
    "github.com/uniyakcom/yakhash"
)

var digestPool = sync.Pool{
    New: func() any { return yakhash.New3() },
}

func hashRequest(data []byte) uint64 {
    d := digestPool.Get().(*yakhash.Stream3)
    d.Reset()
    d.Write(data)
    h := d.Sum64()
    digestPool.Put(d)
    return h
}

Recommended: New3() + Reset() + sync.Pool


Scenario 7: Persistent secret (cross-process / cross-node consistent hashing)

Profile: Multiple service instances need to produce the same hash for the same key (e.g. distributed cache routing).

// Load a fixed seed from config — shared across nodes
func loadSecret() [192]byte {
    seed := loadUint64FromConfig("hash_seed")
    return yakhash.GenSecret(seed)
}

var sharedSecret = loadSecret()

func routeKey(key string) (int, error) {
    h, err := yakhash.Sum3_64Secret([]byte(key), sharedSecret[:])
    if err != nil {
        return 0, err
    }
    return int(h % uint64(numShards)), nil
}

Recommended: GenSecret + Sum3_64Secret
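
The routing step itself is plain modular arithmetic on the 64-bit hash. A minimal sketch, assuming a fixed shard count: `shardFor` and `standInHash` are hypothetical names, and FNV-1a replaces the secret-keyed hash only so the example is self-contained.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a 64-bit hash onto [0, numShards). It returns -1 for a
// non-positive shard count instead of dividing by zero.
func shardFor(h uint64, numShards int) int {
	if numShards <= 0 {
		return -1
	}
	return int(h % uint64(numShards))
}

// standInHash uses FNV-1a only so the sketch runs without external modules.
func standInHash(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func main() {
	for _, key := range []string{"user:1", "user:2", "user:3"} {
		fmt.Println(key, "->", shardFor(standInHash(key), 8))
	}
}
```

Every node that uses the same hash and shard count computes the same index for a key. Note that changing the shard count remaps most keys under simple modulo routing, so deployments that resize often may prefer a consistent-hashing ring on top of the same hash.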


Scenario 8: Content-addressable storage / deduplication

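Profile: Identify data by its content hash; a low collision rate is required. For uniformly distributed 128-bit hashes, two given inputs collide with probability $2^{-128}$, and by the birthday bound a collision only becomes likely once roughly $2^{64}$ items are stored. This makes the 128-bit variant suitable for large-scale deduplication of non-adversarial data.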

func contentID(data []byte) string {
    r := yakhash.Sum3_128(data)
    return r.Hex()
}

Recommended: Sum3_128


Scenario 9: State snapshot and restore (resumable upload / distributed collaboration)

Profile: Persist the intermediate state of a streaming hash to storage, then resume writing in the same or a different process. MarshalBinary / UnmarshalBinary implement encoding.BinaryMarshaler / encoding.BinaryUnmarshaler.

Security note: Serialized bytes do not contain the secret in plaintext.

  • Stream3 created via New3 / New3Seed: secret is safely reconstructed from the seed on deserialization — snapshots can be stored freely.
  • Stream3 created via New3Secret / ResetSecret: MarshalBinary returns ErrMarshalCustomSecret and refuses to serialize — a custom secret cannot be reconstructed from state, and forcing serialization would expose the secret in plaintext, so the API blocks it directly.
// Seed-derived secret: snapshot can be stored freely; secret is reconstructed automatically
d := yakhash.New3Seed(0xdeadbeef)
d.Write(firstChunk)

snap, err := d.MarshalBinary() // snapshot, no secret plaintext
if err != nil {
    // If New3Secret / ResetSecret was used, ErrMarshalCustomSecret is returned here
    log.Fatal(err)
}
saveToStorage(snap)

// Later: restore and continue writing
var d2 yakhash.Stream3
if err := d2.UnmarshalBinary(snap); err != nil {
    log.Fatal(err)
}
d2.Write(secondChunk)
result := d2.Sum64()

Recommended: New3Seed + MarshalBinary / UnmarshalBinary


Additional API Examples

The following APIs are not individually demonstrated in the scenarios above.

SeedString variants: zero-copy string + custom seed

Equivalent to the corresponding Seed function but accepting string directly, eliminating the []byte(s) conversion (zero-copy).

// Sum64SeedString
h := yakhash.Sum64SeedString(key, mySeed)
// Equivalent to (with []byte allocation):
// h = yakhash.Sum64Seed([]byte(key), mySeed)

// Sum3_64SeedString
h3 := yakhash.Sum3_64SeedString(key, mySeed)

// Sum3_128SeedString
u := yakhash.Sum3_128SeedString(key, mySeed)
fmt.Println(u.Hex())

Sum3_128Seed: 128-bit hash with custom seed

data := []byte("some data")
u := yakhash.Sum3_128Seed(data, mySeed)
fmt.Println(u.Hex())

WriteString: zero-copy string write to streaming object

d := yakhash.New3()
d.WriteString("hello ")
d.WriteString("world")
fmt.Println(d.Sum64())
// Same result as (but without []byte conversion):
// d.Write([]byte("hello world"))

ResetSeed: switch seed in-place, reuse streaming object

Useful for computing multiple seed variants on the same object without reallocating.

d := yakhash.New3Seed(seed1)
d.Write(data)
h1 := d.Sum64()

d.ResetSeed(seed2) // switch seed, state reinitialised
d.Write(data)
h2 := d.Sum64() // h1 ≠ h2

// Stream (XXH64) also supports ResetSeed:
d64 := yakhash.New()
d64.Write(data)
d64.ResetSeed(seed2)
d64.Write(data)

ResetSecret: replace custom secret in-place

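d, err := yakhash.New3Secret(secret1)
if err != nil {
    log.Fatal(err)
}
d.Write(data1)
h1 := d.Sum64()

// Replace the secret and reset state; check the error before reusing d.
if err := d.ResetSecret(secret2); err != nil {
    log.Fatal(err)
}
d.Write(data2)
h2 := d.Sum64()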

New / NewSeed: XXH64 streaming

Use when you need the hash.Hash64 interface with byte-identical compatibility to C XXH64.

d := yakhash.New()           // seed=0
d.Write(data)
fmt.Println(d.Sum64()) // equivalent to yakhash.Sum64(data)

d64 := yakhash.NewSeed(mySeed)
d64.Write(data)
fmt.Println(d64.Sum64()) // equivalent to yakhash.Sum64Seed(data, mySeed)

Selection Guide

| Scenario | Recommended API | Reason |
|---|---|---|
| map key (string, short) | Sum3_64String | Zero-copy, XXH3 fast on small inputs |
| Large buffer, one-shot | Sum3_64 / Sum3_128 | Maximum throughput, AVX2 vectorized |
| HashDoS (lightweight) | Sum3_64Seed | Simple, 64-bit seed |
| HashDoS (strongest) | Sum3_64Secret + GenSecret | 192-byte key space |
| Streaming / io.Reader | New3() + Write | Implements hash.Hash64 |
| Streaming + HashDoS | New3Secret | Streaming + custom secret |
| High-concurrency reuse | New3() + sync.Pool | Avoids per-call allocations and GC pressure |
| Content addressing / dedup | Sum3_128 | 128-bit output, extremely low collision rate |
| Cross-process consistent hash | GenSecret + Sum3_64Secret | Persistent, derivable secret |
| State snapshot / resumable | New3Seed + MarshalBinary | Snapshot excludes secret plaintext; custom secret blocks serialization |
| Requires hash.Hash64 interface | New() / New3() | Implements stdlib interface |

Platform Acceleration

| Platform | Sum64 | Sum3 |
|---|---|---|
| amd64 (AVX2) | Plan9 assembly ✓ | Plan9 assembly, AVX2 path ✓ |
| amd64 (SSE2) | Plan9 assembly ✓ | Plan9 assembly, SSE2 path ✓ |
| arm64 | Plan9 assembly ✓ | Plan9 assembly, NEON + scalar UMULL ✓ |
| others | pure Go fallback | pure Go fallback |

Set GOFLAGS=-tags=purego to force the pure-Go path on any platform (for testing/debugging).


Correctness

All function outputs are bit-for-bit identical to C xxHash 0.8.x:

  • Sum64 ≡ XXH64(input, len, 0)
  • Sum64Seed ≡ XXH64(input, len, seed)
  • Sum3_64 ≡ XXH3_64bits(input, len)
  • Sum3_64Seed ≡ XXH3_64bits_withSeed(input, len, seed)
  • Sum3_64Secret ≡ XXH3_64bits_withSecret(input, len, secret, secretSize)
  • Sum3_128 ≡ XXH3_128bits(input, len)
  • Sum3_128Seed ≡ XXH3_128bits_withSeed(input, len, seed)
  • Sum3_128Secret ≡ XXH3_128bits_withSecret(input, len, secret, secretSize)
  • GenSecret(seed) ≡ XXH3_generateSecret_fromSeed(buf, seed)

Test vectors come from the official xxHash sanity test vectors; fuzzing additionally verifies streaming/one-shot equivalence.


Thread Safety

  • One-shot functions (Sum64, Sum64String, Sum3_64, Sum3_128, Sum3_64Secret, Sum3_128Secret, GenSecret, etc.) are concurrency-safe (no shared state)
  • Stream and Stream3 instances are not concurrency-safe — use external synchronization or a dedicated instance per goroutine (recommended: sync.Pool)

License

MIT © 2026 uniyak.com
