Issues with Core 2 Duo #76
Description
First of all, Penryn lacks the rdtscp instruction, so the benchmark dies with an illegal-instruction error; it could use rdtsc instead. Even with that worked around, the benchmark still seems to be nonfunctional. :(
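A rough sketch of the kind of fallback meant here, assuming GCC or Clang on x86. The names `has_rdtscp` and `read_ticks` are made up for illustration and are not part of the benchmark's code. It asks CPUID (leaf 0x80000001, EDX bit 27) whether rdtscp exists and falls back to plain rdtsc otherwise.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* __get_cpuid, shipped with GCC and Clang */

/* Hypothetical helper: 1 if CPUID advertises RDTSCP, else 0. */
static int has_rdtscp(void)
{
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid returns 0 when the requested leaf is unsupported. */
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((edx >> 27) & 1u);
}

static uint64_t tsc_rdtscp(void)
{
    uint32_t lo, hi, aux;
    /* RDTSCP also writes IA32_TSC_AUX into ECX; it is ignored here. */
    __asm__ volatile("rdtscp" : "=a"(lo), "=d"(hi), "=c"(aux));
    (void)aux;
    return ((uint64_t)hi << 32) | lo;
}

static uint64_t tsc_rdtsc(void)
{
    uint32_t lo, hi;
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical timer entry point: picks the instruction once, then reuses it. */
static uint64_t read_ticks(void)
{
    static int use_rdtscp = -1; /* -1 means "not probed yet" */
    if (use_rdtscp < 0)
        use_rdtscp = has_rdtscp();
    return use_rdtscp ? tsc_rdtscp() : tsc_rdtsc();
}

#else /* not x86: assumed fallback to a C11 wall-clock read, in nanoseconds */
#include <time.h>

static int has_rdtscp(void) { return 0; }

static uint64_t read_ticks(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif
```

Something like `uint64_t t0 = read_ticks(); /* work */ uint64_t dt = read_ticks() - t0;` would then work on Penryn as well. One caveat: plain rdtsc does not wait for earlier instructions to finish the way rdtscp does, so the timings are likely to be a bit noisier unless a fence is added.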
In addition, HighwayHash64 seems excessively slow on my (admittedly old) chip compared to other hashes.
xxhsum benchmark (100 KB)
gcc 8.2.0 gcc-8 -O2 -march=native
MacBook (13-inch, Mid 2009)/MacBook5,2
2.13 GHz Intel Core 2 Duo (Penryn, SSE4.1, P7450)
macOS 10.13.6 with High Sierra Patcher
4 GB RAM
| Hash | Aligned | Unaligned |
|---|---|---|
| XXH32 | 3912.6 MB/s | 2985.9 MB/s |
| XXH64 | 4004.1 MB/s | 2891.6 MB/s |
| XXH32a (two vector_size(16) lanes) | 4970.8 MB/s | 3144.7 MB/s |
| XXH64a (two vector_size(16) lanes) | 4935.6 MB/s | 3152.1 MB/s |
| FarmHash32 | 5654.1 MB/s | 3619.6 MB/s |
| FarmHash64 | 6092.9 MB/s | 4197.5 MB/s |
| HighwayHash64 (SSE4.1) | 2462.1 MB/s | 1998.7 MB/s |
| HighwayHash64 (Portable) | 290.4 MB/s | 289.2 MB/s |
| HighwayHash64 (C) | 451.4 MB/s | 435.6 MB/s |
| SpookyHash v2 | 6349.3 MB/s | 3720.1 MB/s |
Note that the Core 2 Duo has a slow multiplier, which takes twice as many cycles as it does on newer Intel chips. That is the main slowdown for the xxHash family: replacing the multiplies with xors gets it to the upper 5700s (MB/s), although the result is useless as a hash. The chip also doesn't seem to have fast 64x2 vectors; GCC appears to do those operations with two 32-bit lanes, which is another slowdown.
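For context, here is a minimal sketch of where those multiplies sit. `xxh32_round_mul` follows the published XXH32 round (two 32-bit multiplies per 4-byte lane) with the standard prime constants. `xxh32_round_xor` is only a guess at what a "multiplies replaced with xors" experiment might look like; it is not the code that was actually benchmarked, and both function names are made up.

```c
#include <stdint.h>

/* XXH32 prime constants, as published in the xxHash specification. */
static const uint32_t PRIME32_1 = 2654435761u; /* 0x9E3779B1 */
static const uint32_t PRIME32_2 = 2246822519u; /* 0x85EBCA77 */

static uint32_t rotl32(uint32_t x, unsigned r)
{
    return (x << r) | (x >> (32u - r)); /* r must be in 1..31 */
}

/* The standard XXH32 round: two multiplies per 4-byte input lane. */
static uint32_t xxh32_round_mul(uint32_t acc, uint32_t input)
{
    acc += input * PRIME32_2;
    acc  = rotl32(acc, 13);
    acc *= PRIME32_1;
    return acc;
}

/* ASSUMPTION: illustrative stand-in for the xor experiment.
 * Faster on a slow multiplier, but mixes far too weakly to be a usable hash. */
static uint32_t xxh32_round_xor(uint32_t acc, uint32_t input)
{
    acc += input ^ PRIME32_2;
    acc  = rotl32(acc, 13);
    acc ^= PRIME32_1;
    return acc;
}
```

XXH32 runs four of these accumulators in parallel, so every 16-byte stripe costs eight multiplies, which is why multiplier latency and throughput show up so directly in the numbers above.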
I mostly want to bring this to your attention, because I was definitely disappointed after the effort it took to make it compile.