Sleef #587

andrewkern · 2025-12-14T19:24:53Z

Integrates SLEEF (SIMD Library for Evaluating Elementary Functions) to accelerate exp(), log(), log10(), and log2() operations on float vectors.

Changes:

- Add vendored SLEEF inline headers for AVX2 (x86_64) and NEON (ARM64)
- Patch hex float literals for C++11 compatibility
- Add Windows/MinGW SIMD detection to CMakeLists.txt
- Add cross-compilation toolchain for MinGW-w64

Documentation:

- simd_benchmarks/SIMD_BUILD_FLAGS.md - build flag interaction
- eidos/sleef/SLEEF_HEADER_GENERATION.md - header regeneration instructions
- eidos/sleef/generate_avx2_sleef.sh and generate_arm_sleef.sh - generation scripts

Integrate SLEEF (SIMD Library for Evaluating Elementary Functions) to provide vectorized transcendental math functions. SLEEF enables 4-wide AVX2 vectorization for exp(), log(), log10(), and log2() operations.

Changes:
- Add eidos/sleef/ directory with vendored SLEEF inline headers
- Patch SLEEF headers to use decimal floats (C++11 compatibility)
- Update eidos_simd.h to use SLEEF when AVX2+FMA is available
- Keep existing hand-written SIMD for sqrt, abs, floor, ceil, etc.

Architecture support:
- AVX2+FMA: 4-wide vectorized transcendentals via SLEEF
- ARM NEON: Placeholder for future (scalar fallback for now)
- SSE4.2-only: Scalar std::exp/log fallback

SLEEF is distributed under the Boost Software License.
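For readers unfamiliar with the pattern, here is a rough sketch of the loop shape this kind of integration typically has: a main loop that consumes four doubles at a time, followed by a scalar remainder loop. It is not the code in eidos_simd.h. `exp_block4` and `exp_array` are hypothetical names, and the 4-wide kernel is a plain std::exp stand-in so the sketch compiles without the SLEEF headers; in a real build that stand-in would be a load, a SLEEF vector call, and a store.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 4-wide vector kernel. Here it simply calls
// std::exp per lane so the example has no dependency on SLEEF or AVX2.
static void exp_block4(const double *in, double *out) {
    for (int lane = 0; lane < 4; ++lane)
        out[lane] = std::exp(in[lane]);
}

// Apply exp() to n doubles: full blocks of 4 first, then a scalar tail.
void exp_array(const double *in, double *out, std::size_t n) {
    assert(in != nullptr && out != nullptr);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)   // "SIMD" loop: whole blocks only
        exp_block4(in + i, out + i);
    for (; i < n; ++i)           // scalar remainder loop: 0 to 3 elements
        out[i] = std::exp(in[i]);
}
```

Note that which loop handles a given element depends only on its position in the array. That matters whenever the vector kernel and the scalar fallback are different implementations.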

Generate sleefinline_advsimd.h from SLEEF 4.0.0 on ARM64 macOS and enable ARM NEON support in sleef_config.h. This provides 2-wide vectorized transcendental functions (exp, log, log10, log2) on Apple Silicon and other ARM64 platforms.

- Update eidos_functions_math.cpp to call SIMD functions when OpenMP disabled
- Update SpatialMap::exp() to use SIMD for consistent results
- Add command-line override support (-DEIDOS_SLEEF_AVAILABLE=0) for testing
- Add SLEEF benchmark script and ARM header generation script

Performance improvement on x86_64 AVX2 (1M elements):
- exp(): 8.30ms -> 4.05ms (2.1x speedup)
- log(): 6.17ms -> 3.37ms (1.8x speedup)
- log10(): 10.79ms -> 3.66ms (2.9x speedup)
- log2(): 5.81ms -> 3.99ms (1.5x speedup)
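A minimal sketch of how a command-line override like -DEIDOS_SLEEF_AVAILABLE=0 can coexist with automatic detection. The macro name comes from this PR; the detection conditions and the `sleef_enabled` helper are assumptions for illustration, and the real logic in sleef_config.h may differ.

```cpp
#include <cassert>

// Only auto-detect when the build did not already set the macro, so that
// -DEIDOS_SLEEF_AVAILABLE=0 on the compiler command line always wins.
#ifndef EIDOS_SLEEF_AVAILABLE
  #if defined(__AVX2__) && defined(__FMA__)
    #define EIDOS_SLEEF_AVAILABLE 1   // x86_64 with AVX2 and FMA (assumed test)
  #elif defined(__aarch64__)
    #define EIDOS_SLEEF_AVAILABLE 1   // ARM64 with NEON (assumed test)
  #else
    #define EIDOS_SLEEF_AVAILABLE 0   // scalar fallback
  #endif
#endif

// Hypothetical helper so ordinary code can branch on the setting.
constexpr bool sleef_enabled() { return EIDOS_SLEEF_AVAILABLE != 0; }
```

Building the same source once with and once without the override gives a quick way to compare the SLEEF and scalar paths on one machine.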

- Add AVX2/FMA detection for Windows/MinGW builds in CMakeLists.txt
- Create cmake/toolchain-mingw64.cmake for cross-compilation testing
- Verified SLEEF compiles and runs correctly on Windows via Wine

This enables the same SLEEF-powered exp/log/log10/log2 speedups on Windows that we have on Linux and macOS.

Consolidates all SIMD-related documentation and scripts in one location. Updates SLEEF_HEADER_GENERATION.md with corrected path references.

Adds AVX2 header generation script matching the ARM script style. Updates SLEEF_HEADER_GENERATION.md to reference the script file.

SLEEF and std::exp produce slightly different results at the ULP (unit in the last place) level. When spatial maps reorder data internally, different elements end up in the scalar remainder loop vs the SIMD loop, causing identical() to fail even though both results are numerically correct.
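To make "different at the ULP level" concrete, here is a small sketch that measures how many representable doubles separate two values. It is illustrative only and is not how this PR resolves the identical() failure; `ordered_bits`, `ulp_distance`, and `nearly_identical` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Map a double's bit pattern onto an integer line that is monotonic in the
// value, so adjacent doubles map to adjacent integers (and -0.0 maps to 0).
static std::int64_t ordered_bits(double x) {
    std::int64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits < 0 ? std::numeric_limits<std::int64_t>::min() - bits : bits;
}

// Number of representable doubles between a and b (0 means bit-identical,
// apart from the sign of zero). NaN inputs are not supported.
std::uint64_t ulp_distance(double a, double b) {
    assert(!std::isnan(a) && !std::isnan(b));
    const std::int64_t ia = ordered_bits(a);
    const std::int64_t ib = ordered_bits(b);
    const std::uint64_t ua = static_cast<std::uint64_t>(ia);
    const std::uint64_t ub = static_cast<std::uint64_t>(ib);
    return ia >= ib ? ua - ub : ub - ua;   // unsigned math avoids overflow
}

// Tolerant comparison: equal up to max_ulps representable steps.
bool nearly_identical(double a, double b, std::uint64_t max_ulps) {
    return ulp_distance(a, b) <= max_ulps;
}
```

Two results that are each accurate to about 1 ULP can still differ from one another by a step or two, which is enough to make a bitwise comparison fail.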

SLEEF headers generated on Linux/GCC unconditionally define SLEEF_FLOAT128_IS_IEEEQP, but __float128 is not supported by Clang/AppleClang. This caused build failures on macos-15-intel. The fix conditionally defines SLEEF_FLOAT128_IS_IEEEQP only when the compiler actually supports __float128 (GCC with __SIZEOF_FLOAT128__). On other compilers, SLEEF falls back to a struct-based Sleef_quad type. Also updates the generation script and documentation.
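The conditional definition described above might look roughly like the following. The guard shown (real GCC plus `__SIZEOF_FLOAT128__`) is an assumption based on the description; the exact condition in the patched headers may differ. `demo_quad` and `quad_backend` are hypothetical names standing in for SLEEF's own quad type so the sketch builds without the headers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed guard: define the macro only when the compiler is GCC proper and
// advertises __float128 support. Clang/AppleClang take the #else branch.
#if defined(__GNUC__) && !defined(__clang__) && defined(__SIZEOF_FLOAT128__)
  #define SLEEF_FLOAT128_IS_IEEEQP
#endif

#if defined(SLEEF_FLOAT128_IS_IEEEQP)
typedef __float128 demo_quad;                 // native IEEE quad precision
#else
struct demo_quad { std::uint64_t lo, hi; };   // hypothetical 128-bit struct fallback
#endif

// Report which representation this translation unit ended up with.
const char *quad_backend() {
#if defined(SLEEF_FLOAT128_IS_IEEEQP)
    return "ieee-qp";
#else
    return "struct";
#endif
}
```

Either branch yields a 16-byte type, so code that only stores or passes the value through does not need to care which one was selected.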

andrewkern · 2025-12-15T14:59:26Z

One question for you @bhaller, when you get a chance to review this: are you happy with where the documentation currently lives in simd_benchmarks/?

Another option might be to put simd_benchmarks/SLEEF_HEADER_GENERATION.md and the header patching scripts simd_benchmarks/generate_*_sleef.sh in the same directory as the header files themselves, eidos/sleef/.

I didn't do this originally to keep it clean, but I'd like this to be as clear as possible to anyone working on this stuff in the future.

bhaller · 2025-12-15T16:19:58Z

> One question for you @bhaller, when you get a chance to review this: are you happy with where the documentation currently lives in simd_benchmarks/?
>
> Another option might be to put simd_benchmarks/SLEEF_HEADER_GENERATION.md and the header patching scripts simd_benchmarks/generate_*_sleef.sh in the same directory as the header files themselves, eidos/sleef/.
>
> I didn't do this originally to keep it clean, but I'd like this to be as clear as possible to anyone working on this stuff in the future.

I would move them as you suggest, yes. Keep the sleef stuff in eidos/sleef, and keep the simd_benchmarks folder as SIMD benchmarks (whether related to sleef or not). That seems like a clean conceptual division.

bhaller · 2025-12-15T16:37:34Z

OK, I did a quick review. AFAICT this is good to merge as soon as that move, and the other minutiae we discussed on Slack, have been done. Ping me when it's ready. Thanks, this is amazing!

andrewkern added 12 commits December 14, 2025 09:29

Add SIMD/SLEEF build flags documentation

d5646d7

Add SLEEF header generation and patching documentation

b304832

Move generate_arm_sleef.sh to simd_benchmarks/ directory

6570baf

Consolidates all SIMD-related documentation and scripts in one location. Updates SLEEF_HEADER_GENERATION.md with corrected path references.

Add generate_avx2_sleef.sh script and update documentation

07a4a4f

Adds AVX2 header generation script matching the ARM script style. Updates SLEEF_HEADER_GENERATION.md to reference the script file.

SIMD documentation

95e2ae4

clean up whitespace issues

070a17b

andrewkern mentioned this pull request Dec 15, 2025

add SLEEF in SLiM for SIMD vectorization of lots of math functions #586

Closed

andrewkern marked this pull request as ready for review December 15, 2025 14:55

document cross-compiling toolchain

750d980

moving and light touch editing SLEEF documentation

1112b10

bhaller merged commit 0094aac into MesserLab:master Dec 15, 2025
17 checks passed

andrewkern deleted the sleef branch December 15, 2025 17:23

andrewkern mentioned this pull request Dec 16, 2025

optimizing spatial interaction kernel calculation -- during kd-tree traversal or no? #589

Closed
