@hrushitfujitsu hrushitfujitsu commented Dec 17, 2025

The current CPU backend does not support Arm SVE. This PR adds an SVE check to kernel-builder so that SVE support for CPU kernels is detected and enabled during the CMake configuration phase.

This is needed for the SVE implementation of the Mamba sequential scan algorithm: huggingface/transformers#38185

Comment on lines +46 to +49
check_for_sve(HAVE_SVE)
if(HAVE_SVE)
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=armv8.2-a+sve")
endif()
Member

@danieldk danieldk Dec 17, 2025

This will end up compiling every source file with -march=armv8.2-a+sve whenever the compiler supports it, which means the CPU kernels will fail on all ARM64 devices without SVE.

I think the best way to tackle this is to have some code, compiled with the default flags, that does a runtime CPU feature check and dispatches to the SVE or non-SVE path (even if the non-SVE path just raises an error message). Only the kernel with the SVE path is then compiled with the -march=armv8.2-a+sve flag.

Here is a similar case where a kernel is built to work both with and without AVX512:

https://github.com/huggingface/kernels-community/blob/04a14c8356fa6020746ef47d1fd63ac4c7b5978d/rmsnorm/build.toml#L8
https://github.com/huggingface/kernels-community/blob/04a14c8356fa6020746ef47d1fd63ac4c7b5978d/rmsnorm/build.toml#L19
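
The dispatch approach described above could look roughly like the following. This is only a sketch under assumptions, not the code from this PR or from kernel-builder: `scale`, `scale_generic`, `scale_sve`, and the `KERNEL_HAS_SVE_TU` macro are hypothetical names, and the runtime check assumes Linux on AArch64, where `getauxval(AT_HWCAP)` reports the `HWCAP_SVE` bit. The dispatcher and the fallback are built with the default flags; only the file defining `scale_sve` would get `-march=armv8.2-a+sve`.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP
#ifndef HWCAP_SVE
#define HWCAP_SVE (1UL << 22)  // SVE bit in AT_HWCAP on Linux/AArch64
#endif
#endif

// Runtime CPU feature check. This file is compiled with the default flags,
// so it is safe to run on ARM64 devices without SVE.
bool cpu_has_sve() {
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return false;  // assumption: treat every other platform as "no SVE"
#endif
}

// Hypothetical portable path, also compiled with the default flags.
void scale_generic(float* x, std::size_t n, float a) {
    for (std::size_t i = 0; i < n; ++i) {
        x[i] *= a;
    }
}

#ifdef KERNEL_HAS_SVE_TU
// Hypothetical SVE path. Assumed to be defined in a separate source file,
// the only one compiled with -march=armv8.2-a+sve. KERNEL_HAS_SVE_TU is an
// assumed macro that the build system would set when that file is built.
void scale_sve(float* x, std::size_t n, float a);
#endif

// Entry point: choose the implementation at run time.
void scale(float* x, std::size_t n, float a) {
    assert(x != nullptr || n == 0);
#ifdef KERNEL_HAS_SVE_TU
    if (cpu_has_sve()) {
        scale_sve(x, n, a);
        return;
    }
#endif
    scale_generic(x, n, a);
}
```

Callers only ever use `scale(...)`. On a device without SVE, or in a build without the SVE source file, the call falls through to `scale_generic`, which could just as well raise an error if no fallback kernel exists.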
