Are there any plans to introduce the Sliced ELL format?
The format is described in:
Kreutzer, M., Hager, G., Wellein, G., Fehske, H., & Bishop, A. R. (2014). A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units. SIAM Journal on Scientific Computing, 36(5), C401-C423. https://doi.org/10.1137/130930352
This format is supported in NVPL and cuSPARSE.
There is also a second similar format named SELL-P (P for padded), introduced in
Anzt, H., Tomov, S., & Dongarra, J. (2014). Implementing a Sparse Matrix Vector Product for the SELL-C/SELL-C-σ formats on NVIDIA GPUs. University of Tennessee, Tech. Rep. ut-eecs-14-727. https://library.eecs.utk.edu/files/ut-eecs-14-727.pdf
The difference is that the SELL-P format also adds padding in the row direction, so that the padded row length is a multiple of the number of hardware units (SIMD lanes, GPU threads). In that sense, a SELL-P matrix is always a valid SELL-C matrix; only the slice offsets differ.
I imagine the block sizes could be constrained to: 4, 8, 16, 32 and 64.
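
To make the request more concrete, here is a minimal Python sketch of how I understand the layout. It is not taken from either paper or from any library; the names `csr_to_sell`, `sell_spmv`, `slice_ptr` and the `pad_to` parameter are all hypothetical, and the σ-sorting step of SELL-C-σ is left out. With `pad_to=1` it builds plain SELL-C; with `pad_to>1` it rounds each slice width up to a multiple of `pad_to`, which is my reading of the SELL-P padding.

```python
def csr_to_sell(indptr, indices, data, n_rows, C=4, pad_to=1):
    """Convert CSR arrays to a sliced ELL layout with slice height C.

    pad_to=1 gives plain SELL-C. pad_to>1 rounds every slice width up to a
    multiple of pad_to (assumed interpretation of the SELL-P padding).
    """
    slice_ptr = [0]          # offset of each slice in cols/vals
    cols, vals = [], []
    for start in range(0, n_rows, C):
        rows = range(start, min(start + C, n_rows))
        # Slice width = longest row in the slice, optionally rounded up.
        width = max(indptr[r + 1] - indptr[r] for r in rows)
        width = -(-width // pad_to) * pad_to
        # Column-major inside the slice: entry j of all C rows is contiguous.
        for j in range(width):
            for lane in range(C):
                r = start + lane
                if r < n_rows and indptr[r] + j < indptr[r + 1]:
                    k = indptr[r] + j
                    cols.append(indices[k])
                    vals.append(data[k])
                else:
                    # Padding entry: zero value, column 0 is a safe dummy index.
                    cols.append(0)
                    vals.append(0.0)
        slice_ptr.append(len(vals))
    return slice_ptr, cols, vals


def sell_spmv(slice_ptr, cols, vals, x, n_rows, C=4):
    """Compute y = A @ x; the slice width is recovered from the slice offsets."""
    y = [0.0] * n_rows
    for s in range(len(slice_ptr) - 1):
        base = slice_ptr[s]
        width = (slice_ptr[s + 1] - base) // C
        for j in range(width):
            for lane in range(C):
                r = s * C + lane
                if r < n_rows:
                    k = base + j * C + lane
                    y[r] += vals[k] * x[cols[k]]
    return y


# 3x3 example: [[1, 0, 2], [0, 3, 0], [4, 5, 6]] in CSR form.
indptr, indices, data = [0, 2, 3, 6], [0, 2, 1, 0, 1, 2], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sell_c = csr_to_sell(indptr, indices, data, 3, C=2)            # SELL-C
sell_p = csr_to_sell(indptr, indices, data, 3, C=2, pad_to=4)  # padded widths
x = [1.0, 1.0, 1.0]
y_c = sell_spmv(*sell_c, x, 3, C=2)
y_p = sell_spmv(*sell_p, x, 3, C=2)
```

In this sketch the same `sell_spmv` routine handles both layouts and only `slice_ptr` changes between them, which is the sense in which a padded matrix is still a valid SELL-C matrix.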