NVIDIA TensorRT supports a sparse mode on the latest Ampere architecture similar to the one in the author's paper, but its actual speedup is very poor and is only observable on sizes larger than 1024*1024.
The author's paper does not use hardware instruction set support like NVIDIA's; it relies only on a handwritten kernel, yet it reports a greater speedup ratio than NVIDIA's paper. I therefore think the author should release the source code to address everyone's doubts.