A training-free sparse attention implementation. See the accompanying report for details.
Requires `torch >= 2.9.0`.

Install:

```bash
pip install -e .
```
Sparse TFLOPS are computed using the full-attention FLOP formula, so computation skipped by the sparse kernel shows up as higher effective throughput (a sketch of the calculation follows the table).
| Sage Attn 2++ | MPSA Sparse Attn (0.8) |
|---|---|
| 440 TFLOPS | 1320 TFLOPS |
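As a reference for how the throughput numbers are obtained, here is a minimal sketch of the effective-TFLOPS calculation under the standard full-attention FLOP count; the shapes and kernel times below are hypothetical placeholders, not values from the benchmark script:

```python
def effective_attention_tflops(batch, heads, seq_len, head_dim, seconds):
    """Effective TFLOPS, always charging the kernel for full attention.

    FLOPs = 2 matmuls (QK^T and PV) x 2 FLOPs per multiply-add
          = 4 * batch * heads * seq_len^2 * head_dim
    A sparse kernel that skips most blocks is still credited with the full
    count, so its effective TFLOPS can exceed the dense hardware peak.
    """
    flops = 4 * batch * heads * seq_len * seq_len * head_dim
    return flops / seconds / 1e12

# Hypothetical shapes and kernel times, for illustration only (not measurements).
shapes = dict(batch=1, heads=40, seq_len=32768, head_dim=128)
print(effective_attention_tflops(**shapes, seconds=0.050))  # e.g. dense kernel time
print(effective_attention_tflops(**shapes, seconds=0.017))  # e.g. sparse kernel time
```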
Simple Wan 2.2 14B end-to-end example:

```bash
python example.py
```
Benchmark the sparse attention kernel:

```bash
python test/test_mpsa.py
```
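For a quick sanity check of the dense baseline outside the repo's benchmark, a minimal sketch using PyTorch's built-in `scaled_dot_product_attention` and CUDA-event timing could look like the following; the shapes are illustrative, and `test/test_mpsa.py` remains the reference benchmark and the only place the MPSA kernel itself is exercised:

```python
import torch
import torch.nn.functional as F

def bench_full_attention(batch=1, heads=40, seq=32768, dim=128, iters=10):
    """Time dense attention and report TFLOPS using the full-attention formula."""
    q, k, v = (torch.randn(batch, heads, seq, dim, device="cuda", dtype=torch.float16)
               for _ in range(3))
    # Warm up so kernel selection is not included in the timing.
    for _ in range(3):
        F.scaled_dot_product_attention(q, k, v)
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        F.scaled_dot_product_attention(q, k, v)
    end.record()
    torch.cuda.synchronize()
    seconds = start.elapsed_time(end) / 1e3 / iters  # elapsed_time is in ms
    flops = 4 * batch * heads * seq * seq * dim  # QK^T + PV, 2 FLOPs per multiply-add
    print(f"{flops / seconds / 1e12:.1f} TFLOPS at {seconds * 1e3:.2f} ms/iter")

if __name__ == "__main__":
    bench_full_attention()
```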
The results below were generated with the Wan 2.2 14B model.
Two galleries of side-by-side samples: Full Attn (left) vs. Sparse Attn (right).