Gujing-Ace/SGEMM_Optim


Optimize SGEMM step by step

This project is a step-by-step guide to optimizing SGEMM (single-precision general matrix multiplication).

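Note: Matrices in cuBLAS are stored in column-major order. For ease of comparison, the macros used by the kernels below are also based on column-major order, so their output can be checked directly against the matrices computed by cuBLAS.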
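As a rough illustration of the column-major convention in the note above, here is a minimal host-side sketch. The macro name `IDX` and the function `sgemm_ref` are hypothetical and are not taken from this repository; they only show how an element at (row, col) maps to a flat array offset, and how a plain CPU reference for `C = alpha * A * B + beta * C` could be written against that layout.

```cpp
#include <cassert>

// Hypothetical helper (not the repository's macro): in a column-major matrix
// with leading dimension ld, element (row, col) is stored at row + col * ld.
#define IDX(row, col, ld) ((row) + (col) * (ld))

// CPU reference: C = alpha * A * B + beta * C, with every matrix column-major.
// A is M x K, B is K x N, C is M x N.
void sgemm_ref(int M, int N, int K, float alpha,
               const float* A, int lda,
               const float* B, int ldb,
               float beta, float* C, int ldc) {
    // Each leading dimension must cover at least one full column.
    assert(lda >= M && ldb >= K && ldc >= M);
    for (int j = 0; j < N; ++j) {          // column of C
        for (int i = 0; i < M; ++i) {      // row of C
            float acc = 0.0f;
            for (int k = 0; k < K; ++k) {  // reduction dimension
                acc += A[IDX(i, k, lda)] * B[IDX(k, j, ldb)];
            }
            C[IDX(i, j, ldc)] = alpha * acc + beta * C[IDX(i, j, ldc)];
        }
    }
}
```

A reference of this kind is mainly useful for checking small cases by hand. For example, the 2 x 3 matrix with rows (1, 2, 3) and (4, 5, 6) is stored column by column as `{1, 4, 2, 5, 3, 6}`.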

Kernel 1
