Add scalar reduction codegen schedule #1284

Open: Yancey0623 wants to merge 10 commits into main from scalar_reduction
@Yancey0623 Yancey0623 commented Mar 1, 2024

Add a scalar-reduction codegen template. The algorithm comes from https://developer.download.nvidia.com/assets/cuda/files/reduction.pdf
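For readers unfamiliar with the linked slides, here is a minimal CPU-side sketch in Python of the tree-style reduction they describe (the sequential-addressing variant), applied in passes until a single scalar remains. It is an illustration only, not the template added in this PR: the names `block_reduce` and `scalar_reduce`, the `block_size` parameter, and the choice of a sum as the reduction op are all assumptions made for the example.

```python
def block_reduce(values, block_size):
    """Sum up to `block_size` values the way one thread block would."""
    # Stand-in for shared memory, padded with the additive identity.
    shared = list(values) + [0.0] * (block_size - len(values))
    stride = block_size // 2
    while stride > 0:
        # On a GPU, threads with tid < stride do this step in parallel,
        # followed by a barrier; here the loop plays both roles.
        for tid in range(stride):
            shared[tid] += shared[tid + stride]
        stride //= 2
    return shared[0]


def scalar_reduce(data, block_size=256):
    """Reduce `data` to one scalar with repeated block-level passes."""
    # Hypothetical constraint for this sketch: power-of-two block size >= 2.
    if block_size < 2 or block_size & (block_size - 1) != 0:
        raise ValueError("block_size must be a power of two >= 2")
    partials = list(data)
    if not partials:
        return 0.0
    # Each pass mimics one kernel launch: every block writes one partial sum.
    while True:
        partials = [
            block_reduce(partials[i:i + block_size], block_size)
            for i in range(0, len(partials), block_size)
        ]
        if len(partials) == 1:
            return partials[0]
```

Calling `scalar_reduce(list(range(1000)), block_size=256)` runs two passes: the first produces four partial sums, the second folds them into one value. A real GPU schedule would typically also unroll the last warp and have each thread accumulate several elements before the tree step, as the slides discuss.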

Benchmark with PyTorch:

| $bs x $seqlen x 151936 | disc | PyTorch |
| --- | --- | --- |
| 2x768x151936xf32 | 0.53 ms | 0.55 ms |
| 2x1024x151936xf32 | 0.67 ms | 0.7 ms |
| 2x2048x151936xf32 | 1.38 ms | 1.4 ms |

@Yancey0623 Yancey0623 changed the title [WIP]support scalar reduction support scalar reduction Mar 8, 2024
eedalong previously approved these changes Mar 12, 2024
@eedalong eedalong left a comment

LGTM

@Yancey0623 Yancey0623 changed the title support scalar reduction Add scalar reduction codegen schedule Mar 20, 2024
@eedalong eedalong self-requested a review March 22, 2024 02:08
eedalong previously approved these changes Mar 22, 2024