[model] feat: Add ascend fused operators for Qwen3VL #323
Code Review
This pull request introduces fused operators for Ascend NPUs, focusing on AdamW and RoPE for Qwen3VL models. While the RoPE implementation appears solid, the fused AdamW implementation has a couple of significant issues. Firstly, it contains hardcoded CUDA device calls, which is a critical error for NPU-targeted code and will cause it to fail. Secondly, the implementation of the fused AdamW operator uses a loop over parameters, which negates the performance benefits of fusion. I have provided specific suggestions to address these high-priority issues.
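One way to avoid the hardcoded CUDA device calls mentioned above is to resolve the device type at runtime. The following is a minimal sketch, not the code from this PR: `resolve_device_type` is a hypothetical helper, and it assumes the Ascend plugin (`torch_npu`) exposes a `torch.npu` namespace with `is_available()` once imported.

```python
def resolve_device_type(torch_module) -> str:
    """Return "npu", "cuda", or "cpu" based on what the given torch module reports.

    Hypothetical helper: the torch module is passed in explicitly so the
    function does not hardcode any backend and can be checked with stand-ins.
    """
    # Prefer NPU, then CUDA; a backend namespace may be missing entirely
    # (for example, torch.npu is assumed to exist only after importing torch_npu).
    for name in ("npu", "cuda"):
        backend = getattr(torch_module, name, None)
        if backend is not None and backend.is_available():
            return name
    return "cpu"
```

In optimizer code, the result could be used as `torch.device(resolve_device_type(torch))` in place of a literal `"cuda"` string.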
Could you provide a visual comparison of accuracy and performance?
code format
What does this PR do?
This PR adds the Ascend fused-operator patch for the Qwen3VL model.
Checklist Before Starting
Format the PR title as [{modules}] {type}: {description} (this will be checked by the CI).
{modules} include misc, ci, config, docs, data, dist, omni, logging, model, optim, ckpt, release, task, perf, ops, parallel.
Separate multiple modules with a comma, like [ci, data, model].
{type} is one of feat, fix, refactor, chore, test.
For a breaking change, add [BREAKING] to the beginning of the title.
Example: [BREAKING][parallel, model] feat: dynamic batching
Test
API and Usage Example
# Add code snippet or script demonstrating how to use this
Design & Code Changes
Checklist Before Submitting
Important
Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.
pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
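The title convention from "Checklist Before Starting" can also be checked locally before pushing. The sketch below is an assumption about how such a check might look, not the project's actual CI rule; `is_valid_title`, `MODULES`, and `TYPES` are hypothetical names, with the allowed values copied from the checklist.

```python
import re

# Allowed values as listed in the PR template.
MODULES = {
    "misc", "ci", "config", "docs", "data", "dist", "omni", "logging",
    "model", "optim", "ckpt", "release", "task", "perf", "ops", "parallel",
}
TYPES = {"feat", "fix", "refactor", "chore", "test"}

# Optional [BREAKING] prefix, then [modules] type: description
TITLE_RE = re.compile(
    r"^(?:\[BREAKING\])?\[(?P<modules>[^\]]+)\] (?P<type>\w+): (?P<description>.+)$"
)


def is_valid_title(title: str) -> bool:
    """Return True if the title follows the template's naming convention."""
    match = TITLE_RE.match(title)
    if match is None:
        return False
    # Modules are comma-separated inside the brackets.
    modules = [m.strip() for m in match.group("modules").split(",")]
    return all(m in MODULES for m in modules) and match.group("type") in TYPES
```

For example, `is_valid_title("[model] feat: Add ascend fused operators for Qwen3VL")` accepts this PR's title, while a title without the bracketed module list is rejected.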