@LiuXTao (Contributor) commented Oct 26, 2025

This PR adds the bridge for qwen3-next. It must be used together with the Mcore PR (NVIDIA/Megatron-LM#1958) to ensure the correctness of the delta_net out_norm.

I have already aligned the forward logits between the HF model and the Mcore model.
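For readers who want to reproduce that kind of check, here is a minimal sketch of a logit comparison. It is an assumption about how such an alignment could be verified, not the procedure actually used in this PR: `compare_logits`, the tolerance `atol`, and the sample values are all hypothetical, and plain Python lists stand in for the tensors each model would return.

```python
import math


def compare_logits(ref, test, atol=1e-3):
    """Compare two flat sequences of logits.

    Returns (max_abs_diff, cosine_similarity, ok), where ok is True when
    the largest element-wise difference is within atol.
    """
    if len(ref) != len(test):
        raise ValueError("logit vectors must have the same length")
    if not ref:
        raise ValueError("logit vectors must not be empty")

    # Largest element-wise deviation between the two models.
    max_abs_diff = max(abs(r - t) for r, t in zip(ref, test))

    # Cosine similarity as a scale-insensitive second signal.
    dot = sum(r * t for r, t in zip(ref, test))
    norm = math.sqrt(sum(r * r for r in ref)) * math.sqrt(sum(t * t for t in test))
    cosine = dot / norm if norm > 0 else float("nan")

    return max_abs_diff, cosine, max_abs_diff <= atol


# Hypothetical values standing in for one token position's logits.
hf_logits = [2.0, -1.0, 0.5, 3.25]
mcore_logits = [2.0004, -1.0002, 0.4999, 3.2497]

max_diff, cos, ok = compare_logits(hf_logits, mcore_logits)
print(f"max abs diff = {max_diff:.6f}, cosine = {cos:.6f}, within tolerance = {ok}")
```

In a real comparison the two lists would be replaced by the flattened outputs of the same input batch run through both models, and the tolerance would depend on the dtype in use.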

feat: qwen3 next

fix: qwen3next gated conv weight split tp

feat: qwen3next mcore to hf

remove useless log

@ISEEKYAN (Owner)

Thanks for your support, but this project is being deprecated. I sincerely recommend taking a look at Megatron-Bridge, which is the official version with long-term maintenance.
