add --overlap-param-gather support for layer-wise optimizer. lots of unit tests. #3524
Merged
deepakn94 merged 34 commits into NVIDIA:main from mchrzanowski:overlap-param-gather-muon-rebased on Mar 5, 2026 (+1,175 −116)
Commits
e317e8d mchrzanowski: add --overlap-param-gather support for layer-wise optimizer (muon)
d8189e4 mchrzanowski: Fix NaN in overlap-param-gather for layer-wise optimizer (Muon)
2df43c9 mchrzanowski: Add unit tests for overlap-param-gather in layer-wise optimizer
fa0ceb9 mchrzanowski: Remove use_layer_wise_optimizer from DDP config
bbed683 mchrzanowski: Add comments explaining overlap_param_gather replacing use_layer_wise…
97fba8b mchrzanowski: Merge branch 'main' into overlap-param-gather-muon-rebased
46bc317 mchrzanowski: Run autoformat (black, isort) on changed files
7027091 mchrzanowski: Merge branch 'main' into overlap-param-gather-muon-rebased
0c6a632 mchrzanowski: Merge branch 'main' into overlap-param-gather-muon-rebased
3e88f38 mchrzanowski: Free overlap param-gather buffers before async checkpoint save to fix…
3bf616c mchrzanowski: Add unit tests for free_overlap_buffers
f111804 mchrzanowski: Replace all_gather with per-rank broadcasts for layer-wise param gather
211a0ca mchrzanowski: Autoformat: black formatting fixes
c7ee958 mchrzanowski: Remove assertions blocking overlap_grad_reduce and overlap_param_gath…
cbed167 mchrzanowski: Fix dtype mismatch in layer-wise param gather broadcasts causing NCCL…
7726a35 mchrzanowski: Fix timing-dependent NCCL deadlock in layer-wise param gather by wait…
1b0f0b6 mchrzanowski: Merge branch 'main' into overlap-param-gather-muon-rebased
5253153 mchrzanowski: Add missing regression tests for layer-wise optimizer overlap param g…
189428b mchrzanowski: Autoformat: black formatting fixes
d07bb57 mchrzanowski: Switch layer-wise optimizer tests from adam to dist_muon
3a0afd1 mchrzanowski: Add back overlap assertions for muon (but not dist_muon)
97b9916 mchrzanowski: Remove redundant assert in finish_param_sync
e5a53a2 mchrzanowski: Replace per-rank broadcasts with all_gather for layer-wise param gather
be0d438 mchrzanowski: Merge branch 'main' into overlap-param-gather-muon-rebased
6655a10 mchrzanowski: Merge branch 'main' into overlap-param-gather-muon-rebased
87df1dd mchrzanowski: Merge branch 'main' into overlap-param-gather-muon-rebased
a566a7b mchrzanowski: Skip --overlap-grad-reduce requirement for layer-wise optimizers
f30914e mchrzanowski: Rename lw -> layerwise, improve docstring, clear handles in wait()
23c4516 mchrzanowski: Fix tests to use renamed layerwise_ attribute prefix
d77abd5 mchrzanowski: Re-enable --overlap-grad-reduce assertion for --overlap-param-gather
079e2a1 deepakn94: Merge branch 'main' into overlap-param-gather-muon-rebased
b22ecd3 mchrzanowski: Run autoformat.sh
64dd7ea mchrzanowski: Merge branch 'main' into overlap-param-gather-muon-rebased
62b5ae2 mchrzanowski: Add missing docstring to fix lint error