Commit b14a3b6

Make FP8 weights compatible with older MCore version (NVIDIA#2342)

* Make cast_master_weights_to_fp8 compatible with older MCore version

  Signed-off-by: kunlunl <kunlunl@nvidia.com>

* Rename keep_columnwise to manual_post_all_gather_processing & Optimize unit test

  Signed-off-by: kunlunl <kunlunl@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* Remove redundant _test_mini_optimizer()

  Signed-off-by: kunlunl <kunlunl@nvidia.com>

---------

Signed-off-by: kunlunl <kunlunl@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

1 parent f3b97c2, commit b14a3b6

3 files changed: +771 -726 lines changed

tests/pytorch/distributed
- run_cast_master_weights_to_fp8.py
- test_cast_master_weights_to_fp8.py
transformer_engine/pytorch/tensor
- utils.py
