
[pull] main from NVIDIA:main #503

Merged
pull[bot] merged 2 commits into phu0ngng:main from NVIDIA:main
Mar 3, 2026


pull bot commented Mar 3, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

vthumbe1503 and others added 2 commits March 3, 2026 06:28
* add all the optimizations

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* requires_grad optimization

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* test if commenting out requires_grad works

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* fix minor bug

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* fix ci

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* missed a bug

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* Update transformer_engine/pytorch/csrc/quantizer.cpp

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: vthumbe1503 <vthumbe@nvidia.com>

* fix some bugs pointed to by copilot

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* linting error

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* fix the error

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix the bug

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* get rid of the change

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* fix the transpose shape bug

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* minor linter fix

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* fix lint

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* fix linting error

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* address copilot review comment regarding error check when both data and transpose are None

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix linting errors

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* missed a merge conflict

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* final optimizations

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix ci error

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* address review comment from greptile

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* address review comment + stride optimization

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* address linter issue

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor lint

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* fix ci bug

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* another optimization to do at::native::empty_cuda directly instead of at::empty

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* cleanups

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* better solution for device

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* enum to int cache

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused function

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* Update transformer_engine/pytorch/tensor/float8_blockwise_tensor.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: vthumbe1503 <vthumbe@nvidia.com>

* index instead of device bug

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix ci:

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* debug quantized tensor fix

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* revert cudnnt front end change

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

---------

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>
Signed-off-by: vthumbe1503 <vthumbe@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

fix fast_set_attr in other nn modules for fsdp

Signed-off-by: Varun Thumbe <vthumbe@nvidia.com>

pull bot locked and limited conversation to collaborators Mar 3, 2026
pull bot added the ⤵️ pull label Mar 3, 2026
pull bot merged commit c68ec31 into phu0ngng:main Mar 3, 2026
7 of 9 checks passed
pull bot had a problem deploying to github-pages March 3, 2026 04:33 (Failure)