
[pull] main from NVIDIA:main#501

Merged
pull[bot] merged 3 commits into phu0ngng:main from NVIDIA:main
Mar 2, 2026

Conversation


@pull pull bot commented Mar 2, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

buptzyb and others added 3 commits March 2, 2026 13:30
…#2715)

Remove is_first_microbatch setting after warmup

Signed-off-by: Robin Zhang <robinz@nvidia.com>
#2720)

* fix topk=1

Signed-off-by: tongliu <tongliu@nvidia.com>

* add topk=1 ut

Signed-off-by: tongliu <tongliu@nvidia.com>

---------

Signed-off-by: tongliu <tongliu@nvidia.com>
* support cuda graph capture offloading module

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove reset_hook and init_chunk_handler_hook

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* remove reset_hook and init_chunk_handler_hook

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* minor fix

Signed-off-by: root <root@eos0046.eos.clusters.nvidia.com>

* temp fix overlap-grad-reduce

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* reuse mark_not_offload() and do not offload scale_inv

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* temp fix for mxfp8

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* fix bug for record_stream and from_blob

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* disable offloading core_attn_out and refine cpu overhead of at::empty

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* minor fix

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* return ptr of whole buffer and offload the whole buffer

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Apply suggestions from code review

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* remove code changes of offloading and quantizer

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* minor fix

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* minor fix

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* minor fix

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* minor fix

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* minor fix

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* add docstring

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

---------

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>
Signed-off-by: root <root@eos0046.eos.clusters.nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: root <root@eos0046.eos.clusters.nvidia.com>
Co-authored-by: root <root@eos0022.eos.clusters.nvidia.com>
@pull pull bot locked and limited conversation to collaborators Mar 2, 2026
@pull pull bot added the ⤵️ pull label Mar 2, 2026
@pull pull bot merged commit bba7bf6 into phu0ngng:main Mar 2, 2026
@pull pull bot had a problem deploying to github-pages March 2, 2026 10:33 Failure

3 participants