Conversation
Signed-off-by: oliver könig <okoenig@nvidia.com>
Force-pushed d136f6d to 6130f07
/claude review
```python
choice_data["kv_cache_staleness"] = result["kv_cache_staleness"]
choice_data["num_evictions"] = sum(
    1 for e in result["events"] if e.get("type") == "EVICT"
)
```
Code Review
Bug: response shape change for previously-optional fields
The old code only included policy_staleness, kv_cache_staleness, and num_evictions in the response when they had meaningful values. The new code unconditionally includes them:

- `policy_staleness`/`kv_cache_staleness` will be `None` for any request where those tensors were never populated (they're `Optional[torch.Tensor] = None` on `DynamicInferenceRequest`).
- `num_evictions` will be `0` even when there were no eviction events.
This is a breaking change to the response schema — consumers that treat the absence of these keys as "not applicable" will now always see them. If always-present fields are intentional, that's fine, but it should be explicit and clients should be updated.
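If preserving the old contract is the goal, one option is to attach these fields only when they carry meaningful values. A minimal sketch, assuming the `choice_data`/`result` dict shapes from the snippet above (the helper name `attach_telemetry` is hypothetical):

```python
# Hypothetical helper: only attach optional telemetry fields when they have
# meaningful values, preserving the "absent means not applicable" contract.
def attach_telemetry(choice_data: dict, result: dict) -> dict:
    if result.get("policy_staleness") is not None:
        choice_data["policy_staleness"] = result["policy_staleness"]
    if result.get("kv_cache_staleness") is not None:
        choice_data["kv_cache_staleness"] = result["kv_cache_staleness"]
    # Only report evictions when at least one EVICT event occurred.
    num_evictions = sum(
        1 for e in result.get("events", []) if e.get("type") == "EVICT"
    )
    if num_evictions:
        choice_data["num_evictions"] = num_evictions
    return choice_data
```

With this shape, consumers that key off the absence of these fields keep working, while populated requests still see the new values.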
Potential TypeError from chat_template_kwargs overlap
```python
prompt_tokens = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    tools=req.get("tools", None),
    **req.get("chat_template_kwargs", {}),
)
```

If a caller passes `tokenize`, `add_generation_prompt`, or `tools` inside `chat_template_kwargs`, Python raises `TypeError: got multiple values for keyword argument`. There is no validation guarding against this; consider an explicit check, or stripping overlapping keys with a warning.
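The "strip with a warning" option could look like the following sketch. The reserved-key set mirrors the keyword arguments the endpoint passes explicitly; the helper name `safe_chat_template_kwargs` is an assumption, not existing code:

```python
import warnings

# Keys the endpoint already passes explicitly to apply_chat_template;
# user-supplied duplicates would cause a TypeError at call time.
RESERVED_KEYS = {"tokenize", "add_generation_prompt", "tools"}

def safe_chat_template_kwargs(kwargs: dict) -> dict:
    """Drop reserved keys from user-supplied chat_template_kwargs, warning once."""
    overlap = RESERVED_KEYS & kwargs.keys()
    if overlap:
        warnings.warn(
            f"Ignoring reserved chat_template_kwargs keys: {sorted(overlap)}"
        )
    return {k: v for k, v in kwargs.items() if k not in RESERVED_KEYS}
```

The endpoint would then splat `**safe_chat_template_kwargs(req.get("chat_template_kwargs", {}))` instead of the raw dict, turning a hard 500 into a logged warning.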
Missing test coverage
No tests cover the chat_completions endpoint response shape for these changed fields. The existing tests in tests/unit_tests/inference/engines/test_dynamic_engine.py exercise DynamicInferenceRequest directly, not the endpoint. Missing coverage:
- `policy_staleness`/`kv_cache_staleness` always present (possibly `None`) in the response
- `num_evictions` always `0` when there are no EVICT events
- `chat_template_kwargs` forwarding, including the overlap TypeError case
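The overlap case in particular is cheap to test without the real endpoint or tokenizer. A hedged sketch (the stub tokenizer and `build_prompt` wrapper are stand-ins for the actual endpoint code, which is not shown here):

```python
import pytest

class StubTokenizer:
    """Minimal stand-in for the HF tokenizer used by the endpoint."""
    def apply_chat_template(self, messages, tokenize, add_generation_prompt,
                            tools=None, **kwargs):
        return [0, 1, 2]

def build_prompt(tokenizer, req):
    # Mirrors the endpoint call quoted in the review above.
    return tokenizer.apply_chat_template(
        req["messages"],
        tokenize=True,
        add_generation_prompt=True,
        tools=req.get("tools", None),
        **req.get("chat_template_kwargs", {}),
    )

def test_overlapping_chat_template_kwargs_raise_typeerror():
    # Duplicating an explicitly-passed keyword must raise at the call site.
    req = {"messages": [], "chat_template_kwargs": {"tokenize": False}}
    with pytest.raises(TypeError):
        build_prompt(StubTokenizer(), req)
```

Analogous tests over the response dict would pin down the always-present-field behavior, whichever way the schema question is resolved.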
🔄 Merge queue validation started! You can track the progress here: https://github.com/NVIDIA/Megatron-LM/actions/runs/22685751754

🔄 Merge queue validation started! You can track the progress here: https://github.com/NVIDIA/Megatron-LM/actions/runs/22685854327

🔄 Merge queue validation started! You can track the progress here: https://github.com/NVIDIA/Megatron-LM/actions/runs/22692993456

🔄 Merge queue validation started! You can track the progress here: https://github.com/NVIDIA/Megatron-LM/actions/runs/22695852416
What does this PR do?
Contribution process
```mermaid
flowchart LR
    A[Pre-checks] --> B[PR Tests]
    subgraph Code Review/Approval
        C1[Expert Review] --> C2[Final Review]
    end
    B --> C1
    C2 --> D[Merge]
```

Pre-checks

Core 0.8)

Code review
The following process is enforced via the CODEOWNERS file for changes into `megatron/core`. For changes outside of `megatron/core`, it is up to the PR author whether or not to tag the Final Reviewer team.

For MRs into `main` branch
Feel free to message or tag @mcore-oncall to help accelerate your merge into `main`. The less complex your PR is, the faster it will be approved and merged!
(Step 1): Add PR label

Add the `Expert Review` label when your PR is ready for review.

(Step 2): Collect the expert reviewers' reviews

Final Review might get declined if these requirements are not fulfilled.

(Step 3): Final Review

Add the `Final Review` label.

(Optional Step 4): Cherry-pick into release branch
If this PR also needs to be merged into `core_r*` release branches, after this PR has been merged, select `Cherry-pick` to open a new PR into the release branch.

For MRs into `dev` branch
The proposed review process for the `dev` branch is under active discussion. MRs are mergeable after one approval by either eharper@nvidia.com or zijiey@nvidia.com.

Merging your PR
Any member of `core-adlr` and `core-nemo` will be able to merge your PR.