[NVIDIA] Update NVIDIA GPT-OSS vLLM image from v0.15.1 to v0.16.0#904
Thanks for the contribution! For vLLM and SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.
@ankursingh-nv re-trigger
The H200 cluster nodes register GPUs as gpu:nvidia_h200, not generic gpu, so salloc requires the full type in --gres.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This reverts commit 1e991b0.
LGTM — straightforward image tag bump from v0.15.1 to v0.16.0-cu130 across three NVIDIA GPT-OSS vLLM configs, with a minor nit on the changelog pr-link.
Overview
This PR modifies two files: .github/configs/nvidia-master.yaml (updating the vllm/vllm-openai image tag from v0.15.1 to v0.16.0-cu130 in three places — B200, H100, and H200 GPT-OSS configs) and perf-changelog.yaml (adding a corresponding changelog entry). No config flags, parameters, or logic were changed beyond the image tag.
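A claim like "three places, image tag only" can be spot-checked mechanically. The following is a minimal sketch, not the repository's tooling: it scans config text for vllm/vllm-openai image references and reports any tag that differs from the expected one. The function name, the `image:` key layout, and the B200/H100 config names in the sample are assumptions made for illustration; only gptoss-fp4-h200-vllm and the two tags appear in this PR.

```python
import re

# Matches references such as "vllm/vllm-openai:v0.16.0-cu130" and captures the tag.
IMAGE_TAG = re.compile(r"vllm/vllm-openai:([\w.\-]+)")


def stale_image_tags(config_text, expected_tag):
    """Return every vllm/vllm-openai tag in config_text that differs from expected_tag."""
    return [tag for tag in IMAGE_TAG.findall(config_text) if tag != expected_tag]


# Hypothetical config fragment; the real nvidia-master.yaml layout may differ.
sample = """
gptoss-fp4-b200-vllm:
  image: vllm/vllm-openai:v0.16.0-cu130
gptoss-fp4-h100-vllm:
  image: vllm/vllm-openai:v0.16.0-cu130
gptoss-fp4-h200-vllm:
  image: vllm/vllm-openai:v0.15.1
"""

print(stale_image_tags(sample, "v0.16.0-cu130"))
```

In this sample the third entry was deliberately left on the old tag, so the scan reports it. An empty list would indicate that every reference in the scanned text carries the expected tag.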
Security risks
None. This is a Docker image tag update in benchmark configuration files. No authentication, authorization, or data-handling code is involved.
Level of scrutiny
Low scrutiny is appropriate. This is a mechanical, pattern-following config change — identical in structure to dozens of prior changelog entries and image bumps visible in the same files. The PR re-lands a previously merged-then-reverted change (PR #800 → reverted by #903 → re-landed here as #904) with a slightly different image tag (v0.16.0-cu130 instead of v0.16.0).
Other factors
The one nit found (pr-link referencing #800 instead of #904) is valid but cosmetic — it affects changelog traceability, not functionality. The inline comment already flags this. The PR description is clear, the scope is minimal, and the change follows the established pattern throughout the repository.
    - gptoss-fp4-h200-vllm
  description:
    - "Update vLLM image from v0.15.1 to v0.16.0-cu130 for NVIDIA GPT-OSS configs"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/800
🟡 The pr-link for the new changelog entry points to PR #800, which was already merged and then reverted (PR #903). Since this PR #904 is the one actually re-landing the change, the link should be #904 for proper traceability.
What the bug is
The new perf-changelog.yaml entry added at the bottom of the file has its pr-link set to #800. However, PR #800 was previously merged (commit da55158), then reverted by PR #903 (commit cad1169). This PR #904 is the one that actually re-lands the vLLM image update (now with the -cu130 suffix), so the changelog entry should reference #904.
Step-by-step proof
- PR #800 ("[NVIDIA] Update NVIDIA GPT-OSS vLLM image from v0.15.1 to v0.16.0") was merged as commit da55158, updating the NVIDIA GPT-OSS vLLM image from v0.15.1 to v0.16.0.
- PR #903 ("Revert "[NVIDIA] Update NVIDIA GPT-OSS vLLM image from v0.15.1 to v0.16.0"") reverted that change (commit cad1169).
- The revert was then reverted (commit c28ce20), and this PR #904 re-lands the change with a slightly different image tag (v0.16.0-cu130 instead of v0.16.0).
- The diff shows line 941: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/800, which references the reverted PR.
Why existing code does not prevent it
There is no automated validation that pr-link values reference the current PR. The author likely copied the link from the original PR #800 submission without updating it for the re-landing PR.
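A check of this kind could be small. The following is a minimal sketch of one possible validation, not something that exists in the repository: given the lines a PR adds to perf-changelog.yaml and the PR's own number, it reports any pr-link that points elsewhere. The function name and the idea of feeding it the added diff lines are assumptions for illustration.

```python
import re

# Captures the PR number from a "pr-link: https://github.com/<org>/<repo>/pull/<n>" line.
PR_LINK = re.compile(r"pr-link:\s*https://github\.com/[^/\s]+/[^/\s]+/pull/(\d+)")


def mismatched_pr_links(added_lines, current_pr):
    """Return (line, pr_number) pairs whose pr-link does not point at current_pr."""
    mismatches = []
    for line in added_lines:
        match = PR_LINK.search(line)
        if match and int(match.group(1)) != current_pr:
            mismatches.append((line.strip(), int(match.group(1))))
    return mismatches


# Lines taken from the changelog entry added in this PR.
added = [
    '- "Update vLLM image from v0.15.1 to v0.16.0-cu130 for NVIDIA GPT-OSS configs"',
    "pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/800",
]

print(mismatched_pr_links(added, current_pr=904))
```

Run against this PR's entry with current_pr=904, the sketch flags the #800 link. Checking only added lines keeps older changelog entries, which legitimately reference earlier PRs, out of scope.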
Impact
Anyone following the changelog link to understand this change would land on PR #800, which is marked as reverted. This is confusing for traceability, though it does not affect any functional behavior. The convention throughout perf-changelog.yaml is that pr-link references the PR that actually lands the change.
Fix
Change line 941 from:
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/800
to:
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/904
Bump vllm/vllm-openai image tag for all 3 NVIDIA GPT-OSS configs (B200, H100, H200). All existing BKC flags preserved — no config changes beyond the image tag.
v0.16.0 notable changes for GPT-OSS/MXFP4:
Closes #798