Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8) #901
- Add benchmark script benchmarks/single_node/kimik2.5_int4_mi325x.sh based on MI355X INT4 recipe with AMD Andy Luo's recipe comment
- Add kimik2.5-int4-mi325x-vllm config to amd-master.yaml using vllm/vllm-openai-rocm:v0.16.0 image
- Update perf-changelog.yaml

Closes #856

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook. If they are not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LGTM — straightforward benchmark config addition following established patterns.
Extended reasoning...
Overview
This PR re-adds the Kimi K2.5 INT4 MI325X vLLM benchmark that was previously merged in #857 and reverted in #900. It touches three files: a new benchmark shell script (benchmarks/single_node/kimik2.5_int4_mi325x.sh), a config entry in .github/configs/amd-master.yaml, and a changelog entry in perf-changelog.yaml.
Security risks
None. The changes are benchmark configuration and a shell script that follows the exact same patterns as dozens of other benchmark scripts in this repo. No auth, crypto, or permission-related code is involved.
Level of scrutiny
Low scrutiny is appropriate. This is a mechanical addition of a new benchmark configuration that mirrors the existing kimik2.5-int4-mi355x-vllm entry, differing only in the runner (mi325x vs mi355x) and vLLM image version (v0.16.0 vs v0.15.1). The benchmark script follows established conventions used across the repo.
Other factors
The only bug found is a minor nit: the pr-link in perf-changelog.yaml points to the reverted PR #857 instead of this PR #901. This is a documentation/traceability issue that does not affect functionality. No outstanding reviewer comments exist beyond automated bot reminders. The PR is small, self-contained, and clearly intentional as a re-opening of reverted work.
description:
- "Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8)"
- "Uses vLLM ROCm v0.16.0 image following AMD Andy Luo's recipe"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/857
🟡 The pr-link on line 6 points to PR #857, which was reverted by PR #900. Since this PR #901 is the one that actually lands the Kimi K2.5 INT4 MI325X benchmark, the link should be updated to https://github.com/SemiAnalysisAI/InferenceX/pull/901 so the changelog traces back to the correct, non-reverted PR.
Extended reasoning...
What the bug is
The perf-changelog.yaml entry for kimik2.5-int4-mi325x-vllm has its pr-link field set to https://github.com/SemiAnalysisAI/InferenceX/pull/857. However, PR #857 was reverted by PR #900 (commit aaec16f). This PR #901 re-introduces the exact same changes, as stated in the PR description: "Re-opens the changes from #857 (which was reverted in #900)".
How it manifests
Anyone reviewing the performance changelog and clicking the PR link for the Kimi K2.5 INT4 MI325X benchmark will be taken to PR #857, which is a reverted/closed PR. This is confusing because:
- The reverted PR may show a "reverted" status or be closed, making it unclear whether the benchmark is active.
- The discussion and review history on #857 is stale; the authoritative review and merge context lives on #901.
- Other changelog entries consistently link to the PR that actually merges the change (e.g., #734 for kimik2.5-int4-mi355x-vllm, #825 for kimik2.5-fp4-mi355x-vllm).
Step-by-step proof
- Look at the git log: commit f7135ac merged PR #857, adding the Kimi K2.5 INT4 MI325X benchmark.
- Commit aaec16f then reverted #857 via PR #900 with the message: Revert "Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8) (#857)" (#900) [skip-sweep].
- PR #901 re-adds the same changes. The diff shows line 6 of perf-changelog.yaml still references pull/857.
- Following https://github.com/SemiAnalysisAI/InferenceX/pull/857 leads to a reverted PR, not the one that actually lands the config.
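The first two steps of this proof can be checked mechanically. The following is a minimal sketch, assuming revert commits keep the squash-merge subject format quoted above; the function name `reverted_prs` and the regular expression are illustrative and are not part of this repository.

```python
import re

# Assumed subject format, taken from the revert commit quoted in this review:
#   Revert "<original title> (#<reverted PR>)" (#<reverting PR>) [optional tags]
REVERT_SUBJECT = re.compile(r'^Revert ".*\(#(\d+)\)" \(#(\d+)\)')


def reverted_prs(subjects):
    """Map each reverted PR number to the number of the PR that reverted it."""
    result = {}
    for subject in subjects:
        match = REVERT_SUBJECT.match(subject)
        if match:
            result[int(match.group(1))] = int(match.group(2))
    return result


# Commit subjects as they might appear in `git log --format=%s`.
subjects = [
    'Revert "Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8) (#857)" (#900) [skip-sweep]',
    "Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8) (#857)",
]
print(reverted_prs(subjects))
```

For the two subjects shown, the mapping contains a single entry: 857 was reverted by 900.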
Impact
This is a minor documentation/traceability issue. It does not affect benchmark correctness or CI functionality. However, it breaks the convention that pr-link points to the PR that merges the change, which is important for changelog auditing and attribution.
Fix
Change line 6 of perf-changelog.yaml from:
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/857
to:
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/901
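To keep this class of nit from recurring, a small lint could flag any pr-link that points at a known-reverted PR. This is a sketch only: the helper `stale_pr_links` is hypothetical, the set of reverted PR numbers is supplied by hand, and the sample text reproduces only the two changelog fields visible in this review rather than the full entry schema.

```python
import re

# Captures the PR number from a line such as
#   pr-link: https://github.com/<org>/<repo>/pull/857
PR_LINK = re.compile(r"pr-link:\s*\S+/pull/(\d+)")


def stale_pr_links(changelog_text, reverted):
    """Return (line_number, pr_number) pairs for pr-link lines naming a reverted PR."""
    hits = []
    for lineno, line in enumerate(changelog_text.splitlines(), start=1):
        match = PR_LINK.search(line)
        if match and int(match.group(1)) in reverted:
            hits.append((lineno, int(match.group(1))))
    return hits


# Only the fields shown in the review diff; the real entry may contain more.
sample = """\
description:
- "Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8)"
- "Uses vLLM ROCm v0.16.0 image following AMD Andy Luo's recipe"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/857
"""

reverted = {857}  # assumption: maintained manually or derived from revert commits
print(stale_pr_links(sample, reverted))
print(stale_pr_links(sample.replace("pull/857", "pull/901"), reverted))
```

The first call reports the offending line; after the suggested replacement the check comes back empty.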
Following AMD Andy Luo's recipe: https://x.com/linluo77/status/2017024513595301985
Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8) using vLLM ROCm v0.16.0, based on MI355X INT4 recipe with AMD Andy Luo's recipe comment.
Closes #856
Generated with Claude Code