
Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8)#901

Merged
cquil11 merged 4 commits into main from re-add-kimik2.5-int4-mi325x
Mar 10, 2026

Conversation

@cquil11
Collaborator

@cquil11 cquil11 commented Mar 10, 2026

Following AMD Andy's recipe: https://x.com/linluo77/status/2017024513595301985

Adds the Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8) using vLLM ROCm v0.16.0, based on the MI355X INT4 recipe together with AMD Andy Luo's recipe notes.

Closes #856

Generated with Claude Code

github-actions bot and others added 3 commits March 10, 2026 10:23
- Add benchmark script benchmarks/single_node/kimik2.5_int4_mi325x.sh
  based on MI355X INT4 recipe with AMD Andy Luo's recipe comment
- Add kimik2.5-int4-mi325x-vllm config to amd-master.yaml using
  vllm/vllm-openai-rocm:v0.16.0 image
- Update perf-changelog.yaml

Closes #856

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
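The commit above adds the benchmark script but its contents are not shown in this conversation. As a rough, hypothetical sketch of what a single-node TP8 launch in such a script might look like (the model identifier and serve flags below are assumptions, not the actual contents of benchmarks/single_node/kimik2.5_int4_mi325x.sh; only the image tag comes from this PR):

```shell
# Hypothetical sketch of a single-node TP8 vLLM launch.
# The image tag is the one named in this PR's config; the model id is assumed.
IMAGE="vllm/vllm-openai-rocm:v0.16.0"   # image from amd-master.yaml entry
MODEL="moonshotai/Kimi-K2.5"            # assumed model identifier
TP=8                                    # tensor parallelism across 8 MI325X GPUs

# Build the serve command; --tensor-parallel-size is a standard vLLM flag.
SERVE_CMD="vllm serve ${MODEL} --tensor-parallel-size ${TP}"

echo "container image: ${IMAGE}"
echo "serve command:   ${SERVE_CMD}"
```

In practice the real script would run this inside the ROCm container and then drive a load generator against the server; see the repository's other single-node benchmark scripts for the established pattern.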
@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR there first before we merge your PR into the master branch. Let's make the documentation first class so that the entire ML community can benefit from your hard work! Thank you.

1 similar comment

@cquil11 cquil11 changed the title from "Re-add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8)" to "Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8)" on Mar 10, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cquil11 cquil11 merged commit 12ed773 into main Mar 10, 2026
@cquil11 cquil11 deleted the re-add-kimik2.5-int4-mi325x branch March 10, 2026 15:31
Contributor

@claude claude bot left a comment


LGTM — straightforward benchmark config addition following established patterns.

Extended reasoning...

Overview

This PR re-adds the Kimi K2.5 INT4 MI325X vLLM benchmark that was previously merged in #857 and reverted in #900. It touches three files: a new benchmark shell script (benchmarks/single_node/kimik2.5_int4_mi325x.sh), a config entry in .github/configs/amd-master.yaml, and a changelog entry in perf-changelog.yaml.

Security risks

None. The changes are benchmark configuration and a shell script that follows the exact same patterns as dozens of other benchmark scripts in this repo. No auth, crypto, or permission-related code is involved.

Level of scrutiny

Low scrutiny is appropriate. This is a mechanical addition of a new benchmark configuration that mirrors the existing kimik2.5-int4-mi355x-vllm entry, differing only in the runner (mi325x vs mi355x) and vLLM image version (v0.16.0 vs v0.15.1). The benchmark script follows established conventions used across the repo.
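Under those observations, the two entries in .github/configs/amd-master.yaml would differ roughly as follows. This is an illustrative sketch only: the field names are assumptions, while the config names, runner targets, and image tags come from this PR and its MI355X sibling.

```yaml
# Illustrative sketch -- field names are hypothetical.
kimik2.5-int4-mi355x-vllm:        # existing entry
  runner: mi355x
  image: vllm/vllm-openai-rocm:v0.15.1
kimik2.5-int4-mi325x-vllm:        # added by this PR
  runner: mi325x
  image: vllm/vllm-openai-rocm:v0.16.0
```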

Other factors

The only bug found is a minor nit: the pr-link in perf-changelog.yaml points to the reverted PR #857 instead of this PR #901. This is a documentation/traceability issue that does not affect functionality. No outstanding reviewer comments exist beyond automated bot reminders. The PR is small, self-contained, and clearly intentional as a re-opening of reverted work.

The perf-changelog.yaml entry under review:

description:
- "Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8)"
- "Uses vLLM ROCm v0.16.0 image following AMD Andy Luo's recipe"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/857
Contributor


🟡 The pr-link on line 6 points to PR #857, which was reverted by PR #900. Since this PR #901 is the one that actually lands the Kimi K2.5 INT4 MI325X benchmark, the link should be updated to https://github.com/SemiAnalysisAI/InferenceX/pull/901 so the changelog traces back to the correct, non-reverted PR.

Extended reasoning...

What the bug is

The perf-changelog.yaml entry for kimik2.5-int4-mi325x-vllm has its pr-link field set to https://github.com/SemiAnalysisAI/InferenceX/pull/857. However, PR #857 was reverted by PR #900 (commit aaec16f). This PR #901 re-introduces the exact same changes, as stated in the PR description: "Re-opens the changes from #857 (which was reverted in #900)".

How it manifests

Anyone reviewing the performance changelog and clicking the PR link for the Kimi K2.5 INT4 MI325X benchmark will be taken to PR #857, which is a reverted/closed PR. This is confusing because:

  1. The reverted PR may show a "reverted" status or be closed, making it unclear whether the benchmark is active.
  2. The discussion and review history on #857 is stale; the authoritative review and merge context lives on #901.
  3. Other changelog entries consistently link to the PR that actually merges the change (e.g., #734 for kimik2.5-int4-mi355x-vllm, #825 for kimik2.5-fp4-mi355x-vllm).

Step-by-step proof

  1. Look at the git log: commit f7135ac merged PR #857, adding the Kimi K2.5 INT4 MI325X benchmark.
  2. Commit aaec16f then reverted #857 via PR #900 with the message: Revert "Add Kimi K2.5 INT4 single-node MI325X vLLM benchmark (TP8) (#857)" (#900) [skip-sweep].
  3. PR #901 re-adds the same changes. The diff shows that line 6 of perf-changelog.yaml still references pull/857.
  4. Following https://github.com/SemiAnalysisAI/InferenceX/pull/857 leads to a reverted PR, not the one that actually lands the config.

Impact

This is a minor documentation/traceability issue. It does not affect benchmark correctness or CI functionality. However, it breaks the convention that pr-link points to the PR that merges the change, which is important for changelog auditing and attribution.

Fix

Change line 6 of perf-changelog.yaml from:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/857

to:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/901



Successfully merging this pull request may close these issues.

vllm 0.16 single node mi325 kimi k2.5 vllm tp8
