
Update Kimi-Linear.md for AMD GPU#155

Open
haic0 wants to merge 2 commits into vllm-project:main from haic0:patch-13

Conversation

Contributor

@haic0 haic0 commented Dec 10, 2025

No description provided.

@gemini-code-assist
Contributor

Summary of Changes

Hello @haic0, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the documentation for the Kimi-Linear model by adding a comprehensive guide for its deployment and benchmarking on AMD MI300X GPUs. It covers the entire setup process, from preparing the Docker environment with ROCm vLLM to initiating the vLLM server and running performance tests, thereby broadening the model's accessibility and operational scope for users with AMD hardware.

Highlights

  • AMD GPU Support Documentation: A new section titled 'AMD GPU Support' has been added to the Kimi-Linear.md file, providing instructions for running the model on AMD MI300X GPUs.
  • Docker Environment Setup: Detailed steps are included for preparing the Docker environment, specifically pulling the rocm/vllm-dev:nightly image and launching a container with appropriate ROCm and vLLM configurations.
  • vLLM Server Configuration: Instructions for logging into Hugging Face and starting the vLLM server are provided, including specific environment variables (SAFETENSORS_FAST_GPU=1, VLLM_USE_V1=1, VLLM_USE_TRITON_FLASH_ATTN=0) and model parameters for moonshotai/Kimi-Linear-48B-A3B-Instruct.
  • Benchmark Execution: A command to run a benchmark script inside the Docker container is included, allowing users to test the performance of the Kimi-Linear model on their AMD GPU setup.
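The setup described in the highlights can be sketched as a shell session. This is a minimal illustration assembled from the PR summary: the image tag, environment variables, and model name are taken from the text above, while the `docker run` device flags and the `--trust-remote-code` flag are typical-usage assumptions, not details stated in this PR.

```shell
# Pull the ROCm vLLM development image named in the PR summary.
docker pull rocm/vllm-dev:nightly

# Launch a container with GPU access. The device and IPC flags below are
# the usual ones for ROCm containers; adjust them to your MI300X host.
docker run -it --network=host --ipc=host \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video --shm-size 16G \
  rocm/vllm-dev:nightly

# Inside the container: log in to Hugging Face, then start the server
# with the environment variables listed in the highlights.
huggingface-cli login
SAFETENSORS_FAST_GPU=1 VLLM_USE_V1=1 VLLM_USE_TRITON_FLASH_ATTN=0 \
  vllm serve moonshotai/Kimi-Linear-48B-A3B-Instruct \
  --trust-remote-code   # assumption: commonly required for this model family
```

Once the server is up, the benchmark script mentioned in the highlights can be run against it from inside the same container.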
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds instructions for running the Kimi-Linear model on AMD GPUs. The new documentation is clear and provides step-by-step guidance. My review includes a few suggestions to improve the readability and formatting of the commands and text, such as breaking up long commands, removing unnecessary blank lines, and clarifying instructions. These changes will help ensure the documentation is easy for users to follow.

Comment on lines +65 to +66
Run the vllm online serving
Sample Command
Contributor


medium

The section heading is already 'Start the vLLM server'. The text here is a bit redundant and could be more concise.

Suggested change
Run the vllm online serving
Sample Command
Run the following command to start the vLLM server:

Comment on lines +94 to +96



Contributor


medium

These trailing blank lines and whitespace at the end of the file can be removed for better formatting.

@jeejeelee
Collaborator

TBH, I'm questioning whether it's truly necessary to differentiate between AMD and NVIDIA GPUs when running these models on vLLM. Looking at most of your pull requests, aside from the Docker image, there doesn't appear to be anything particularly distinctive.

Signed-off-by: haic0 <149741444+haic0@users.noreply.github.com>

Update moonshotai/Kimi-Linear.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: haic0 <149741444+haic0@users.noreply.github.com>

Update moonshotai/Kimi-Linear.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: haic0 <149741444+haic0@users.noreply.github.com>

Update vLLM ROCm Docker image and run commands

Signed-off-by: jiacao-amd <jiahui.cao@amd.com>

add uv launch support

Signed-off-by: jiacao-amd <jiahui.cao@amd.com>
@jiacao-amd jiacao-amd force-pushed the patch-13 branch 3 times, most recently from cbf2886 to f37c30a on February 6, 2026 at 07:01
Merged ROCm installation and running instructions from separate AMD GPU
Support section into main content with CUDA/ROCm subheaders for better
organization and consistency.

Signed-off-by: jiacao-amd <jiahui.cao@amd.com>