@finbarrtimbers finbarrtimbers commented Jan 26, 2026

Add vllm_dtype parameter to create_vllm_engines() to allow configuring the vLLM dtype. Defaults to bfloat16 for backwards compatibility. This is needed for the hybrid model.

Allow configuring the vLLM dtype via the vllm_dtype parameter instead
of hardcoding bfloat16. Defaults to bfloat16 for backwards compatibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@gemini-code-assist

Summary of Changes

Hello @finbarrtimbers, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds a vllm_dtype parameter to the create_vllm_engines function, so callers can set the data type for vLLM engines explicitly instead of relying on the previously hardcoded bfloat16. bfloat16 remains the default, so existing configurations behave as before.

Highlights

  • New vllm_dtype parameter: Introduced vllm_dtype to the create_vllm_engines function, allowing users to configure the vLLM data type.
  • Default value for backward compatibility: The new vllm_dtype parameter defaults to bfloat16 to ensure backward compatibility with existing configurations.
  • Enhanced flexibility: This change enables the use of auto or other specified data types for vLLM engine initialization, providing greater control over model precision.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request successfully introduces the vllm_dtype parameter to the create_vllm_engines function, allowing for configurable vLLM data types. The default value of bfloat16 ensures backward compatibility, and the change correctly propagates this parameter to the vllm.AsyncEngineArgs constructor. This is a clear and beneficial improvement for flexibility.
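
For readers without the diff at hand, the pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this pull request: EngineArgsSketch is a hypothetical stand-in for vllm.AsyncEngineArgs (so the snippet runs without vLLM installed), and build_engine_args is a hypothetical helper standing in for the relevant part of create_vllm_engines, whose full signature is not shown in this conversation.

```python
from dataclasses import dataclass


@dataclass
class EngineArgsSketch:
    """Hypothetical stand-in for vllm.AsyncEngineArgs (assumption, not the real class)."""

    model: str
    dtype: str = "auto"


def build_engine_args(model: str, vllm_dtype: str = "bfloat16") -> EngineArgsSketch:
    """Hypothetical helper: forward the caller's dtype instead of hardcoding "bfloat16".

    The default stays "bfloat16", so callers that do not pass vllm_dtype
    get the same behavior as before the change.
    """
    return EngineArgsSketch(model=model, dtype=vllm_dtype)


# Existing call sites keep the old dtype without any changes.
default_args = build_engine_args("my-org/my-model")

# New call sites can opt into another dtype, for example "auto".
auto_args = build_engine_args("my-org/my-model", vllm_dtype="auto")
```

Keeping the old value as the parameter default is what makes the change backward compatible: only call sites that need a different dtype have to be touched.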

@finbarrtimbers finbarrtimbers requested review from natolambert and removed request for hamishivi January 26, 2026 22:13

@natolambert natolambert left a comment

lgtm
