Fix some bugs by tastelikefeet · Pull Request #77 · modelscope/twinkle

tastelikefeet · 2026-02-27T03:34:21Z

PR type

Bug Fix
New Feature
Document Updates
More Models or Datasets Support

PR information

Add install.sh to guide env installation
Increase the weight sync buffer
Fix the Ray resource manager

Experiment results

Paste your experiment result here (if needed).

gemini-code-assist · 2026-02-27T03:34:41Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces an automated environment setup script to simplify dependency installation, enhances checkpoint engine performance by increasing the weight synchronization buffer size, and resolves a bug in the Ray resource manager to ensure correct GPU resource allocation. These changes collectively aim to improve the project's stability, efficiency, and ease of use.

Highlights

- Environment Setup Automation: An install.sh script was added to automate the installation of deep learning dependencies. The script detects the GPU architecture and installs core Python packages such as peft, accelerate, transformers, vllm, transformer_engine, megatron_core, flash-attn, numpy, and DeepGEMM, streamlining the setup process.
- Checkpoint Engine Buffer Size Increase: The default bucket_size for HCCLCheckpointEngine and NCCLCheckpointEngine, as well as the _bucket_size in CheckpointEngineMixin, was increased from 2GB to 3GB. This change aims to improve the efficiency of weight synchronization during checkpointing.
- Ray Resource Manager Fix: A bug in the Ray resource manager was addressed by explicitly initializing the node_ranks list. This ensures correct behavior and prevents potential errors when determining GPU allocations for workers.
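
To illustrate why a larger bucket can reduce the number of synchronization rounds, here is a minimal sketch. It is not the twinkle implementation: the function `pack_into_buckets`, the greedy packing strategy, and the reading of "2GB"/"3GB" as GiB are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

GiB = 1024 ** 3  # assumption: "2GB"/"3GB" are treated as GiB here


def pack_into_buckets(
    tensor_sizes: Iterable[Tuple[str, int]], bucket_size: int
) -> List[List[str]]:
    """Greedily group (name, nbytes) pairs into buckets of at most bucket_size bytes.

    Hypothetical helper: a tensor larger than bucket_size gets a bucket of its own.
    """
    buckets: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name, nbytes in tensor_sizes:
        # Flush the current bucket if adding this tensor would overflow it.
        if current and used + nbytes > bucket_size:
            buckets.append(current)
            current, used = [], 0
        current.append(name)
        used += nbytes
    if current:
        buckets.append(current)
    return buckets


# Twelve hypothetical weights of 0.5 GiB each (6 GiB in total).
sizes = [(f"w{i}", GiB // 2) for i in range(12)]
print(len(pack_into_buckets(sizes, 2 * GiB)))  # buckets needed with a 2 GiB limit
print(len(pack_into_buckets(sizes, 3 * GiB)))  # buckets needed with a 3 GiB limit
```

Under these assumptions, a larger limit means fewer buckets and therefore fewer transfer rounds, at the cost of a larger staging buffer per round.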

Changelog

INSTALL.sh
- Added a new shell script to automate the installation of deep learning dependencies.
- Implemented GPU architecture detection using nvidia-smi to configure TORCH_CUDA_ARCH_LIST.
- Included installation steps for various Python packages: peft, accelerate, transformers, modelscope, oss2, vllm, transformer_engine, megatron_core, flash-attn, numpy, and DeepGEMM.
src/twinkle/checkpoint_engine/hccl_checkpoint_engine.py
- Increased the default bucket_size parameter in the HCCLCheckpointEngine constructor from 2GB to 3GB.
src/twinkle/checkpoint_engine/mixin.py
- Updated the default _bucket_size class variable from 2GB to 3GB within the CheckpointEngineMixin.
src/twinkle/checkpoint_engine/nccl_checkpoint_engine.py
- Modified the default bucket_size parameter in the NCCLCheckpointEngine constructor from 2GB to 3GB.
src/twinkle/infra/_ray/resource_manager.py
- Initialized the node_ranks list to an empty list before its usage within the get_visible_devices function to prevent potential runtime errors.
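
The GPU architecture detection mentioned for the install script can be sketched as follows. The script in this PR is a shell script whose exact commands are not shown on this page, so this Python version is only an assumption about the general approach: it queries `nvidia-smi` for the `compute_cap` field (available in recent drivers) and joins the distinct values into a `TORCH_CUDA_ARCH_LIST` string. The helper names `parse_compute_caps` and `detect_arch_list` are hypothetical.

```python
import re
import subprocess
from typing import List, Optional


def parse_compute_caps(smi_output: str) -> List[str]:
    """Return the distinct compute capabilities (e.g. "8.0") in ascending order."""
    caps = {
        line.strip()
        for line in smi_output.splitlines()
        if re.fullmatch(r"\d+\.\d+", line.strip())  # skip blanks and "[N/A]"
    }
    # Sort numerically so that "10.0" comes after "9.0".
    return sorted(caps, key=lambda c: tuple(int(p) for p in c.split(".")))


def detect_arch_list() -> Optional[str]:
    """Build a TORCH_CUDA_ARCH_LIST value, or None if no GPU can be queried."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None  # nvidia-smi is missing or failed
    caps = parse_compute_caps(out)
    return ";".join(caps) if caps else None
```

A caller could then set `os.environ["TORCH_CUDA_ARCH_LIST"]` to the returned value before building CUDA extensions, and fall back to a manually chosen architecture when the function returns `None`.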

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service, and learn how to configure Gemini Code Assist in GitHub. Gemini can make mistakes, so double-check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces an installation script, increases the weight synchronization buffer size across checkpoint engines, and fixes a potential issue in the Ray resource manager. The changes generally improve the robustness and performance of the system. However, there are a few areas that could be improved for better maintainability and clarity, particularly in the new installation script and an outdated comment.

INSTALL_MEGATRON.sh

src/twinkle/checkpoint_engine/mixin.py

tastelikefeet added 2 commits February 26, 2026 17:29

wip

c5350be

lint code

57ad57f

gemini-code-assist bot reviewed Feb 27, 2026

INSTALL_MEGATRON.sh (resolved)

INSTALL_MEGATRON.sh (resolved)

src/twinkle/checkpoint_engine/mixin.py (resolved)

hjh0119 approved these changes Feb 27, 2026

change comments

44c9f82

tastelikefeet merged commit d3f39f4 into modelscope:main Feb 27, 2026
1 of 3 checks passed

tastelikefeet merged 3 commits into modelscope:main from tastelikefeet:fix/0226-1

2 participants
