sp fix ci test hang by meichangsu1 · Pull Request #52 · modelscope/twinkle

meichangsu1 · 2026-02-11T07:43:19Z

No description provided.

Replace calls to `_get_sp_group_from_device_mesh` with direct access to `sequence_parallel._sp_group` in sequence parallel attention tests. This simplifies the test setup by using the already initialized group stored in the module, improving code clarity and reducing redundancy.
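
The change described above can be sketched without any distributed runtime. The snippet below is only an illustration: `FakeGroup`, `FakeSequenceParallel`, `prepare`, `get_group_before`, and `get_group_after` are hypothetical stand-ins, not twinkle's actual classes or functions. It shows the difference between deriving a new group on every call and reading the handle that setup already stored in `_sp_group`.

```python
# Hypothetical stand-ins: none of these classes are twinkle's real implementation.
class FakeGroup:
    """Stands in for a process-group handle."""

    created = 0  # counts how many group objects have been constructed

    def __init__(self, ranks):
        FakeGroup.created += 1
        self.ranks = tuple(ranks)


class FakeSequenceParallel:
    """Stands in for the module-level `sequence_parallel` object."""

    def __init__(self):
        self._sp_group = None

    def prepare(self, ranks):
        # The group is built exactly once, during setup.
        self._sp_group = FakeGroup(ranks)


def get_group_before(ranks):
    # Old test pattern (assumed): derive a fresh group on every call.
    return FakeGroup(ranks)


def get_group_after(sp):
    # New test pattern: reuse the handle stored during setup.
    return sp._sp_group


sequence_parallel = FakeSequenceParallel()
sequence_parallel.prepare(ranks=[0, 1])

group = get_group_after(sequence_parallel)
print(group is sequence_parallel._sp_group)  # True
print(FakeGroup.created)  # 1
```

Reading the stored attribute means the test and the module under test are guaranteed to use the same group object, and no second group is ever constructed.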

gemini-code-assist · 2026-02-11T07:46:59Z

Summary of Changes

Hello @meichangsu1, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the project's development ecosystem by upgrading its CI/CD capabilities, standardizing developer interactions through new GitHub templates, and providing a comprehensive suite of documentation. It also expands the project's utility with new cookbook examples that demonstrate diverse training and inference patterns, including specialized support for NPU hardware, all aimed at improving developer experience and broadening the framework's applicability.

Highlights

CI/CD Infrastructure Enhancements: Introduced new scripts for building documentation and running containerized CI tests, including specific support for NPU environments, to streamline the continuous integration and deployment process.
Developer Workflow Improvements: Added comprehensive GitHub templates for bug reports, feature requests, questions, pull requests, and security policies, along with detailed guidelines for AI coding agents, to foster a more organized and efficient development environment.
Extensive Documentation Overhaul: Provided a wealth of new and updated documentation, including English and Chinese READMEs, installation guides, NPU support, client-server architecture details, and in-depth component descriptions, significantly improving clarity and accessibility for users and contributors.
New Cookbook Examples: Integrated numerous new cookbook examples demonstrating various training and sampling scenarios for both Tinker and native Twinkle clients, covering SFT, GRPO, Megatron, and Transformers models, showcasing diverse use cases and hardware configurations.
Automated Client Generation: Added a client generation tool to automatically create client wrappers for remote functions and models, enhancing the robustness and maintainability of the client-server architecture.

Changelog

.dev_scripts/build_docs.sh
- Added a script for building project documentation.
.dev_scripts/ci_container_test.sh
- Added a script for running containerized CI tests, including environment setup, linter checks, dependency installation, and cache cleanup.
.dev_scripts/dockerci.sh
- Added a script for Docker-based CI, managing GPU allocation and container execution for tests.
.dev_scripts/dockerci_npu.sh
- Added a script for NPU-specific Docker-based CI, handling GPU allocation and test execution.
.github/ISSUE_TEMPLATE/1-bug-report.yml
- Added a new bug report issue template.
.github/ISSUE_TEMPLATE/2-feature-request.yml
- Added a new feature request issue template.
.github/ISSUE_TEMPLATE/3-question-discussion.yml
- Added a new question and discussion issue template.
.github/ISSUE_TEMPLATE/config.yml
- Added configuration for issue templates.
.github/PULL_REQUEST_TEMPLATE.md
- Added a new pull request template.
.github/SECURITY.md
- Added a security policy document.
.github/copilot-instructions.md
- Added guidelines for AI coding agents.
.gitignore
- Updated to ignore new lock files, qoder files, and test cookbook directories.
.pre-commit-config.yaml
- Updated pre-commit hook versions and excluded paths.
.pre-commit-config_local.yaml
- Updated pre-commit hook versions and excluded paths.
CONTRIBUTING.md
- Updated contributor guidelines, changing project name references from 'SWIFT' to 'twinkle' and revising sections on contributions, incentives, and code standards.
CONTRIBUTING_CN.md
- Updated Chinese contributor guidelines, changing project name references from 'SWIFT' to 'twinkle' and revising sections on contributions, incentives, and code standards.
README.md
- Added comprehensive project overview, installation instructions, tutorials, changelog, supported hardware, supported models, sample code, architecture design, multi-tenancy features, and modular ecosystem details.
README_ZH.md
- Added Chinese version of the comprehensive project overview, installation instructions, tutorials, changelog, supported hardware, supported models, sample code, architecture design, multi-tenancy features, and modular ecosystem details.
ROADMAP.md
- Added project roadmap for versions 0.1 and 0.2, detailing core and networking capabilities in both Chinese and English.
client_tools/client_generator.py
- Added a script to auto-generate client wrappers for remote functions and models.
cookbook/client/tinker/grpo.py
- Added Tinker-compatible client example for GRPO training.
cookbook/client/tinker/gsm8k_grpo.py
- Added Tinker-compatible client example for GSM8K GRPO training.
cookbook/client/tinker/lora.py
- Added Tinker-compatible client example for LoRA training.
cookbook/client/tinker/megatron/server.py
- Added Tinker-compatible Megatron server startup script.
cookbook/client/tinker/megatron/server_config.yaml
- Added Tinker-compatible Megatron server configuration.
cookbook/client/tinker/megatron/server_config_7b.yaml
- Added Tinker-compatible Megatron server configuration for 7B models.
cookbook/client/tinker/sample.py
- Added Tinker-compatible client example for sampling/inference.
cookbook/client/tinker/self_congnition.py
- Added Tinker-compatible client example for self-cognition training and evaluation.
cookbook/client/tinker/transformer/server.py
- Added Tinker-compatible Transformers server startup script.
cookbook/client/tinker/transformer/server_config.yaml
- Added Tinker-compatible Transformers server configuration.
cookbook/client/twinkle/grpo.py
- Added Twinkle client example for GRPO training.
cookbook/client/twinkle/megatron/server.py
- Added Twinkle client Megatron server startup script.
cookbook/client/twinkle/megatron/server_config.yaml
- Added Twinkle client Megatron server configuration.
cookbook/client/twinkle/sample.py
- Added Twinkle client example for sampler inference.
cookbook/client/twinkle/self_congnition.py
- Added Twinkle client example for self-cognition training.
cookbook/client/twinkle/transformer/server.py
- Added Twinkle client Transformers server startup script.
cookbook/client/twinkle/transformer/server_config.yaml
- Added Twinkle client Transformers server configuration.
cookbook/legacy/components/dataset.py
- Added a legacy dataset component example.
cookbook/legacy/grpo/dapo_math.py
- Added a legacy DAPO-Math GRPO training example.
cookbook/legacy/grpo/gsm8k.py
- Added a legacy GSM8K GRPO training example.
cookbook/legacy/grpo/gsm8k_dense.py
- Added a legacy GSM8K dense GRPO training example.
cookbook/legacy/grpo/lora.py
- Added a legacy LoRA GRPO training example.
cookbook/legacy/grpo/lora_backup.py
- Added a legacy LoRA GRPO training backup example.
cookbook/legacy/grpo/lora_pr.py
- Added a legacy LoRA GRPO training PR example.
cookbook/legacy/lora.py
- Added a legacy Megatron-Core LoRA training example.
cookbook/legacy/moe_lora.py
- Added a legacy Megatron-Core MoE LoRA training example.
cookbook/legacy/npu/lora_npu.py
- Added a legacy NPU LoRA training example.
cookbook/legacy/remote/tinker/ascend/lora.py
- Added a legacy remote Tinker Ascend LoRA example.
cookbook/legacy/remote/tinker/ascend/server.py
- Added a legacy remote Tinker Ascend server startup script.
cookbook/legacy/remote/tinker/ascend/server_config.yaml
- Added a legacy remote Tinker Ascend server configuration.
cookbook/legacy/remote/tinker/lora.py
- Added a legacy remote Tinker LoRA example.
cookbook/legacy/remote/tinker/server.py
- Added a legacy remote Tinker server startup script.
cookbook/legacy/remote/tinker/server_config.yaml
- Added a legacy remote Tinker server configuration.
cookbook/legacy/remote/twinkle/lora.py
- Added a legacy remote Twinkle LoRA example.
cookbook/legacy/remote/twinkle/server.py
- Added a legacy remote Twinkle server startup script.
cookbook/legacy/remote/twinkle/server_config.yaml
- Added a legacy remote Twinkle server configuration.
cookbook/legacy/sampler/sampler_demo.py
- Added a legacy sampler demo.
cookbook/legacy/sft/ep_fsdp_qwen3_moe.py
- Added a legacy EP FSDP Qwen3 MoE SFT example.
cookbook/legacy/sft/full_sft.py
- Added a legacy full SFT example.
cookbook/legacy/sft/local_dataset.py
- Added a legacy local dataset SFT example.
cookbook/legacy/sft/multi_lora.py
- Added a legacy multi-LoRA SFT example.
cookbook/legacy/sft/single_program_megatron.py
- Added a legacy single program Megatron SFT example.
cookbook/legacy/sft/single_program_megatron_full.py
- Added a legacy single program full Megatron SFT example.
cookbook/legacy/sft/single_program_moe.py
- Added a legacy single program MoE SFT example.
cookbook/legacy/sft/streaming_dataset.py
- Added a legacy streaming dataset SFT example.
cookbook/legacy/sft/vlm_lora.py
- Added a legacy VLM LoRA SFT example.
cookbook/legacy/single_program_full.py
- Added a legacy single program full training example.
cookbook/legacy/vlm_lora.py
- Added a legacy VLM LoRA training example.
cookbook/megatron/tp.py
- Added a Megatron TP training example.
cookbook/megatron/tp.sh
- Added a Megatron TP training script.
cookbook/megatron/tp_moe.py
- Added a Megatron TP MoE training example.
cookbook/megatron/tp_moe.sh
- Added a Megatron TP MoE training script.
cookbook/ray/run.sh
- Added a Ray run script.
cookbook/ray/single_controller.py
- Added a Ray single controller example.
cookbook/transformers/ep_fsdp_qwen3_moe.py
- Added a Transformers EP FSDP Qwen3 MoE example.
cookbook/transformers/ep_fsdp_qwen3_moe.sh
- Added a Transformers EP FSDP Qwen3 MoE script.
cookbook/transformers/fsdp2.py
- Added a Transformers FSDP2 example.
cookbook/transformers/fsdp2.sh
- Added a Transformers FSDP2 script.
cookbook/transformers/fsdp2_moe.py
- Added a Transformers FSDP2 MoE example.
cookbook/transformers/fsdp2_moe.sh
- Added a Transformers FSDP2 MoE script.
cookbook/transformers/sp_fsdp_dense.py
- Added a Transformers SP FSDP dense example.
cookbook/transformers/sp_fsdp_dense.sh
- Added a Transformers SP FSDP dense script.
docs/Makefile
- Added a Makefile for Sphinx documentation.
docs/README.md
- Added a documentation maintenance guide.
docs/make.bat
- Added a Windows batch file for Sphinx documentation.
docs/source/.readthedocs.yaml
- Added ReadTheDocs configuration for Chinese documentation.
docs/source/_templates/autosummary/class.rst
- Added a Sphinx autosummary template for classes.
docs/source/_templates/classtemplate.rst
- Added a Sphinx class template.
docs/source/_templates/sobolengine.rst
- Added a Sphinx template for SobolEngine.
docs/source/conf.py
- Added a Sphinx configuration file for Chinese documentation.
docs/source/index.rst
- Added a Sphinx index file for Chinese documentation.
docs/source/使用指引/NPU的支持.md
- Added Chinese documentation for NPU support.
docs/source/使用指引/安装.md
- Added Chinese documentation for installation.
docs/source/使用指引/快速开始.md
- Added Chinese documentation for quick start.
docs/source/使用指引/服务端和客户端/Tinker兼容客户端.md
- Added Chinese documentation for Tinker compatible client.
docs/source/使用指引/服务端和客户端/Twinkle客户端.md
- Added Chinese documentation for Twinkle client.
docs/source/使用指引/服务端和客户端/index.rst
- Added Chinese documentation index for server and client.
docs/source/使用指引/服务端和客户端/服务端.md
- Added Chinese documentation for server.
docs/source/使用指引/魔搭官方环境.md
- Added Chinese documentation for ModelScope official environment.
docs/source/组件/LRScheduler/CosineWarmupScheduler.md
- Added Chinese documentation for CosineWarmupScheduler.
docs/source/组件/LRScheduler/LinearWarmupScheduler.md
- Added Chinese documentation for LinearWarmupScheduler.
docs/source/组件/LRScheduler/index.rst
- Added Chinese documentation index for LRScheduler.
docs/source/组件/任务处理器/InputProcessor.md
- Added Chinese documentation for InputProcessor.
docs/source/组件/任务处理器/index.rst
- Added Chinese documentation index for task processor.
docs/source/组件/优势/Advantage.md
- Added Chinese documentation for Advantage.
docs/source/组件/优势/GRPOAdvantage.md
- Added Chinese documentation for GRPOAdvantage.
docs/source/组件/优势/RLOOAdvantage.md
- Added Chinese documentation for RLOOAdvantage.
docs/source/组件/优势/index.rst
- Added Chinese documentation index for Advantage.
docs/source/组件/内核/Kernel.md
- Added Chinese documentation for Kernel.
docs/source/组件/内核/index.rst
- Added Chinese documentation index for Kernel.
docs/source/组件/奖励/Reward.md
- Added Chinese documentation for Reward.
docs/source/组件/奖励/index.rst
- Added Chinese documentation index for Reward.
docs/source/组件/指标/Accuracy.md
- Added Chinese documentation for Accuracy metric.
docs/source/组件/指标/LossMetric.md
- Added Chinese documentation for LossMetric.
docs/source/组件/指标/TrainMetric.md
- Added Chinese documentation for TrainMetric.
docs/source/组件/指标/index.rst
- Added Chinese documentation index for metrics.
docs/source/组件/指标/构建指标.md
- Added Chinese documentation for building metrics.
docs/source/组件/损失/CrossEntropy.md
- Added Chinese documentation for CrossEntropyLoss.
docs/source/组件/损失/index.rst
- Added Chinese documentation index for loss.
docs/source/组件/损失/构建损失.md
- Added Chinese documentation for building loss.
docs/source/组件/数据加载/DataLoader.md
- Added Chinese documentation for DataLoader.
docs/source/组件/数据加载/index.rst
- Added Chinese documentation index for data loading.
docs/source/组件/数据格式/InputFeature.md
- Added Chinese documentation for InputFeature.
docs/source/组件/数据格式/Message.md
- Added Chinese documentation for Message.
docs/source/组件/数据格式/ModelOutput.md
- Added Chinese documentation for ModelOutput.
docs/source/组件/数据格式/Output.md
- Added Chinese documentation for Output.
docs/source/组件/数据格式/Sampling.md
- Added Chinese documentation for Sampling.
docs/source/组件/数据格式/Trajectory.md
- Added Chinese documentation for Trajectory.
docs/source/组件/数据格式/index.rst
- Added Chinese documentation index for data format.
docs/source/组件/数据集/Dataset.md
- Added Chinese documentation for Dataset.
docs/source/组件/数据集/IterableDataset.md
- Added Chinese documentation for IterableDataset.
docs/source/组件/数据集/IterablePackingDataset.md
- Added Chinese documentation for IterablePackingDataset.
docs/source/组件/数据集/LazyDataset.md
- Added Chinese documentation for LazyDataset.
docs/source/组件/数据集/PackingDataset.md
- Added Chinese documentation for PackingDataset.
docs/source/组件/数据集/index.rst
- Added Chinese documentation index for dataset.
docs/source/组件/检查点引擎/CheckpointEngine.md
- Added Chinese documentation for CheckpointEngine.
docs/source/组件/检查点引擎/HCCLCheckpointEngine.md
- Added Chinese documentation for HCCLCheckpointEngine.
docs/source/组件/检查点引擎/NCCLCheckpointEngine.md
- Added Chinese documentation for NCCLCheckpointEngine.
docs/source/组件/检查点引擎/index.rst
- Added Chinese documentation index for checkpoint engine.
docs/source/组件/模型/MegatronModel.md
- Added Chinese documentation for MegatronModel.
docs/source/组件/模型/MultiLoraMegatronModel.md
- Added Chinese documentation for MultiLoraMegatronModel.
docs/source/组件/模型/MultiLoraTransformersModel.md
- Added Chinese documentation for MultiLoraTransformersModel.
docs/source/组件/模型/TransformersModel.md
- Added Chinese documentation for TransformersModel.
docs/source/组件/模型/TwinkleModel.md
- Added Chinese documentation for TwinkleModel.
docs/source/组件/模型/index.rst
- Added Chinese documentation index for model.
docs/source/组件/模板/Template.md
- Added Chinese documentation for Template.
docs/source/组件/模板/index.rst
- Added Chinese documentation index for template.
docs/source/组件/组件化/Plugin.md
- Added Chinese documentation for Plugin.
docs/source/组件/组件化/index.rst
- Added Chinese documentation index for plugin.
docs/source/组件/补丁/Patch.md
- Added Chinese documentation for Patch.
docs/source/组件/补丁/index.rst
- Added Chinese documentation index for patch.
docs/source/组件/训练中间件/DeviceMesh和DeviceGroup.md
- Added Chinese documentation for DeviceMesh and DeviceGroup.
docs/source/组件/训练中间件/RemoteClass.md
- Added Chinese documentation for RemoteClass.
docs/source/组件/训练中间件/index.rst
- Added Chinese documentation index for training middleware.
docs/source_en/.readthedocs.yaml
- Added ReadTheDocs configuration for English documentation.
docs/source_en/Components/Advantage/Advantage.md
- Added English documentation for Advantage.
docs/source_en/Components/Advantage/GRPOAdvantage.md
- Added English documentation for GRPOAdvantage.
docs/source_en/Components/Advantage/RLOOAdvantage.md
- Added English documentation for RLOOAdvantage.
docs/source_en/Components/Advantage/index.rst
- Added English documentation index for Advantage.
docs/source_en/Components/Checkpoint Engine/CheckpointEngine.md
- Added English documentation for CheckpointEngine.
docs/source_en/Components/Checkpoint Engine/HCCLCheckpointEngine.md
- Added English documentation for HCCLCheckpointEngine.
docs/source_en/Components/Checkpoint Engine/NCCLCheckpointEngine.md
- Added English documentation for NCCLCheckpointEngine.
docs/source_en/Components/Checkpoint Engine/index.rst
- Added English documentation index for Checkpoint Engine.
docs/source_en/Components/Data Format/InputFeature.md
- Added English documentation for InputFeature.
docs/source_en/Components/Data Format/Message.md
- Added English documentation for Message.
docs/source_en/Components/Data Format/ModelOutput.md
- Added English documentation for ModelOutput.
docs/source_en/Components/Data Format/Output.md
- Added English documentation for Output.
docs/source_en/Components/Data Format/Sampling.md
- Added English documentation for Sampling.
docs/source_en/Components/Data Format/Trajectory.md
- Added English documentation for Trajectory.
docs/source_en/Components/Data Format/index.rst
- Added English documentation index for Data Format.
docs/source_en/Components/Data Loading/DataLoader.md
- Added English documentation for DataLoader.
docs/source_en/Components/Data Loading/index.rst
- Added English documentation index for Data Loading.
docs/source_en/Components/Dataset/Dataset.md
- Added English documentation for Dataset.
docs/source_en/Components/Dataset/IterableDataset.md
- Added English documentation for IterableDataset.
docs/source_en/Components/Dataset/IterablePackingDataset.md
- Added English documentation for IterablePackingDataset.
docs/source_en/Components/Dataset/LazyDataset.md
- Added English documentation for LazyDataset.
docs/source_en/Components/Dataset/PackingDataset.md
- Added English documentation for PackingDataset.
docs/source_en/Components/Dataset/index.rst
- Added English documentation index for Dataset.
docs/source_en/Components/Kernel/Kernel.md
- Added English documentation for Kernel.
docs/source_en/Components/Kernel/index.rst
- Added English documentation index for Kernel.
docs/source_en/Components/LRScheduler/CosineWarmupScheduler.md
- Added English documentation for CosineWarmupScheduler.
docs/source_en/Components/LRScheduler/LinearWarmupScheduler.md
- Added English documentation for LinearWarmupScheduler.
docs/source_en/Components/LRScheduler/index.rst
- Added English documentation index for LRScheduler.
docs/source_en/Components/Loss/Building-Loss.md
- Added English documentation for building loss.
docs/source_en/Components/Loss/CrossEntropy.md
- Added English documentation for CrossEntropyLoss.
docs/source_en/Components/Loss/index.rst
- Added English documentation index for loss.
docs/source_en/Components/Metrics/Accuracy.md
- Added English documentation for Accuracy metric.
docs/source_en/Components/Metrics/Building-Metrics.md
- Added English documentation for building metrics.
docs/source_en/Components/Metrics/LossMetric.md
- Added English documentation for LossMetric.
docs/source_en/Components/Metrics/TrainMetric.md
- Added English documentation for TrainMetric.
docs/source_en/Components/Metrics/index.rst
- Added English documentation index for metrics.
docs/source_en/Components/Model/MegatronModel.md
- Added English documentation for MegatronModel.
docs/source_en/Components/Model/MultiLoraMegatronModel.md
- Added English documentation for MultiLoraMegatronModel.
docs/source_en/Components/Model/MultiLoraTransformersModel.md
- Added English documentation for MultiLoraTransformersModel.
docs/source_en/Components/Model/TransformersModel.md
- Added English documentation for TransformersModel.
docs/source_en/Components/Model/TwinkleModel.md
- Added English documentation for TwinkleModel.
docs/source_en/Components/Model/index.rst
- Added English documentation index for model.
docs/source_en/Components/Patch/Patch.md
- Added English documentation for Patch.
docs/source_en/Components/Patch/index.rst
- Added English documentation index for patch.
docs/source_en/Components/Plugin/Plugin.md
- Added English documentation for Plugin.
docs/source_en/Components/Plugin/index.rst
- Added English documentation index for plugin.
docs/source_en/Components/Preprocessor and Filter/Filter.md
- Added English documentation for Filter.
docs/source_en/Components/Preprocessor and Filter/Preprocessor.md
- Added English documentation for Preprocessor.
docs/source_en/Components/Preprocessor and Filter/index.rst
- Added English documentation index for preprocessor and filter.
docs/source_en/Components/Reward/Reward.md
- Added English documentation for Reward.
docs/source_en/Components/Reward/index.rst
- Added English documentation index for Reward.
docs/source_en/Components/Sampler/Sampler.md
- Added English documentation for Sampler.
docs/source_en/Components/Sampler/TorchSampler.md
- Added English documentation for TorchSampler.
docs/source_en/Components/Sampler/index.rst
- Added English documentation index for sampler.
docs/source_en/Components/Sampler/vLLMSampler.md
- Added English documentation for vLLMSampler.
docs/source_en/Components/Task Processor/InputProcessor.md
- Added English documentation for InputProcessor.
docs/source_en/Components/Task Processor/index.rst
- Added English documentation index for task processor.
docs/source_en/Components/Template/Template.md
- Added English documentation for Template.
docs/source_en/Components/Template/index.rst
- Added English documentation index for template.
docs/source_en/Components/Training Middleware/DeviceMesh-and-DeviceGroup.md
- Added English documentation for DeviceMesh and DeviceGroup.
docs/source_en/Components/Training Middleware/RemoteClass.md
- Added English documentation for RemoteClass.
docs/source_en/Components/Training Middleware/index.rst
- Added English documentation index for training middleware.
docs/source_en/conf.py
- Added a Sphinx configuration file for English documentation.
docs/source_en/index.rst
- Added a Sphinx index file for English documentation.
examples/expert_parallel/train_qwen3_30b_ep_fsdp_demo.py
- Added an expert parallel FSDP training demo for Qwen3-30B.
pyproject.toml
- Updated project dependencies and added optional dependencies for transformers, megatron, vllm, ray, and docs.
src/twinkle/__init__.py
- Updated lazy import structure to include new utilities and infra modules.
src/twinkle/advantage/__init__.py
- Added Advantage module, including GRPOAdvantage and RLOOAdvantage, and temporary helper functions.
src/twinkle/advantage/base.py
- Added a base class for advantage functions.
src/twinkle/advantage/grpo.py
- Added GRPO advantage function implementation.
src/twinkle/advantage/rloo.py
- Added RLOO advantage function implementation.
src/twinkle/checkpoint_engine/__init__.py
- Added CheckpointEngine module, including NCCLCheckpointEngine and HCCLCheckpointEngine, and manager/mixin classes.
src/twinkle/checkpoint_engine/base.py
- Added a base class for checkpoint engines.
src/twinkle/checkpoint_engine/hccl_checkpoint_engine.py
- Added an HCCL-based checkpoint engine for Ascend NPU.
src/twinkle/checkpoint_engine/manager.py
- Added CheckpointEngineManager for weight synchronization.
src/twinkle/checkpoint_engine/mixin.py
- Added CheckpointEngineMixin for integrating checkpoint engine functionality into models/samplers.

Ignored Files

Ignored by pattern: .github/workflows/** (5)
- .github/workflows/citest.yaml
- .github/workflows/citest_npu.yaml
- .github/workflows/close_tale_issue.yaml
- .github/workflows/lint.yaml
- .github/workflows/publish.yaml

Activity

The pull request was created by meichangsu1.
The title 'sp fix ci test hang' indicates an intention to resolve issues causing CI tests to hang.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request addresses a hanging issue in the sequence parallelism CI tests. The change correctly modifies the test workers to use the already initialized sequence parallel process group (`sequence_parallel._sp_group`) instead of re-creating it. This avoids a redundant and potentially problematic call to `_get_sp_group_from_device_mesh`. The fix is applied to both the `_run_worker_single_attn` and `_run_worker_single_attn_fsdp` functions, which should resolve the test instability. The change is correct and improves the robustness of the tests.
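
The review does not say why the extra call is problematic. One plausible cause, stated here as an assumption rather than a finding from this PR, is that group creation is a collective operation: `torch.distributed.new_group`, for example, must be entered by every process in the job. The sketch below imitates that behavior with a `threading.Barrier` and plain threads; `run` and its parameters are hypothetical names, and a timeout stands in for what would otherwise be an indefinite hang.

```python
import threading


def run(world_size, ranks_entering, timeout=0.5):
    """Return each rank's outcome when only `ranks_entering` join the collective."""
    # The barrier stands in for a collective: it completes only when
    # all `world_size` participants have entered it.
    barrier = threading.Barrier(world_size)
    results = {}

    def worker(rank):
        if rank not in ranks_entering:
            results[rank] = "skipped"
            return
        try:
            # A real collective would block forever; the timeout keeps this finite.
            barrier.wait(timeout=timeout)
            results[rank] = "completed"
        except threading.BrokenBarrierError:
            results[rank] = "timed out"

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(world_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


all_ranks = run(world_size=2, ranks_entering={0, 1})
one_rank = run(world_size=2, ranks_entering={0})
print(all_ranks)
print(one_rank)
```

When every rank enters, the collective completes. When only one rank enters, that rank waits until the timeout expires, which is the simulated equivalent of a hung CI job. Reusing an already created group sidesteps the collective entirely.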

meichangsu1 changed the base branch from main to dev February 11, 2026 07:43

gemini-code-assist bot reviewed Feb 11, 2026

feat(tests): remove unused import in sequence parallel test

bcfb465

Remove `_get_sp_group_from_device_mesh` import from test file as it is no longer used in the test, cleaning up imports and improving code clarity.

meichangsu1 merged commit 297b312 into dev Feb 11, 2026
0 of 3 checks passed

meichangsu1 merged 2 commits into dev from seq_unitest_ljl_fix
