
Commit 1b2ab01

Merge branch 'main' into npu_adapt_doc
2 parents 09c6202 + 4d6ebeb commit 1b2ab01

116 files changed

Lines changed: 3348 additions & 1337 deletions


README.md

Lines changed: 5 additions & 5 deletions
```diff
@@ -19,7 +19,7 @@ by <a href="https://modelscope.cn/home">ModelScope</a>
 </p>
 
 <p align="center">
-<a href="https://twinkle-kit.readthedocs.io/en/latest/">English Documentation</a> &nbsp | &nbsp <a href="https://twinkle-kit.readthedocs.io/zh-cn/latest/">中文文档</a> &nbsp
+<a href="https://twinkle-kit.readthedocs.io/en/latest/">English Documentation</a> &nbsp | &nbsp <a href="https://twinkle-kit.readthedocs.io/zh-cn/latest/">中文文档</a> &nbsp | &nbsp <a href="https://modelscope.github.io/twinkle-web/">Twinkle Web</a> &nbsp
 </p>
 
 ## ✨ What is Twinkle?
@@ -101,9 +101,9 @@ Or use ModelScope's [official image](https://www.modelscope.cn/docs/intro/enviro
 
 ## Changelog
 
+- 🎉2026-03-24 Twinkle Web site is now live at https://modelscope.github.io/twinkle-web/
+- 🎉2026-03-19 Support GKD training, please refer to this [cookbook](cookbook/rl/gkd_on_policy.py).
 - 🎉2026-02-13 Initial version of Twinkle✨ released, including SFT/PT/RL support for text models.
-We also made available serverless training capabilities on [ModelScope](https://modelscope.cn) via
-Tinker-compatible APIs.
 
 ## Training as a Service on ModelScope
 
@@ -130,7 +130,7 @@ supported on Twinkle✨ framework.
 > For serverless training service accessed via `base_url=https://www.modelscope.cn/twinkle`, it
 > is currently provided via the Tinker-compatible APIs. We will be rolling out services that support
 > both Tinker APIs, as well as the full-fledged Twinkle✨ native APIs. The serverless endpoint is backed
-> by one training base at a time, and currently it is [Qwen3-30B-A3B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507).
+> by one training base at a time, and currently it is [Qwen3.5-4B](https://modelscope.cn/models/Qwen/Qwen3.5-4B).
 
 | Model Type | Model ID on [ModelScope](https://modelscope.cn) | Model Size | Requires | Support Megatron | HF Model ID |
 |---------------------|-----------------------------------------------------------------------------------------------------------------|:---------------------------------------:|----------------------|:----------------:|:---------------------------------------------------------------------------------------------------------:|
@@ -235,7 +235,7 @@ from twinkle.dataset import Dataset, DatasetMeta
 from twinkle.preprocessor import SelfCognitionProcessor
 from twinkle.server.common import input_feature_to_datum
 
-base_model = 'ms://Qwen/Qwen3-30B-A3B-Instruct-2507'
+base_model = 'ms://Qwen/Qwen3.5-4B'
 base_url='your-base-url'
 api_key='your-api-key'
```
README_ZH.md

Lines changed: 5 additions & 3 deletions
```diff
@@ -19,7 +19,7 @@
 </p>
 
 <p align="center">
-<a href="https://twinkle-kit.readthedocs.io/en/latest/">英文文档</a> &nbsp | &nbsp <a href="https://twinkle-kit.readthedocs.io/zh-cn/latest/">中文文档</a> &nbsp
+<a href="https://twinkle-kit.readthedocs.io/en/latest/">英文文档</a> &nbsp | &nbsp <a href="https://twinkle-kit.readthedocs.io/zh-cn/latest/">中文文档</a> &nbsp | &nbsp <a href="https://modelscope.github.io/twinkle-web/">Twinkle 站点</a> &nbsp
 </p>
 
 ## ✨ Twinkle 是什么?
@@ -91,6 +91,8 @@ Twinkle✨支持相同的算法接口运行在单GPU、torchrun多机、Ray、Cl
 
 ## 更新日志
 
+🎉2026-03-24 Twinkle 站点上线,访问地址 https://modelscope.github.io/twinkle-web/
+🎉2026-03-19 支持GKD蒸馏能力,参考[cookbook](cookbook/rl/gkd_on_policy.py)
 🎉2026-02-13 Twinkle✨ 初始版本发布,支持文本模型的SFT/PT/RL训练。我们还通过兼容Tinker的API,在魔搭社区上提供了无服务器训练功能。
 
 ## ModelScope 的训练服务
@@ -111,7 +113,7 @@ Twinkle✨支持相同的算法接口运行在单GPU、torchrun多机、Ray、Cl
 随着新模型的发布,我们将添加对更多模型的支持。下表列出了 Twinkle✨ 框架当前支持的模型。
 
 >[!Note]
-> 通过 `base_url=https://www.modelscope.cn/twinkle` 访问的无服务器训练服务,目前是通过兼容Tinker的API提供的。我们将陆续推出同时支持Tinker API和完整Twinkle✨原生 API的服务。无服务器端点每次由一个训练基座支持,目前使用的是[Qwen3-30B-A3B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507)
+> 通过 `base_url=https://www.modelscope.cn/twinkle` 访问的无服务器训练服务,目前是通过兼容Tinker的API提供的。我们将陆续推出同时支持Tinker API和完整Twinkle✨原生 API的服务。无服务器端点每次由一个训练基座支持,目前使用的是[Qwen3.5-4B](https://modelscope.cn/models/Qwen/Qwen3.5-4B)
 
 | Model Type | Model ID 举例 | Model Size | Requires | Support Megatron | HF Model ID |
 |---------------------|-----------------------------------------------------------------------------------------------------------------|:---------------------------------------:|----------------------|:----------------:|:---------------------------------------------------------------------------------------------------------:|
@@ -215,7 +217,7 @@ from twinkle.dataset import Dataset, DatasetMeta
 from twinkle.preprocessor import SelfCognitionProcessor
 from twinkle.server.common import input_feature_to_datum
 
-base_model = 'ms://Qwen/Qwen3-30B-A3B-Instruct-2507'
+base_model = 'ms://Qwen/Qwen3.5-4B'
 base_url='your-base-url'
 api_key='your-api-key'
```

assets/slogan.png

206 KB

client_tools/client_generator.py

Lines changed: 10 additions & 2 deletions
```diff
@@ -768,7 +768,7 @@ def sample(
         adapter_name: str = '',
         adapter_uri: Optional[str] = None,
         num_samples: int = 1,
-    ) -> SampleResponseModel:
+    ) -> List[SampleResponseModel]:
         """Sample from the model.
 
         Args:
@@ -795,7 +795,7 @@ def sample(
             json_data=json_data
         )
         response.raise_for_status()
-        return SampleResponseModel(**response.json())
+        return [SampleResponseModel(**r) for r in response.json()['samples']]
 
     def set_template(self, template_cls: str, adapter_name: str = '', **kwargs) -> SetTemplateResponse:
         """Set the template for encoding trajectories."""
@@ -805,6 +805,14 @@ def set_template(self, template_cls: str, adapter_name: str = '', **kwargs) -> S
         )
         response.raise_for_status()
         return SetTemplateResponse(**response.json())
+
+    def apply_patch(self, patch_cls: str, **kwargs) -> None:
+        """Apply a patch to the model."""
+        response = http_post(
+            url=f'{self.server_url}/apply_patch',
+            json_data={'patch_cls': patch_cls, 'adapter_name': self.adapter_name, **kwargs}
+        )
+        response.raise_for_status()
 '''
 
 # Write the sampler client file
```
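The `sample` change alters the client's return contract: callers that previously received a single `SampleResponseModel` now get a list built from the response's `samples` key. A minimal sketch of the new parsing behavior, using a hypothetical stand-in for the generated response model (the real class lives in the generated client file):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SampleResponseModel:
    # Hypothetical stand-in; the real model is generated into the client code.
    text: str


def parse_sample_response(payload: dict) -> List[SampleResponseModel]:
    # The endpoint now returns a batch under the 'samples' key, so the client
    # constructs one response model per sample instead of a single object.
    return [SampleResponseModel(**r) for r in payload['samples']]


responses = parse_sample_response({'samples': [{'text': 'a'}, {'text': 'b'}]})
```

Existing code that read attributes off the old single return value would need to index or iterate the returned list after this change.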
Lines changed: 6 additions & 0 deletions
```diff
@@ -0,0 +1,6 @@
+export RAY_ROTATION_MAX_BYTES=1024
+export RAY_ROTATION_BACKUP_COUNT=1
+CUDA_VISIBLE_DEVICES=0,1,2,3 ray start --head --port=6379 --num-gpus=4 --disable-usage-stats --include-dashboard=false
+CUDA_VISIBLE_DEVICES=4,5,6,7 ray start --address=127.0.0.1:6379 --num-gpus=4
+CUDA_VISIBLE_DEVICES="" ray start --address=127.0.0.1:6379 --num-gpus=0
+python server.py
```
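This new script pins GPUs 0-3 to the Ray head node, GPUs 4-7 to a second worker, and joins a CPU-only node before launching `server.py`. The per-node `CUDA_VISIBLE_DEVICES` values are a straight partition of the available GPU ids, sketched here with a hypothetical helper (not part of the repo):

```python
def gpu_groups(total_gpus: int, gpus_per_node: int) -> list:
    """Partition GPU indices into per-node CUDA_VISIBLE_DEVICES strings."""
    ids = list(range(total_gpus))
    return [
        ','.join(str(i) for i in ids[start:start + gpus_per_node])
        for start in range(0, total_gpus, gpus_per_node)
    ]


# Mirrors the script above: head node gets '0,1,2,3', the worker '4,5,6,7'.
groups = gpu_groups(8, 4)
```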

cookbook/client/server/megatron/server_config.yaml

Lines changed: 6 additions & 6 deletions
```diff
@@ -36,11 +36,11 @@ applications:
 
   # 3. Sampler Service - Runs inference / sampling using vLLM engine
   # Used for generating text from the model (e.g., evaluating LoRA results).
-  - name: sampler-Qwen3-30B-A3B-Instruct-2507
-    route_prefix: /api/v1/sampler/Qwen/Qwen3-30B-A3B-Instruct-2507
+  - name: sampler-Qwen3.5-4B
+    route_prefix: /api/v1/sampler/Qwen/Qwen3.5-4B
     import_path: sampler
     args:
-      model_id: "ms://Qwen/Qwen3-30B-A3B-Instruct-2507" # ModelScope model identifier
+      model_id: "ms://Qwen/Qwen3.5-4B" # ModelScope model identifier
       nproc_per_node: 4 # Number of GPU processes per node
       sampler_type: vllm # Inference engine: 'vllm' (fast) or 'torch' (TorchSampler)
       engine_args: # vLLM engine-specific settings
@@ -73,12 +73,12 @@ applications:
 
   # 2. Model Service (commented out) - Would host the base model for training.
   # Uncomment and configure if you need a training model worker.
-  - name: models-Qwen3-30B-A3B-Instruct-2507
-    route_prefix: /api/v1/model/Qwen/Qwen3-30B-A3B-Instruct-2507
+  - name: models-Qwen3.5-4B
+    route_prefix: /api/v1/model/Qwen/Qwen3.5-4B
     import_path: model
     args:
       use_megatron: true # Use HuggingFace Transformers backend
-      model_id: "ms://Qwen/Qwen3-30B-A3B-Instruct-2507" # ModelScope model identifier
+      model_id: "ms://Qwen/Qwen3.5-4B" # ModelScope model identifier
       max_length: 16000 # model max length
       max_loras: 5 # model max loras
       nproc_per_node: 4 # Number of GPU processes per node
```
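Each application block in this config pairs a service name and a `route_prefix` derived from the same ModelScope model id. A hypothetical sketch of that naming pattern (the real config is written by hand; this only illustrates the derivation):

```python
def service_routes(model_id: str) -> dict:
    # model_id looks like 'ms://Qwen/Qwen3.5-4B': strip the scheme, keep 'org/name'.
    org_and_name = model_id.removeprefix('ms://')
    short_name = org_and_name.split('/')[-1]
    return {
        'sampler_name': f'sampler-{short_name}',
        'sampler_route': f'/api/v1/sampler/{org_and_name}',
        'model_name': f'models-{short_name}',
        'model_route': f'/api/v1/model/{org_and_name}',
    }


routes = service_routes('ms://Qwen/Qwen3.5-4B')
```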

cookbook/client/server/megatron/server_config_4b.yaml

Lines changed: 1 addition & 0 deletions
```diff
@@ -39,6 +39,7 @@ applications:
     import_path: model
     args:
       use_megatron: true
+      model_cls: Qwen3_5ForConditionalGeneration
       model_id: "ms://Qwen/Qwen3.5-4B" # ModelScope model identifier
       max_length: 10240
       nproc_per_node: 2 # Number of GPU processes per node
```
Lines changed: 6 additions & 0 deletions
```diff
@@ -0,0 +1,6 @@
+export RAY_ROTATION_MAX_BYTES=1024
+export RAY_ROTATION_BACKUP_COUNT=1
+CUDA_VISIBLE_DEVICES=0,1,2,3 ray start --head --port=6379 --num-gpus=4 --disable-usage-stats --include-dashboard=false
+CUDA_VISIBLE_DEVICES=4,5,6,7 ray start --address=127.0.0.1:6379 --num-gpus=4
+CUDA_VISIBLE_DEVICES="" ray start --address=127.0.0.1:6379 --num-gpus=0
+python server.py
```

cookbook/client/tinker/modelscope/sample.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -16,7 +16,7 @@
 
 from tinker import ServiceClient
 
-base_model = 'Qwen/Qwen3-30B-A3B-Instruct-2507'
+base_model = 'Qwen/Qwen3.5-4B'
 base_url = 'http://www.modelscope.cn/twinkle'
 
 # Step 2: Define the base model and connect to the server
@@ -29,7 +29,7 @@
 # The model_path is a twinkle:// URI pointing to a previously saved LoRA checkpoint.
 # The server will load the base model and apply the LoRA adapter weights.
 sampling_client = service_client.create_sampling_client(
-    model_path='twinkle://xxx-Qwen_Qwen3-30B-A3B-Instruct-2507-xxx/weights/twinkle-lora-1',
+    model_path='twinkle://xxx-Qwen_Qwen3.5-4B-xxx/weights/twinkle-lora-1',
     base_model=base_model
 )
```
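In the `twinkle://` checkpoint URIs in this diff, the segment between the `xxx-` placeholders embeds the base model id with `/` flattened to `_` (the `xxx` run identifiers are elided in the example and come from the training service). A sketch of that observed convention:

```python
base_model = 'Qwen/Qwen3.5-4B'

# Observed convention in the example URIs: '/' in the model id becomes '_'.
model_segment = base_model.replace('/', '_')
```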

cookbook/client/tinker/modelscope/self_cognition.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -23,7 +23,7 @@
 from tinker import ServiceClient
 
 # The base model to fine-tune / evaluate
-base_model = 'Qwen/Qwen3-30B-A3B-Instruct-2507'
+base_model = 'Qwen/Qwen3.5-4B'
 base_url = 'http://www.modelscope.cn/twinkle'
 
```
2929
