@@ -51,7 +51,7 @@ be reused in [ms-swift](https://github.com/modelscope/ms-swift).
 pip install 'twinkle-kit'
 ```
 
-### Installation from Source:
+### Install from Source:
 
 ```shell
 git clone https://github.com/modelscope/twinkle.git
@@ -75,13 +75,14 @@ pip install -e .
 
 ## Changelog
 
-- 🎉2026-02-10 Initial version of Twinkle✨ released, including SFT/PT/RL for text models and serverless training capabilities on [ModelScope](https://modelscope.cn).
+- 🎉2026-02-13 Initial version of Twinkle✨ released, including SFT/PT/RL support for text models and serverless training capabilities on [ModelScope](https://modelscope.cn).
 
-# ModelScope Community
+## Training as a Service on ModelScope
 
-## ModelScope Official Environment
-
-The ModelScope community provides an official environment for running Twinkle. The API endpoint is: [base_url](https://www.modelscope.cn/twinkle). Developers can refer to our [documentation](docs/source_en/Usage%20Guide/ModelScope-Official-Resources.md) for usage instructions.
+We are rolling out a training service built atop Twinkle✨ on ModelScope. It is currently in _Beta_. You may
+sign up for free access by joining the [Twinkle-Explorers](https://modelscope.cn/organization/twinkle-explorers) organization, and
+train via the API endpoint `base_url=https://www.modelscope.cn/twinkle`. For more details, please refer to
+our [documentation](docs/source_en/Usage%20Guide/ModelScope-Official-Resources.md).
 
 ## Supported Hardware
 
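The Beta service above is reached by pointing a client at the published `base_url` with a ModelScope token. A minimal sketch of that setup, assuming the `MODELSCOPE_TOKEN` environment variable used elsewhere in this README (the variable names here are illustrative, not part of Twinkle's API):

```python
import os

# Endpoint of the Beta training service, as documented above
base_url = "https://www.modelscope.cn/twinkle"

# Token from your ModelScope account; env var name matches the example later in this README
api_key = os.environ.get("MODELSCOPE_TOKEN", "")

if not api_key:
    print("Set MODELSCOPE_TOKEN before calling the training service")
```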
@@ -95,29 +96,33 @@ The ModelScope community provides an official environment for running Twinkle. T
 ## Supported Models
 
 We will be adding support for more models as new models are released. The following table lists current models
-supported on Twinkle✨ framework. However, the models supported on our serverless training backend may be a
-much smaller subset. Please refer to the [doc](link) section for more information.
-
-| Model Type | Model ID on[ModelScope](https://modelscope.cn) | Requires | Megatron Support | HF Model ID |
-| ------------------- | ---------------- | -------------------- | ---------------- | ---------------- |
-| qwen3 series | [Qwen/Qwen3-0.6B-Base](https://modelscope.cn/models/Qwen/Qwen3-0.6B-Base) ~ 32B | transformers>=4.51 | ✔ | [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base) |
-| qwen3_moe series | [Qwen/Qwen3-30B-A3B-Base](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Base) | transformers>=4.51 | ✔ | [Qwen/Qwen3-30B-A3B-Base](https://huggingface.co/Qwen/Qwen3-30B-A3B-Base) |
-| | [Qwen/Qwen3-30B-A3B](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B) ~ 235B | transformers>=4.51 | ✔ | [Qwen/Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B) |
-| qwen2 series | [Qwen/Qwen2-0.5B-Instruct](https://modelscope.cn/models/Qwen/Qwen2-0.5B-Instruct) ~ 72B | transformers>=4.37 | ✔ | [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) |
-| | [Qwen/Qwen2.5-0.5B-Instruct](https://modelscope.cn/models/Qwen/Qwen2.5-0.5B-Instruct) ~ 72B | transformers>=4.37 | ✔ | [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) |
-| | [Qwen/Qwen2.5-0.5B](https://modelscope.cn/models/Qwen/Qwen2.5-0.5B) ~ 72B | transformers>=4.37 | ✔ | [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) |
-| qwen2_moe series | [Qwen/Qwen1.5-MoE-A2.7B-Chat](https://modelscope.cn/models/Qwen/Qwen1.5-MoE-A2.7B-Chat) | transformers>=4.40 | ✔ | [Qwen/Qwen1.5-MoE-A2.7B-Chat](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B-Chat) |
+supported on the Twinkle✨ framework.
+
+> [!NOTE]
+> The serverless training service accessed via `base_url=https://www.modelscope.cn/twinkle` currently supports
+> one training base at a time; at present it is [Qwen3-30B-A3B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507).
+
+
+| Model Type | Model ID on [ModelScope](https://modelscope.cn) | Requires | Megatron Support | HF Model ID |
+| ------------------- | ---------------- | -------------------- | ---------------- | ---------------- |
+| qwen3 series | [Qwen/Qwen3-0.6B-Base](https://modelscope.cn/models/Qwen/Qwen3-0.6B-Base) ~ 32B | transformers>=4.51 | ✅ | [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base) |
+| qwen3_moe series | [Qwen/Qwen3-30B-A3B-Base](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Base) | transformers>=4.51 | ✅ | [Qwen/Qwen3-30B-A3B-Base](https://huggingface.co/Qwen/Qwen3-30B-A3B-Base) |
+| | [Qwen/Qwen3-30B-A3B](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B) ~ 235B | transformers>=4.51 | ✅ | [Qwen/Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B) |
+| qwen2 series | [Qwen/Qwen2-0.5B-Instruct](https://modelscope.cn/models/Qwen/Qwen2-0.5B-Instruct) ~ 72B | transformers>=4.37 | ✅ | [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) |
+| | [Qwen/Qwen2.5-0.5B-Instruct](https://modelscope.cn/models/Qwen/Qwen2.5-0.5B-Instruct) ~ 72B | transformers>=4.37 | ✅ | [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) |
+| | [Qwen/Qwen2.5-0.5B](https://modelscope.cn/models/Qwen/Qwen2.5-0.5B) ~ 72B | transformers>=4.37 | ✅ | [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) |
+| qwen2_moe series | [Qwen/Qwen1.5-MoE-A2.7B-Chat](https://modelscope.cn/models/Qwen/Qwen1.5-MoE-A2.7B-Chat) | transformers>=4.40 | ✅ | [Qwen/Qwen1.5-MoE-A2.7B-Chat](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B-Chat) |
 | chatglm4 series | [ZhipuAI/glm-4-9b-chat](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat) | transformers>=4.42 | ✘ | [zai-org/glm-4-9b-chat](https://huggingface.co/zai-org/glm-4-9b-chat) |
 | | [ZhipuAI/LongWriter-glm4-9b](https://modelscope.cn/models/ZhipuAI/LongWriter-glm4-9b) | transformers>=4.42 | ✘ | [zai-org/LongWriter-glm4-9b](https://huggingface.co/zai-org/LongWriter-glm4-9b) |
 | glm_edge series | [ZhipuAI/glm-edge-1.5b-chat](https://modelscope.cn/models/ZhipuAI/glm-edge-1.5b-chat) | transformers>=4.46 | ✘ | [zai-org/glm-edge-1.5b-chat](https://huggingface.co/zai-org/glm-edge-1.5b-chat) |
 | | [ZhipuAI/glm-edge-4b-chat](https://modelscope.cn/models/ZhipuAI/glm-edge-4b-chat) | transformers>=4.46 | ✘ | [zai-org/glm-edge-4b-chat](https://huggingface.co/zai-org/glm-edge-4b-chat) |
 | internlm2 series | [Shanghai_AI_Laboratory/internlm2-1_8b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-1_8b) | transformers>=4.38 | ✘ | [internlm/internlm2-1_8b](https://huggingface.co/internlm/internlm2-1_8b) |
 | | [Shanghai_AI_Laboratory/internlm2-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b) | transformers>=4.38 | ✘ | [internlm/internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b) |
-| deepseek_v1 | [deepseek-ai/deepseek-vl-7b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-vl-7b-chat) | transformers>=4.39.4 | ✔ | —— |
-| | [deepseek-ai/DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite) | transformers>=4.39.3 | ✔ | [deepseek-ai/DeepSeek-V2-Lite](https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite) |
-| | [deepseek-ai/DeepSeek-V2.5](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2.5) | transformers>=4.39.3 | ✔ | [deepseek-ai/DeepSeek-V2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5) |
-| | [deepseek-ai/DeepSeek-R1](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1) | transformers>=4.39.3 | ✔ | [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) |
-| deepSeek-r1-distill | [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) ~ 32B | transformers>=4.37 | ✔ | [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) |
+| deepseek_v1 | [deepseek-ai/deepseek-vl-7b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-vl-7b-chat) | transformers>=4.39.4 | ✅ | —— |
+| | [deepseek-ai/DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite) | transformers>=4.39.3 | ✅ | [deepseek-ai/DeepSeek-V2-Lite](https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite) |
+| | [deepseek-ai/DeepSeek-V2.5](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2.5) | transformers>=4.39.3 | ✅ | [deepseek-ai/DeepSeek-V2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5) |
+| | [deepseek-ai/DeepSeek-R1](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1) | transformers>=4.39.3 | ✅ | [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) |
+| deepseek-r1-distill | [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) ~ 32B | transformers>=4.37 | ✅ | [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) |
 
 For a more detailed model support list 👉 [Quick Start.md](https://github.com/modelscope/twinkle/blob/dev/docs/source/%E4%BD%BF%E7%94%A8%E6%8C%87%E5%BC%95/%E5%BF%AB%E9%80%9F%E5%BC%80%E5%A7%8B.md)
 
@@ -141,18 +146,20 @@ twinkle.initialize(mode='ray', groups=device_group, global_device_mesh=device_me
 
 
 def train():
+    # To load a model from Hugging Face, use 'hf://...'
+    base_model = 'ms://Qwen/Qwen2.5-7B-Instruct'
     # 1000 samples
     dataset = Dataset(dataset_meta=DatasetMeta('ms://swift/self-cognition', data_slice=range(1000)))
     # Set template to prepare encoding
-    dataset.set_template('Template', model_id='ms://Qwen/Qwen2.5-7B-Instruct')
+    dataset.set_template('Template', model_id=base_model)
     # Preprocess the dataset to the standard format
     dataset.map(SelfCognitionProcessor('twinkle LLM', 'ModelScope Community'))
     # Encode the dataset
     dataset.encode()
     # Global batch size = 8 for 8 GPUs, so 1 sample per GPU
     dataloader = DataLoader(dataset=dataset, batch_size=8, min_batch_size=8)
     # Use a TransformersModel
-    model = TransformersModel(model_id='ms://Qwen/Qwen2.5-7B-Instruct', remote_group='default')
+    model = TransformersModel(model_id=base_model, remote_group='default')
 
     lora_config = LoraConfig(
         r=8,
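The dataloader comment in the hunk above notes that a global batch size of 8 spread over 8 GPUs leaves 1 sample per GPU. A minimal sketch of that sharding arithmetic (the function and names are illustrative only, not part of Twinkle's API):

```python
def shard_batch(batch, world_size):
    """Split a global batch evenly across workers.

    Illustrative only -- assumes the global batch size is
    divisible by the number of workers.
    """
    assert len(batch) % world_size == 0, "global batch must divide evenly"
    per_worker = len(batch) // world_size
    return [batch[i * per_worker:(i + 1) * per_worker] for i in range(world_size)]

# Global batch size 8 across 8 GPUs -> 1 sample per GPU
shards = shard_batch(list(range(8)), world_size=8)
print([len(s) for s in shards])  # -> [1, 1, 1, 1, 1, 1, 1, 1]
```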
@@ -184,7 +191,7 @@ if __name__ == '__main__':
     train()
 ```
 
-### Tinker-Like Remote API
+### Using the Tinker-Like API
 
 ```python
 import os
@@ -196,17 +203,19 @@ from twinkle.dataset import Dataset, DatasetMeta
 from twinkle.preprocessor import SelfCognitionProcessor
 from twinkle.server.tinker.common import input_feature_to_datum
 
-base_model = "Qwen/Qwen2.5-0.5B-Instruct"
+base_model = 'ms://Qwen/Qwen2.5-0.5B-Instruct'
+base_url = 'https://www.modelscope.cn/twinkle'
+api_key = os.environ.get('MODELSCOPE_TOKEN')
 
 # Use twinkle dataset to load the data
 dataset = Dataset(dataset_meta=DatasetMeta('ms://swift/self-cognition', data_slice=range(500)))
-dataset.set_template('Template', model_id=f'ms://{base_model}', max_length=256)
+dataset.set_template('Template', model_id=base_model, max_length=256)
 dataset.map(SelfCognitionProcessor('twinkle Model', 'twinkle Team'), load_from_cache_file=False)
 dataset.encode(batched=True, load_from_cache_file=False)
 dataloader = DataLoader(dataset=dataset, batch_size=8)
 
 # Initialize tinker client
-service_client = init_tinker_compat_client(base_url='http://www.modelscope.cn/twinkle', api_key=os.environ.get('MODELSCOPE_SDK_TOKEN'))
+service_client = init_tinker_compat_client(base_url=base_url, api_key=api_key)
 training_client = service_client.create_lora_training_client(base_model=base_model, rank=16)
 
 # Training loop: use input_feature_to_datum to convert the input format
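The training loop in this section issues asynchronous calls and blocks on `.result()`. The stand-in client below mimics that call pattern so the control flow can be seen in isolation; every name here is hypothetical, not the actual Twinkle or Tinker API:

```python
class _Future:
    """Minimal stand-in for the futures the training client returns."""
    def __init__(self, value):
        self._value = value

    def result(self):
        return self._value


class FakeTrainingClient:
    """Hypothetical stand-in; real calls go through create_lora_training_client()."""
    def __init__(self):
        self.steps = 0

    def forward_backward(self, datums):
        # Pretend the loss shrinks as optimizer steps accumulate
        return _Future({"loss": 1.0 / (self.steps + 1)})

    def optim_step(self):
        self.steps += 1
        return _Future(None)


# Same shape as the real loop: submit work, then await it with .result()
client = FakeTrainingClient()
for _ in range(3):
    out = client.forward_backward(["datum"]).result()
    client.optim_step().result()
```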
@@ -223,12 +232,6 @@ for epoch in range(3):
     training_client.save_state(f"twinkle-lora-{epoch}").result()
 ```
 
-Launch training:
-
-```shell
-python3 train.py
-```
-
 ## Architecture Design
 
 <img src="assets/framework.jpg" style="max-width: 500px; width: 100%;" />