Commit 8ecf2ae

fix docs

1 parent 2bd0e17 commit 8ecf2ae

File tree

3 files changed

+57
-32
lines changed

@@ -1,25 +1,37 @@
-# ModelScope Free Resources
+# ModelScope Official Environment
 
-Alongside the open-source release of the Twinkle framework, we provide freely available RL training resources on the [ModelScope Community website](https://www.modelscope.cn). Developers only need to pass in a ModelScope SDK token to train **for free**.
+Alongside the open-source release of the Twinkle framework, we provide sample training resources on the [ModelScope Community website](https://www.modelscope.cn). Developers can conduct training using their ModelScope Token.
 
-The model currently running on the cluster is . Below are the detailed usage instructions:
+The model currently running on the cluster is [Qwen/Qwen3-30B-A3B-Instruct-2507](https://www.modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507). Below are the detailed usage instructions:
 
 ## Step 1. Register a ModelScope Account
 
 Developers first need to register as a ModelScope user and use the ModelScope community token for API calls.
 
 Registration URL: https://www.modelscope.cn/
 
-Token can be obtained here: https://www.modelscope.cn/my/access/token. Copy the access token and use it in the SDK.
+Token can be obtained here: https://www.modelscope.cn/my/access/token. Copy your access token.
 
 ## Step 2. Join the twinkle-explorers Organization
 
-Currently, the remote training capability of twinkle-kit is in beta testing. Developers need to join the [twinkle-explorers](https://www.modelscope.cn/models/twinkle-explorers) organization; users within the organization can use and test it in the early stage.
-There is no barrier to applying for and joining this organization; it is currently used only for traffic control and defect feedback before launch. We will remove this restriction once the project is stable.
+Currently, the remote training capability of twinkle-kit is in beta testing. Developers need to join the [twinkle-explorers](https://www.modelscope.cn/models/twinkle-explorers) organization; users within the organization can use and test it.
 
 ## Step 3. Review the Cookbook and Customize Your Development
 
 We strongly recommend that developers review our [cookbook](https://github.com/modelscope/twinkle/tree/main/cookbook/client) and build upon the training code provided there.
 
 Developers can customize datasets, advantage functions, rewards, templates, and more. However, the Loss component is not currently customizable (for security reasons) since it needs to be executed on the server side.
-If you need support for additional custom Loss functions, you can upload your Loss implementation to ModelHub and contact us through the Q&A group or via GitHub issues. We will add the corresponding component to the whitelist for your use.
+If you need support for additional custom Loss functions, you can upload your Loss implementation to ModelHub and contact us through the Q&A group or via GitHub issues. We will add the corresponding component to the whitelist for your use.
+
+## Appendix: Supported Training Methods
+
+This model is a text-only model, so multimodal tasks are not supported at this time. For text-only tasks, you can train using:
+
+1. Standard PT/SFT training methods, including Agentic training
+2. Self-sampling RL algorithms such as GRPO/RLOO
+3. Distillation methods like GKD/On-policy. Since the official ModelScope environment only supports a single model, developers need to prepare the other Teacher/Student model themselves
+
+The current official environment only supports LoRA training, with the following requirements:
+
+1. Maximum rank = 32
+2. modules_to_save is not supported
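The token workflow from Step 1 can be sketched on the client side. This is a minimal illustration only: the environment-variable name `MODELSCOPE_API_TOKEN` and the `get_modelscope_token` helper are assumptions for the sketch, not part of the Twinkle or ModelScope APIs.

```python
import os

# Hypothetical helper: read the ModelScope access token (copied from
# https://www.modelscope.cn/my/access/token) out of an environment
# variable, so it never gets hard-coded into training scripts.
def get_modelscope_token(env_var: str = "MODELSCOPE_API_TOKEN") -> str:
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set {env_var} to the access token from your ModelScope account."
        )
    return token
```

Keeping the token in the environment also makes it easy to rotate without touching the training code.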

docs/source_en/Usage Guide/ModelScope-Free-Resources.md

Lines changed: 0 additions & 25 deletions
This file was deleted.
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+# ModelScope Official Environment
+
+Alongside the open-source release of the Twinkle framework, we provide sample training resources on the [ModelScope Community website](https://www.modelscope.cn). Developers can conduct training using their ModelScope Token.
+
+The model currently running on the cluster is [Qwen/Qwen3-30B-A3B-Instruct-2507](https://www.modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507). Below are the detailed usage instructions:
+
+## Step 1. Register a ModelScope Account
+
+Developers first need to register as a ModelScope user and use the ModelScope community token for API calls.
+
+Registration URL: https://www.modelscope.cn/
+
+Token can be obtained here: https://www.modelscope.cn/my/access/token — copy your access token.
+
+## Step 2. Join the twinkle-explorers Organization
+
+Currently, the remote training capability of twinkle-kit is in beta testing. Developers need to join the [twinkle-explorers](https://www.modelscope.cn/models/twinkle-explorers) organization. Users within the organization can access and test these features.
+
+## Step 3. Review the Cookbook and Customize Your Development
+
+We strongly recommend that developers review our [cookbook](https://github.com/modelscope/twinkle/tree/main/cookbook/client) and build upon the training code provided there.
+
+Developers can customize datasets, advantage functions, rewards, templates, and more. However, the Loss component is not currently customizable (for security reasons) since it needs to be executed on the server side.
+
+If you need support for additional custom Loss functions, you can upload your Loss implementation to ModelHub and contact us through the Q&A group or via GitHub issues. We will add the corresponding component to the whitelist for your use.
+
+## Appendix: Supported Training Methods
+
+This model is a text-only model, so multimodal tasks are not supported at this time. For text-only tasks, you can train using:
+
+1. Standard PT/SFT training methods, including Agentic training
+2. Self-sampling RL algorithms such as GRPO/RLOO
+3. Distillation methods like GKD/On-policy. Since the official ModelScope environment only supports a single model, developers need to prepare the other Teacher/Student model themselves
+
+The current official environment only supports LoRA training, with the following requirements:
+
+1. Maximum rank = 32
+2. modules_to_save is not supported
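The two LoRA limits listed in the appendix can be checked locally before submitting a job. The sketch below is hedged: `validate_lora_config` and the dict-style config are hypothetical illustrations, not part of twinkle-kit; only the two documented limits (rank at most 32, no `modules_to_save`) are encoded.

```python
# Hypothetical pre-flight check mirroring the documented limits of the
# official environment: LoRA rank <= 32, and modules_to_save unsupported.
MAX_LORA_RANK = 32

def validate_lora_config(cfg: dict) -> None:
    rank = cfg.get("r", 8)  # 8 is a common LoRA default rank (assumption)
    if rank > MAX_LORA_RANK:
        raise ValueError(
            f"LoRA rank {rank} exceeds the maximum of {MAX_LORA_RANK}"
        )
    if cfg.get("modules_to_save"):
        raise ValueError(
            "modules_to_save is not supported in the official environment"
        )

validate_lora_config({"r": 32})  # rank exactly at the limit is allowed
```

Failing fast on the client saves a round trip to the remote cluster when a config would be rejected anyway.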
