modify cookbook (#63)

meichangsu1 · web-flow · commit fc0db7226112 · 2026-02-13T17:36:13.000+08:00
* feat(tests): replace manual sp_group retrieval with module attribute

Replace calls to `_get_sp_group_from_device_mesh` with direct access to `sequence_parallel._sp_group` in sequence parallel attention tests. This simplifies the test setup by using the already initialized group stored in the module, improving code clarity and reducing redundancy.

* feat(tests): improve kernel availability check in test_function_kernel

Add the required imports and a try-except block that verifies the 'kernels-test/flattened-build' kernel loads in the current environment before the test proceeds. This prevents failures caused by environment-specific loading issues and gives a more informative skip message.
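
A minimal sketch of the load-then-skip pattern described here, not the actual test code. The helper name `load_or_skip` and the `load_fn` callable are hypothetical stand-ins for whatever loader the test uses, and `unittest.SkipTest` is used only to keep the sketch dependency-free:

```python
import unittest


def load_or_skip(load_fn, name):
    """Return load_fn(name), or skip the test if loading fails.

    `load_fn` is a hypothetical loader callable; it is not an API
    taken from this commit.
    """
    try:
        return load_fn(name)
    except Exception as exc:
        # Turn an environment-specific load failure into a skip
        # that names the kernel and the underlying error.
        raise unittest.SkipTest(
            f"kernel {name!r} could not be loaded here: {exc!r}"
        ) from exc
```

A test would call `load_or_skip(...)` before its assertions, so an unloadable kernel is reported as skipped with the reason attached rather than as a failure.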

* wip

* wip

* remove debug info

* feat: add ep/sp FSDP MoE finetuning entry and update script

- Add new entry for ep/sp FSDP MoE finetuning in README table
- Update ep_fsdp_qwen3_moe.py to pass a `ulysses_size` parameter to `DeviceMesh`, enabling sequence parallelism (sp)
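
For reference, the rank layout produced by the script's mesh expression can be sketched in plain Python. The grouping below assumes the usual device-mesh convention (ranks that differ only along a named dimension form that dimension's group). This commit does not show how `ulysses_size` subdivides these ranks, so that step is left out:

```python
dp_size = 2
ep_size = 2

# Row-major layout, equivalent to
# np.arange(dp_size * ep_size).reshape(dp_size, ep_size)
mesh = [[d * ep_size + e for e in range(ep_size)] for d in range(dp_size)]

# Assumed convention: each row holds the ranks of one 'ep' group,
# each column holds the ranks of one 'dp' group.
ep_groups = [list(row) for row in mesh]
dp_groups = [list(col) for col in zip(*mesh)]

print(mesh)       # [[0, 1], [2, 3]]
print(dp_groups)  # [[0, 2], [1, 3]]
```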
diff --git a/README.md b/README.md
@@ -69,6 +69,7 @@ pip install -e .
 | --------------------------------- | --------------- | ------------------------------------------------- |
 | FSDP finetuning                   | transformers    | [Script](cookbook/transformers/fsdp2.py)             |
 | FSDP MoE finetuning               | transformers    | [Script](cookbook/transformers/fsdp2_moe.py)         |
+| ep/sp FSDP MoE finetuning         | transformers    | [Script](cookbook/transformers/ep_fsdp_qwen3_moe.py) |
 | EP MoE finetuning                 | transformers    | [Script](cookbook/transformers/ep_fsdp_qwen3_moe.py) |
 | pp/tp/cp finetuning               | megatron        | [Script](cookbook/megatron/tp.py)                    |
 | pp/tp/cp MoE finetuning           | megatron        | [Script](cookbook/megatron/tp_moe.py)                |
diff --git a/cookbook/transformers/ep_fsdp_qwen3_moe.py b/cookbook/transformers/ep_fsdp_qwen3_moe.py
@@ -21,11 +21,13 @@
 # 4 gpus, dp=2, ep=2
 dp_size = 2
 ep_size = 2
+ulysses_size = 2
 
 device_mesh = DeviceMesh(
     device_type=Platform.get_platform().device_prefix(),
     mesh=np.arange(dp_size * ep_size).reshape(dp_size, ep_size),
     mesh_dim_names=('dp', 'ep'),
+    ulysses_size=ulysses_size,  # enable sequence parallelism (sp)
 )
 
 twinkle.initialize(