Hi, we would like to share a GRPO RL training recipe based on the CosyVoice2 LLM.
Recipe: https://github.com/nvidia-china-sae/mair-hub/tree/main/rl-tutorial/cosyvoice_llm
Here are the initial training results:
| Model | Seed-TTS test_zh CER | CosyVoice3 zero_shot_zh | Comment |
|---|---|---|---|
| Official CosyVoice2 LLM | 1.45% | 4.08% | See the paper |
| + GRPO | 1.37% | 3.36% | |
| SFT (initialized from Qwen2-0.5B-Instruct) | 1.81% | 4.83% | See PR #1887 |
| + GRPO | 1.06% | 4.03% | |
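
For readers unfamiliar with the metric, here is a minimal sketch of how a character error rate (CER) like the one reported above is usually defined: the character-level edit distance between the ASR transcript of the synthesized audio and the reference text, divided by the reference length. The function names `edit_distance` and `cer` are illustrative and are not taken from the recipe. The actual evaluation scripts may normalize text (punctuation, whitespace, numerals) before scoring, which this sketch does not do.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between ref and hyp."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one character dropped from a 6-character reference.
    print(f"{cer('今天天气很好', '今天天气好'):.2%}")
```

Corpus-level CER is normally computed by summing edit distances and reference lengths over all utterances before dividing, rather than averaging per-utterance rates.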
We will add more experimental results as we continue refining the recipe.