Description
I am currently using verl versions 0.6.1 and 0.7.0. My goal is to integrate the MiMo-V2-Flash model for RL training.
Questions
Should I use the mbridge (Megatron-Core integration) scheme for this model?
Do you have any specific suggestions or best practices for integrating a model with this architecture?
Context
verl Version: 0.6.1 / 0.7.0
Target Model: MiMo-V2-Flash
Current Plan: Exploring the mbridge / Megatron-Core path.
Any advice would be appreciated. Thanks!
@ISEEKYAN