We have supported this feature for legacy models and Qwen2 since [PR #92](https://github.com/alibaba/ChatLearn/pull/92), but models in mcore format might break.