wutaiqiang/MI


Official code for the paper "Revisiting Model Interpolation for Efficient Reasoning"

[📜 Paper][🐱 GitHub]

Environment

Please follow the official OpenCompass guide to set up a Python environment.

We use the lmdeploy backend, so remember to install it:

pip install "opencompass[lmdeploy]"

Weights

Download the official weights from Hugging Face.

We recommend downloading them with the Hugging Face CLI (hf), for example:

hf download Qwen/Qwen3-30B-A3B-Thinking-2507 --token $your_hf_token --local-dir weights/Qwen3-30B-A3B/Qwen3-30B-A3B-Thinking-2507

hf download Qwen/Qwen3-30B-A3B-Instruct-2507 --token $your_hf_token --local-dir weights/Qwen3-30B-A3B/Qwen3-30B-A3B-Instruct-2507

Then run mi.py:

python mi.py --model_b /path/to/your/projects/base_model --model_i /path/to/your/projects/finetuned_model --lambda_val 0.5 --output_dir /path/to/your/projects/merged_output

where lambda_val is the interpolation factor.
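
The interpolation step can be pictured as a per-parameter weighted average of two checkpoints that share the same parameter names. The sketch below is not taken from mi.py: the function name interpolate_state_dicts is hypothetical, and which model receives the weight lambda_val (versus 1 - lambda_val) is an assumption, so check mi.py for the convention actually used.

```python
from typing import Any, Dict


def interpolate_state_dicts(
    state_a: Dict[str, Any],
    state_b: Dict[str, Any],
    lambda_val: float,
) -> Dict[str, Any]:
    """Return lambda_val * state_a + (1 - lambda_val) * state_b, per parameter.

    Assumption: state_a is the model weighted by lambda_val. The values can be
    floats, NumPy arrays, or torch tensors, since only * and + are used.
    """
    if not 0.0 <= lambda_val <= 1.0:
        raise ValueError("lambda_val is expected to lie in [0, 1]")
    if state_a.keys() != state_b.keys():
        raise KeyError("both checkpoints must have the same parameter names")
    return {
        name: lambda_val * state_a[name] + (1.0 - lambda_val) * state_b[name]
        for name in state_a
    }


if __name__ == "__main__":
    # Toy "checkpoints" with scalar parameters, for illustration only.
    merged = interpolate_state_dicts({"w": 2.0}, {"w": 4.0}, lambda_val=0.5)
    print(merged)
```

Under this convention, lambda_val = 1 recovers the first model and lambda_val = 0 recovers the second; values in between blend the two.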

Evaluation

We use OpenCompass for evaluation.

You need to modify the config files first.

For example, in evaluation/qwen3_AIME.py, replace the paths with your own folders and adjust the GPU settings to fit your machine.
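
As a rough guide to which fields usually need editing, here is a sketch of a model entry in the style of OpenCompass's lmdeploy (TurboMind) model configs. It is not a copy of evaluation/qwen3_AIME.py: the abbr, the path, and every numeric value are placeholders chosen for illustration, and the actual file may be organized differently.

```python
from opencompass.models import TurboMindModelwithChatTemplate

# Hypothetical model entry; all values below are illustrative placeholders.
models = [
    dict(
        type=TurboMindModelwithChatTemplate,
        abbr='merged-model-turbomind',              # label used in result tables
        path='/path/to/your/projects/merged_output',  # replace with your merged weights
        engine_config=dict(
            session_len=32768,
            max_batch_size=16,
            tp=2,                                   # tensor-parallel degree
        ),
        gen_config=dict(temperature=0.6, top_p=0.95, max_new_tokens=30000),
        max_seq_len=32768,
        max_out_len=30000,
        batch_size=16,
        run_cfg=dict(num_gpus=2),                   # GPUs per task; typically equal to tp
    )
]
```

In configs of this shape, path and run_cfg (together with tp) are the fields that correspond to "the paths" and "the GPUs" mentioned above.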

Then run opencompass evaluation/qwen3_AIME.py and wait for the final results.

Warning: In this repo, we benchmark the Instruct-2507 and Thinking-2507 versions, which do not require setting 'enable_thinking'. If you want to evaluate a Qwen3 hybrid thinking model, such as Qwen3-4B, please fix the bugs in OpenCompass and pass an extra enable_thinking argument, following this repo.

License

We use the Apache‑2.0 license. Please also comply with the licenses of any upstream models and datasets.

☕️ Citation

If you find this repository helpful, please consider citing our paper:

@article{wu2025revisiting,
  title={Revisiting Model Interpolation for Efficient Reasoning},
  author={Wu, Taiqiang and Yang, Runming and Liu, Tao and Wang, Jiahao and Wong, Ngai},
  journal={arXiv preprint arXiv:2510.10977},
  year={2025}
}

For any questions, please open an issue or email takiwu@connect.hku.hk
