This repository provides the official implementation of Frank-Wolfe Merging (FW-Merging), a method that frames large-scale model merging as a constrained optimization problem. Fine-tuned checkpoints define the constraint set, while the objective dictates the desired properties of the merged model.
Inspired by Frank-Wolfe optimization, FW-Merging proceeds in three key stages:
- Relevance Evaluation – Uses gradient-based linear approximation to identify the most beneficial merging direction.
- Model Selection – Selects checkpoints that minimize interference while preserving task-specific knowledge.
- Knowledge Integration – Merges the selected checkpoint using an orthogonal method, balancing adaptation and stability.
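For intuition, here is a minimal PyTorch sketch of these three stages. The function name `fw_merge_sketch` and its signature (a `checkpoints` pool sharing the base architecture, an `objective_loss` callable) are illustrative assumptions, not the repository's actual API:

```python
import copy
import torch

def fw_merge_sketch(base_model, checkpoints, objective_loss, num_iters=10):
    """Illustrative Frank-Wolfe merging loop (hypothetical API, not the repo's)."""
    merged = copy.deepcopy(base_model)
    for t in range(num_iters):
        # 1. Relevance Evaluation: gradient of the objective at the current merge
        #    (assumes objective_loss returns a scalar and every parameter gets a grad)
        merged.zero_grad()
        objective_loss(merged).backward()
        grad = torch.cat([p.grad.flatten() for p in merged.parameters()])

        with torch.no_grad():
            # 2. Model Selection: linear minimization oracle over the checkpoint
            #    pool -- pick the checkpoint whose offset best aligns with -grad
            scores = []
            for ckpt in checkpoints:
                offset = torch.cat([(q - p).flatten()
                                    for q, p in zip(ckpt.parameters(),
                                                    merged.parameters())])
                scores.append(torch.dot(grad, offset))
            best = checkpoints[int(torch.argmin(torch.stack(scores)))]

            # 3. Knowledge Integration: Frank-Wolfe step toward the selected checkpoint
            gamma = 2.0 / (t + 2.0)  # classic diminishing step-size schedule
            for p, q in zip(merged.parameters(), best.parameters()):
                p.mul_(1.0 - gamma).add_(q, alpha=gamma)
    return merged
```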
FW-Merging is designed to satisfy two fundamental scaling properties of model merging: (1) Irrelevance Robustness – Adding irrelevant models to the merging pool should not degrade performance. (2) Relevance Utilization – Adding relevant models should steadily improve performance, converging toward the optimal outcome.
Our experiments show that FW-Merging remains stable when 16 irrelevant models are added, improves by 15.3% as 16 relevant models are added on 20 CV tasks, and maintains constant memory overhead, unlike data-informed methods whose overhead grows linearly with the number of models. It outperforms data-free merging methods by 32.8% and data-driven merging methods by 8.39% when merging 20 ViT models.
If you find FW-Merging useful, please cite:

```bibtex
@misc{chen2025fwmergingscalingmodelmerging,
  title={FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization},
  author={Hao Mark Chen and Shell Xu Hu and Wayne Luk and Timothy Hospedales and Hongxiang Fan},
  year={2025},
  eprint={2503.12649},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2503.12649},
}
```
- Installation
- Merging for Discriminative Models
- Merging for Generative Models
- Merging for Vision Models
- Acknowledgments
## Installation

Create the conda environment:

```bash
conda create -n FW-merging
```

After activating the conda environment, install the requirements (you may need to install `torch==2.0.1` first):

```bash
pip install -r requirements.txt
```

Then install the remaining dependencies from the yml file:

```bash
conda env update -f env.yml
```
## Merging for Discriminative Models

The merged model checkpoint can be found here.

Download the 4 RoBERTa models as specified in the Twin Merging repo:

```bash
huggingface-cli download lu-vae/roberta-glue --local-dir roberta
```

Then source the scripts:

```bash
cd discriminative; source scripts
```

To run model merging with FW and evaluation, run the following command in the terminal:

```bash
run_frank_wolfe
```

The results will be saved under the `outs` directory.
## Merging for Generative Models

The merged model checkpoint can be found here.

Download the 16 LLaMA2-7B models as specified in `generative/llama.json` into the `llama` folder.
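If convenient, the checkpoints could be fetched programmatically. The snippet below is a hypothetical helper that assumes `generative/llama.json` holds Hugging Face repo IDs; check the file for its real schema:

```python
# Hypothetical helper: assumes generative/llama.json is a JSON list of
# Hugging Face repo IDs (or a dict whose values are repo IDs).
import json
from huggingface_hub import snapshot_download

with open("generative/llama.json") as f:
    spec = json.load(f)

repo_ids = spec if isinstance(spec, list) else list(spec.values())
for repo_id in repo_ids:
    # download each model into the llama/ folder
    snapshot_download(repo_id, local_dir=f"llama/{repo_id.split('/')[-1]}")
```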
Then source the scripts:

```bash
cd generative; source scripts
```

To run model merging with FW and evaluation, run the following command in the terminal:

```bash
run_frank_wolfe
```

The merged models will be saved under the `outs` directory.
The models can then be evaluated using third-party packages such as lm-evaluation-harness.
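For example, with EleutherAI's lm-evaluation-harness installed, a merged checkpoint could be evaluated roughly as follows (the checkpoint path and task list are placeholders, not values from this repo):

```bash
lm_eval --model hf \
  --model_args pretrained=outs/<merged-checkpoint> \
  --tasks hellaswag,arc_easy \
  --batch_size 8
```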
## Merging for Vision Models

Please refer to this repo for the implementation of FW Soft and FW Hard on the ViT benchmarks.
## Acknowledgments

The codebase is adapted from Twin Merging. We thank the authors for their wonderful work.