Code for the paper: Prompt-OT: An Optimal Transport Regularization Paradigm for Knowledge Preservation in Vision-Language Model Adaptation

Abstract. Vision-language models (VLMs) such as CLIP demonstrate strong performance but struggle when adapted to downstream tasks. Prompt learning has emerged as an efficient and effective strategy to adapt VLMs while preserving their pre-trained knowledge. However, existing methods still lead to overfitting and degrade zero-shot generalization. To address this challenge, we propose an optimal transport (OT)-guided prompt learning framework that mitigates forgetting by preserving the structural consistency of feature distributions between pre-trained and fine-tuned models. Unlike conventional point-wise constraints, OT naturally captures cross-instance relationships and expands the feasible parameter space for prompt tuning, allowing a better trade-off between adaptation and generalization. Our approach enforces joint constraints on both vision and text representations, ensuring a holistic feature alignment. Extensive experiments on benchmark datasets demonstrate that our simple yet effective method can outperform existing prompt learning strategies in base-to-novel generalization, cross-dataset evaluation, and domain generalization without additional augmentation or ensemble techniques.
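
To make the contrast between point-wise and OT constraints concrete, here is a minimal toy sketch in NumPy. It is not the loss used in this repository or in the paper: the Sinkhorn solver, the cosine cost, the `reg` value, and all names (`sinkhorn_plan`, `cosine_cost`, `pretrained`, `finetuned`) are assumptions chosen for illustration. The "fine-tuned" features are simply the "pre-trained" ones in a different order, so the two sets have the same distribution even though no row matches its counterpart.

```python
import numpy as np


def cosine_cost(x, y):
    """Pairwise cosine distance between the rows of x and the rows of y."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T


def sinkhorn_plan(cost, reg=0.05, n_iters=200):
    """Entropic-regularized OT plan between two uniform empirical distributions."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # uniform weights on the first feature set
    b = np.full(m, 1.0 / m)      # uniform weights on the second feature set
    K = np.exp(-cost / reg)      # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):     # alternate scaling of rows and columns
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]


# Toy data (hypothetical): 8 feature vectors of dimension 16.
rng = np.random.default_rng(0)
pretrained = rng.normal(size=(8, 16))
finetuned = np.roll(pretrained, 1, axis=0)   # same vectors, cyclically shifted

cost = cosine_cost(finetuned, pretrained)
plan = sinkhorn_plan(cost)

# Point-wise constraint: compares the i-th fine-tuned feature with the i-th pre-trained one.
pointwise_cost = float(np.mean(np.diag(cost)))
# OT constraint: compares the two feature sets through the transport plan.
ot_cost = float((plan * cost).sum())

print(f"point-wise cost: {pointwise_cost:.4f}")
print(f"OT cost:         {ot_cost:.6f}")
```

In this toy setup the point-wise cost is large because every row is compared with a different vector, while the OT cost is close to zero because the transport plan is free to match each fine-tuned feature to its identical pre-trained counterpart. This is one way to read the claim that OT accepts a larger set of solutions than a point-wise penalty does.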

Configuration

Our code is built on PromptSRC. Thanks to its authors for releasing their code!

Environment Setup

Please refer to Install.md

Data Preparation

Please refer to DATASETS.md

Note that our code modifies the data directory for ImageNet, so check that path before running.

Training and Evaluation

#!/bin/bash
# `dataset` must be set to the target dataset name before running; abort if it is missing.
: "${dataset:?set dataset to the target dataset name}"

DIV=10.0  # lambda in Eq. 16
MYCFG=vit_b16_c2_ep20_batch4_4+4ctx-Copy3

# The second argument takes the values 1, 2, 3 in turn (one training/evaluation pair each).
for RUN in 1 2 3; do
    # training
    bash scripts/promptot/base2new_train.sh "$dataset" "$RUN" "$DIV" "$MYCFG"
    # evaluation
    bash scripts/promptot/reproduce_base2novel_setting.sh "$dataset" "$RUN" \
        "output${DIV}/base2new/train_base/$dataset/shots_16/PromptOT/${MYCFG}" "$MYCFG" "$DIV"
done

We save the weights after every epoch for further analysis.

Log Processing

python extract_log_3.py --save_patch output${DIV} --output_path output_csv

Run All Datasets

sbatch my_script/run_all

Citation

If you find our work useful in your research, please consider giving the repository a star ⭐ and citing:

@article{chen2025prompt,
  title={Prompt-OT: An Optimal Transport Regularization Paradigm for Knowledge Preservation in Vision-Language Model Adaptation},
  author={Chen, Xiwen and Zhu, Wenhui and Qiu, Peijie and Wang, Hao and Li, Huayu and Wu, Haiyu and Sotiras, Aristeidis and Wang, Yalin and Razi, Abolfazl},
  journal={arXiv preprint arXiv:2503.08906},
  year={2025}
}

Repository: ChongQingNoSubway/Prompt-OT

Folders and files

Latest commit

History

Repository files navigation

Code for the paper: Prompt-OT: An Optimal Transport Regularization Paradigm for Knowledge Preservation in Vision-Language Model Adaptation

Configuration

Environment Setup

Data Prepartion

Training and Evaluation

Log process

Run all dataset

Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages