ChengShiest/Vision-Function-Layer

Vision-Function-Layer in Multimodal LLMs

⚠️ The huggingface package version should be exactly as 4.50.0 or you should modify the vision token swapping code based on your own version.

Vision Token Dropping

This repository contains the implementation of Vision Token Dropping.
For a detailed explanation and the code, please refer to the Vision-Token-Dropping folder.


🚀 Experiments

All experiments are conducted under the VFL-LoRA setup.
Please check out our VFL-LoRA for the base code and environment setup.


✅ TODO List

  • Training data for VFL-LoRA
  • [✅] Open-Source Code
  • [✅] Publish arXiv Paper

Citation

If you find this work useful, please cite our paper:

@article{shi2025vision,
  title={Vision Function Layer in Multimodal LLMs},
  author={Shi, Cheng and Yu, Yizhou and Yang, Sibei},
  journal={arXiv preprint arXiv:2509.24791},
  year={2025}
}

About

[NeurIPS 2025] The official PyTorch implementation of the "Vision Function Layer in MLLM".
