MiSS Logo

MiSS: Revisiting the Trade-off in LoRA with an Efficient Shard-Sharing Structure (ICLR'26)


A lightweight Parameter‑Efficient Fine‑Tuning (PEFT) technique that introduces Matrix Shard Sharing (MiSS) to balance adaptability and efficiency in large language models.


📌 Table of Contents

  1. 🚀 News
  2. 🔧 Installation
  3. ⚡ Quick Start
  4. 📊 Benchmarks & Results
  5. 📚 Citation

Note: MiSS is supported by Hugging Face PEFT and is actively being improved.

🚀 News

🎯 ICLR 2026 paper accepted on 2026‑01‑26!

Previous updates
  • 2025‑06‑13: Accepted at ES‑Fomo III workshop @ ICML 2025
  • 2025‑05‑16: Released MiSS paper version
  • 2024‑12‑31: Released DiSHA paper version
  • 2024‑11‑05: Integrated into Hugging Face PEFT repo
  • 2024‑09‑19: ArXiv release (Bone)
  • 2024‑08‑07: First proposed the Bone method

🔧 Installation

MiSS will eventually ship in a PEFT release (`pip install peft`). For now, install PEFT from source:

```shell
# clone and install PEFT (editable mode)
git clone https://github.com/huggingface/peft.git
cd peft
pip install -e .

# grab this repository
git clone https://github.com/JL-er/MiSS.git
cd MiSS
sh scripts/run_miss.sh
```

⚡ Quick Start

```python
import torch
from transformers import AutoModelForCausalLM
from peft import MissConfig, TaskType, get_peft_model

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct", device_map=device
)

peft_config = MissConfig(r=16, task_type=TaskType.CAUSAL_LM)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# trainable: 3,686,400 / 3,089,625,088 (0.12%)

# training follows, using Trainer or a custom loop
model.save_pretrained("qwen2.5-3b-miss")
```
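As a quick sanity check, the trainable fraction reported by `print_trainable_parameters()` can be reproduced from the two counts it prints:

```python
# Parameter counts as printed by print_trainable_parameters() above.
trainable, total = 3_686_400, 3_089_625_088
print(f"trainable fraction: {100 * trainable / total:.2f}%")  # -> 0.12%
```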



📊 Benchmarks & Results

PEFT comparison

MiSS outperforms common LoRA variants while reducing memory and compute. See the paper for detailed numbers.

🔍 Block Affine Transformation (Bat)

Our experiments revealed that Bone's shard updates are collinear, limiting expressiveness. Bat uses pre-trained weights as nonlinear projectors to break this collinearity without extra parameters:

  1. Tensor factorization of $\mathbf{W}_0$ and $\mathbf{D}$.
  2. Affine transformation via tensor contraction.
  3. Reconstruction of the full update matrix.

Different reshaping strategies (Bat‑Row, Bat‑Col) offer flexible dimensional control. The full derivation is in the paper.
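The three steps above can be sketched numerically. The following is a minimal NumPy illustration only, not the reference implementation: the shard shapes, the row-wise reshaping, and the `bat_update` name are assumptions made for exposition.

```python
import numpy as np

def bat_update(W0: np.ndarray, D: np.ndarray, r: int) -> np.ndarray:
    """Illustrative Bat-style update (assumed shapes, not the paper's exact form).

    W0: frozen pretrained weight of shape (o, i), with o and i divisible by r.
    D:  shared trainable shard of shape (r, r).
    """
    o, i = W0.shape
    # 1. Tensor factorization: view W0 as an (o//r, i//r) grid of (r, r) shards.
    W_shards = W0.reshape(o // r, r, i // r, r).transpose(0, 2, 1, 3)
    # 2. Affine transformation via tensor contraction: each frozen shard projects
    #    the shared shard D, with D itself acting as the affine bias term.
    delta_shards = W_shards @ D + D
    # 3. Reconstruction: stitch the transformed shards back into a full matrix.
    return delta_shards.transpose(0, 2, 1, 3).reshape(o, i)

W0 = np.random.randn(8, 8)
delta = bat_update(W0, np.zeros((2, 2)), r=2)
print(delta.shape)  # (8, 8); a zero shard yields a zero update at init
```

Because the frozen shards of `W0` differ across the grid, each one maps the single shared shard `D` to a different block of the update, which is what breaks the collinearity without adding trainable parameters.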


📚 Citation

```bibtex
@misc{kang2025missrevisitingtradeofflora,
  title={MiSS: Revisiting the Trade-off in LoRA with an Efficient Shard-Sharing Structure},
  author={Jiale Kang and Qingyu Yin},
  year={2025},
  eprint={2409.15371},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2409.15371},
}
```

Thanks for checking out MiSS! Contributions and issues are welcome.

About

MiSS is a novel PEFT method that features a low-rank structure but introduces a new update mechanism distinct from LoRA, achieving an excellent balance between performance and efficiency.
