
RoSA: Enhancing Parameter-Efficient Fine-Tuning via RoPE-aware Selective Adaptation in Large Language Models

This repository contains the code for training and evaluating the RoSA model. It includes environment setup instructions, training and inference scripts, key configuration parameters, and dataset usage.

1. Environment Setup

To reproduce our results, we recommend setting up the environment using Conda.

Option 1: Using rosa_environment.yml (Recommended)

You can restore the full Conda environment in one step:

conda env create -f rosa_environment.yml
conda activate rosa

This YAML file includes exact versions of Python, PyTorch, DeepSpeed, PEFT, Transformers, and other necessary dependencies.

Option 2: Manual Setup via requirements.txt

Alternatively, you can create the environment manually:

conda create -n rosa python=3.10
conda activate rosa
pip install -r requirements.txt

The requirements.txt file lists the core Python packages required for RoSA training and evaluation.

Core Library Versions

This project has been tested with the following key dependencies:

  • torch==2.1.2
  • transformers==4.47.1
  • peft==0.16.0
  • deepspeed==0.12.6
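If you prefer to pin these versions in an existing environment rather than using the provided environment files, they can be installed directly; note that the torch wheel you need may differ depending on your CUDA version:

```shell
# Install the tested versions directly (alternative to rosa_environment.yml
# or requirements.txt); pick the torch build matching your CUDA toolkit
pip install torch==2.1.2 transformers==4.47.1 peft==0.16.0 deepspeed==0.12.6
```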

CUDA and DeepSpeed Compatibility

DeepSpeed compiles CUDA extensions at runtime. Please ensure:

  • Your local CUDA Toolkit version matches the version used to compile your installed torch package.
  • The required CUDA runtime libraries and compiler are installed and available in your environment.
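As a quick sanity check (assuming torch is already installed in the active environment), you can compare the CUDA version torch was compiled against with the toolkit DeepSpeed will use:

```shell
# torch.version.cuda is the CUDA version torch was built with;
# nvcc reports the local toolkit that DeepSpeed's JIT compiler will use.
python -c "import torch; print(torch.version.cuda)"
nvcc --version | grep release
# DeepSpeed also ships its own environment report:
# ds_report
```

If the two versions disagree, DeepSpeed's extension builds will typically fail at runtime.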

2. Training and Inference

Training and evaluation scripts are provided in the script/ directory:

  • run_rosa.sh: Script for model training
  • run_eval.sh: Script for model evaluation

Each script contains inline comments to help you modify key training arguments such as the learning rate and batch size.

Example usage:

bash script/run_rosa.sh
bash script/run_eval.sh

3. Key Model Parameters

Below are the custom arguments introduced by RoSA that control model behavior:

LOW_RATIO="0.25"         # Ratio of low-frequency attention dimensions
DYNA_RATIO="0.5"         # Proportion of layers to train
USE_LORA_GATE="yes"      # Use LoRA-style gated linear projections
LORA_DIM="128"           # Dimension of LoRA-based Linear Layer
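A minimal sketch of adjusting these settings: edit the corresponding assignments near the top of script/run_rosa.sh, then relaunch training. The exact values below are illustrative, not recommendations; check the script's inline comments for the mechanism it actually uses.

```shell
# Hypothetical override: a lighter configuration for a quick trial run
LOW_RATIO="0.25"      # keep a quarter of attention dims as low-frequency
DYNA_RATIO="0.5"      # train half of the layers
USE_LORA_GATE="yes"   # enable LoRA-style gated projections
LORA_DIM="64"         # smaller LoRA dimension trades capacity for memory

# then relaunch training:
# bash script/run_rosa.sh
```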

4. Data

Training and test data are placed in the following directories:

data/
├── train/   # Training samples
└── test/    # Testing samples
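Before launching training, you can verify this layout is in place with a small check like the following:

```shell
# Sanity-check the expected data layout; prints a line for each missing directory
for d in data/train data/test; do
  [ -d "$d" ] || echo "missing directory: $d"
done
```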