This project provides a simple script to compute alignment metrics for transformer models on various datasets. This is the official code for "PLoP: Precise LoRA Placement for Efficient Finetuning of Large Models" (https://arxiv.org/abs/2506.20629).

Install dependencies:

    pip install -r requirements.txt

Run the main script:

    python main.py --model <huggingface-model-handle> --dataset <math|code|history|logic> --batchsize <BATCHSIZE> --nbsamples <N> --seqlen <SEQ_LEN> --aggregation <type|layer|None> --output_dir <RESULTS_DIR>

Example:

    python main.py --model meta-llama/Llama-3.2-1B-Instruct --dataset math --batchsize 8 --nbsamples 100 --seqlen 256 --aggregation type --output_dir results/

Arguments:

- `--model`: HuggingFace model handle (e.g., `google/gemma-2b`)
- `--dataset`: Dataset name (`math`, `code`, `history`, `logic`)
- `--batchsize`: Batch size (not used in this simple version; all samples are processed at once)
- `--nbsamples`: Number of samples to use from the dataset
- `--seqlen`: Sequence length for tokenization
- `--aggregation`: How to aggregate results (`type`, `layer`, or `None`)
- `--output_dir`: Directory to save results
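
To run the script on all four datasets, the command line can be assembled programmatically. The following is a minimal sketch, not part of this repository: `build_command` is a hypothetical helper, its defaults simply mirror the example above, and writing each dataset to its own subdirectory is an assumption made here to keep results separate.

```python
import shlex

# Dataset names accepted by --dataset, as listed in the arguments above.
DATASETS = ("math", "code", "history", "logic")


def build_command(model, dataset, nbsamples=100, seqlen=256,
                  batchsize=8, aggregation="type", output_dir="results/"):
    """Return the argv list for one main.py run (hypothetical helper)."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        "python", "main.py",
        "--model", model,
        "--dataset", dataset,
        "--batchsize", str(batchsize),
        "--nbsamples", str(nbsamples),
        "--seqlen", str(seqlen),
        "--aggregation", str(aggregation),
        "--output_dir", output_dir,
    ]


if __name__ == "__main__":
    for dataset in DATASETS:
        cmd = build_command(
            "meta-llama/Llama-3.2-1B-Instruct",
            dataset,
            # Assumption: one subdirectory per dataset.
            output_dir=f"results/{dataset}",
        )
        # Print the command; to execute it instead, pass cmd to
        # subprocess.run(cmd, check=True).
        print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting issues when it is later passed to `subprocess.run`.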

Output:

- Raw and aggregated metrics are saved as JSON files in the specified output directory.
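
The saved metrics can be read back with the standard library. This is a minimal sketch that assumes only what is stated above, namely that the output directory contains JSON files; the file names and the structure of each file are not documented here, so the hypothetical `load_results` helper loads every `.json` file it finds and leaves interpretation to the caller.

```python
import json
from pathlib import Path


def load_results(output_dir):
    """Return {file stem: parsed JSON} for every .json file in output_dir."""
    results = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            results[path.stem] = json.load(f)
    return results


if __name__ == "__main__":
    # "results/" matches the --output_dir used in the example command.
    for name, metrics in load_results("results/").items():
        print(name, type(metrics).__name__)
```

If the directory does not exist or holds no JSON files, `load_results` returns an empty dictionary.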