Figure: The AR-MAP Framework. Transferring alignment from AR Teachers to Diffusion Students.
AR-MAP (Autoregressive Model Alignment for Diffusion) is a novel transfer learning framework that leverages preference-aligned Autoregressive LLMs (AR-LLMs) as implicit teachers for Diffusion LLMs (DLLMs). This repository contains the complete implementation, including:
- Multi-aspect DPO training for helpfulness, truthfulness, and mathematical reasoning
- Comprehensive evaluation suite across multiple benchmarks
- Model merging utilities for LoRA adapters
- Support for multiple model architectures (Qwen, Dream, SDAR)
- Multi-Aspect Optimization: Train models on multiple preference dimensions simultaneously
  - Helpfulness alignment
  - Truthfulness enhancement
  - Mathematical reasoning improvement
- Flexible Training Pipeline:
  - DPO (Direct Preference Optimization) training
  - LoRA fine-tuning support
  - Multi-GPU distributed training
- Comprehensive Evaluation:
  - AlpacaEval for helpfulness
  - TruthfulQA for truthfulness
  - Arena-Hard for general capabilities
  - Automated GPT-4 based evaluation
- Model Support:
  - Qwen 2.5 series
  - Dream diffusion models
  - SDAR models
  - Easy extension to other architectures
conda create --name armap python=3.10
conda activate armap
pip install torch==2.6.0
pip install --no-cache-dir \
https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/\
flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
pip install -r requirements.txt

Use LlamaFactory for DPO training:

cd LlamaFactory-main
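For reference, a single-aspect DPO run might look like the following. This is a sketch, not the repository's exact command: the argument names follow LlamaFactory's documented DPO examples and may differ across versions, and the `dpo_helpful` dataset name is assumed to be registered in `LlamaFactory-main/data/dataset_info.json`.

```bash
llamafactory-cli train \
    --stage dpo \
    --do_train true \
    --model_name_or_path your_path/base_model \
    --dataset dpo_helpful \
    --template qwen \
    --finetuning_type lora \
    --lora_target all \
    --pref_beta 0.1 \
    --output_dir your_path/lora_adapter \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --learning_rate 5e-6 \
    --num_train_epochs 1 \
    --bf16 true
```

Swapping `--dataset` to `dpo_math` or `dpo_truthful` targets the other aspects; LlamaFactory also accepts a comma-separated dataset list for joint multi-aspect training.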
After training, merge LoRA weights back to the base model:
# For Qwen models
python merge-lora-ar.py \
--base_model your_path/base_model \
--lora_adapter your_path/lora_adapter \
--output your_path/merged_model \
--weight 1.0
# For Dream models
python merge-lora-dream.py \
--base_model your_path/Dream-base \
--lora_adapter your_path/lora_adapter \
--output your_path/merged_model \
--weight 6.0
# For SDAR models
python merge-lora-sdar.py \
--base_model your_path/SDAR-base \
--lora_adapter your_path/lora_adapter \
--output your_path/merged_model \
--weight 6.0

Evaluate your models on various benchmarks:
# Helpfulness evaluation (AlpacaEval)
cd eval-qwen
bash eval_helpful.sh
# Truthfulness evaluation
cd eval-qwen
python help_eval.py --model_name_or_path your_path/model
# Arena-Hard evaluation
cd eval-qwen
bash eval_arena.sh
AR-MAP/
├── merge-lora-ar.py        # LoRA merging for Qwen models
├── merge-lora-dream.py     # LoRA merging for Dream models
├── merge-lora-sdar.py      # LoRA merging for SDAR models
├── eval-qwen/              # Evaluation scripts for Qwen
│   ├── help_eval.py        # Helpfulness evaluation
│   ├── arena_qwen3.py      # Arena-Hard evaluation
│   └── eval_*.sh           # Evaluation bash scripts
├── eval-dream/             # Evaluation scripts for Dream
│   ├── dream-helpful.py    # Helpfulness evaluation
│   ├── dream-truthful.py   # Truthfulness evaluation
│   └── dream/              # Dream model implementation
├── eval-sdar/              # Evaluation scripts for SDAR
│   ├── help_eval_sdar.py   # Helpfulness evaluation
│   ├── sdar_truthful.py    # Truthfulness evaluation
│   ├── ifeval_eval_sdar.py # IFEval benchmark
│   └── jetengine_ext/      # Optimized inference engine
├── eval-dataset/           # Evaluation datasets
│   ├── alpaca-*.jsonl      # AlpacaEval datasets
│   ├── arena-*.jsonl       # Arena-Hard datasets
│   └── TruthfulQA.csv      # TruthfulQA dataset
├── train-dataset/          # Training datasets
│   ├── dpo_helpful.json    # Helpfulness preference data
│   ├── dpo_math.json       # Math preference data
│   └── dpo_truthful.json   # Truthfulness preference data
├── LlamaFactory-main/      # Training framework
└── requirements.txt        # Python dependencies
Update the following paths in the scripts to match your setup:
# In merge-lora-*.py
BASE_MODEL_PATH = "your_path/base_model"
LORA_PATH = "your_path/lora_adapter"
OUTPUT_PATH = "your_path/merged_model"
# In eval scripts
model_name_or_path = "your_path/model"
dataset_path = "your_path/dataset"

For GPT-4 based evaluation, configure your API endpoint:
# In evaluation scripts
endpoint = "your_api_endpoint"
api_key = "your_api_key" # Keep this secure!
deployment_name = "your_deployment"
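As a rough sketch of how these three settings are typically wired together (assuming an Azure OpenAI-style deployment and the `openai>=1.0` client; the repository's evaluation scripts may structure this differently):

```python
from openai import AzureOpenAI

# Values from the configuration above; the api_version string is an assumption.
client = AzureOpenAI(
    azure_endpoint=endpoint,
    api_key=api_key,
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model=deployment_name,  # the GPT-4 deployment acting as judge
    messages=[
        {"role": "system", "content": "You are an impartial judge comparing two answers."},
        {"role": "user", "content": "Question: ...\nAnswer A: ...\nAnswer B: ...\nWhich answer is better?"},
    ],
    temperature=0.0,
)
print(response.choices[0].message.content)
```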
Our framework evaluates models across multiple dimensions:

- Helpfulness: Measured via AlpacaEval with GPT-4 as judge
- Truthfulness: Evaluated on the TruthfulQA benchmark
- Mathematical Reasoning: Tested on the MATH and GSM8K datasets; note that we use the TraceRL framework for this evaluation
- General Capabilities: Arena-Hard benchmark
- Instruction Following: IFEval benchmark
The training datasets are organized by aspect:
- dpo_helpful.json: Preference pairs for helpfulness
- dpo_math.json: Preference pairs for mathematical reasoning
- dpo_truthful.json: Preference pairs for truthfulness
Each dataset contains pairs of (chosen, rejected) responses for DPO training.
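For illustration, a single entry might look roughly like this (a hypothetical sample; the exact field names depend on how the files are registered in LlamaFactory's `dataset_info.json`):

```json
{
  "instruction": "Explain why regular exercise tends to improve sleep quality.",
  "input": "",
  "chosen": "The preferred response: accurate, well-structured, and directly helpful ...",
  "rejected": "The dispreferred response: vaguer, less accurate, or less helpful ..."
}
```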
- Qwen series: Standard autoregressive models
- Dream: Diffusion-based language models with block attention
- SDAR: Semi-autoregressive diffusion models
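Dream and SDAR ship custom modeling code, so they are typically loaded with `trust_remote_code=True`; the snippet below is only a loading sketch (the generation API follows each model's own documentation):

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_path = "your_path/Dream-base"  # placeholder path
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, trust_remote_code=True
).eval()
```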
Different models require different merging coefficients:
- Qwen: Standard merging (weight=1.0)
- Dream/SDAR: Higher coefficients (weight=3.0) for better performance
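The coefficient is applied when folding the LoRA update into the base weights, i.e. W ← W + weight · ΔW. Below is a minimal sketch of such a weighted merge using peft's standard LoRA layer layout; the actual `merge-lora-*.py` scripts may implement this differently:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "your_path/base_model"      # placeholder paths
adapter_path = "your_path/lora_adapter"
output_path = "your_path/merged_model"
weight = 6.0                            # merging coefficient (1.0 for Qwen, larger for Dream/SDAR)

model = AutoModelForCausalLM.from_pretrained(
    base_path, torch_dtype=torch.bfloat16, trust_remote_code=True
)
model = PeftModel.from_pretrained(model, adapter_path)

# Scaling every LoRA B matrix by `weight` scales the merged delta (alpha/r) * B @ A by the same factor.
for module in model.modules():
    if hasattr(module, "lora_B"):
        for adapter_name in module.lora_B:
            module.lora_B[adapter_name].weight.data *= weight

merged = model.merge_and_unload()       # fold the (scaled) LoRA weights into the base model
merged.save_pretrained(output_path)
AutoTokenizer.from_pretrained(base_path, trust_remote_code=True).save_pretrained(output_path)
```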
Please refer to our paper (ARMAP_ARXIV.pdf) for detailed experimental results and analysis.
This work builds upon several excellent open-source projects:
- LlamaFactory for training infrastructure
- Dream for diffusion language models
- SDAR for semi-autoregressive models
- TraceRL for evaluation framework
If you find this work useful, please cite our paper.
This project is released under the MIT License. See LICENSE file for details.
For questions or issues, please open an issue on GitHub or contact the authors.