SGDPO-ACL25

Source code for our paper "SGDPO: Self-Guided Direct Preference Optimization for Language Model Alignment": https://arxiv.org/abs/2505.12435