hsjang0/CORE-PO

Self-Training Large Language Models with Confident Reasoning

This is the official code base for reproducing the self-training experiments in "Self-Training Large Language Models with Confident Reasoning". The training pipeline (1) samples math/science questions, (2) generates multiple answers with a LoRA-tuned Llama 3.1, (3) scores them with an internal judge, and (4) runs Direct Preference Optimization (DPO) on the scored answers.
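As a rough illustration of steps (3) and (4), the sketch below turns judge-scored answers into a preference pair and evaluates the standard DPO loss for that pair. The names `Candidate`, `build_preference_pair`, and `dpo_loss` are hypothetical, and so are the "most confident vs. least confident" selection rule and the `beta=0.1` default; the repository's `judge.py` and `trainer.py` may do this differently.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str          # generated reasoning plus final answer
    confidence: float  # judge score; higher means more confident


def build_preference_pair(candidates):
    """Return (chosen, rejected) as the most and least confident candidates.

    Assumption: this selection rule is illustrative only.
    Returns None when fewer than two candidates exist or all scores tie,
    since no preference can be formed in either case.
    """
    if len(candidates) < 2:
        return None
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    chosen, rejected = ranked[0], ranked[-1]
    if chosen.confidence == rejected.confidence:
        return None
    return chosen, rejected


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    Inputs are sequence log-probabilities under the trained policy and
    the frozen reference model. beta=0.1 is an assumed default.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # Numerically stable softplus(-z), which equals -log(sigmoid(z)).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

When the policy and reference agree on both answers the margin is zero and the loss equals log 2; the loss falls below that as the policy favors the chosen answer more strongly than the reference does.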

  • main.py
  • core_po/
    • arguments.py
    • __init__.py
    • data.py – loaders for GSM8K, ARC, MATH, and GPQA with rank-aware sharding.
    • generation.py – prompt templates and sampling.
    • judge.py – confidence scorer that evaluates reasoning and final answers.
    • models.py – loader for LoRA-adapted Llama 3.1 weights (8-bit or bf16).
    • trainer.py – CORE-PO DPO training loop.
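The "rank-aware sharding" mentioned for data.py can be pictured as each process taking a disjoint slice of the question list. The following is a minimal sketch of one common way to do this (strided slicing). `shard_for_rank` is a hypothetical helper, not a function confirmed to exist in the repository, and the actual loader may shard differently.

```python
def shard_for_rank(examples, rank, world_size):
    """Return the strided slice of `examples` owned by process `rank`.

    Process r receives items r, r + world_size, r + 2 * world_size, ...
    so the shards are disjoint and together cover every example.
    """
    if world_size < 1 or not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")
    return examples[rank::world_size]


# Usage: ten placeholder question ids split across four processes.
questions = list(range(10))
shards = [shard_for_rank(questions, r, 4) for r in range(4)]
```

Strided slicing keeps shard sizes within one item of each other without needing to know the dataset length in advance.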

Quick Start for Training

Training requires four NVIDIA A100 GPUs, matching the --num_processes 4 setting in the command below.

accelerate launch --num_processes 4 main.py \
  --save_directory ./dpo_saved \
  --save_name ours_run \
  --learning_rate 5e-6 \
  --batch_size 4
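For orientation, the four flags in the command above could be declared with `argparse` roughly as follows. This is a sketch only: the flag names come from the command, but `build_parser`, the types, the help strings, and the defaults (copied from the example values) are assumptions, and the real arguments.py may define more options or use another parsing mechanism.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the launch command."""
    parser = argparse.ArgumentParser(description="CORE-PO training (sketch)")
    # Defaults below reuse the example values; they are assumptions.
    parser.add_argument("--save_directory", type=str, default="./dpo_saved",
                        help="directory where checkpoints are written")
    parser.add_argument("--save_name", type=str, default="ours_run",
                        help="name of this run")
    parser.add_argument("--learning_rate", type=float, default=5e-6,
                        help="optimizer learning rate")
    parser.add_argument("--batch_size", type=int, default=4,
                        help="batch size")
    return parser


# Usage: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--learning_rate", "5e-6", "--batch_size", "4"])
```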

BibTeX

@inproceedings{jang-etal-2025-self,
  title     = {Self-Training Large Language Models with Confident Reasoning},
  author    = {Jang, Hyosoon and Jang, Yunhui and Lee, Sungjae and Ok, Jungseul and Ahn, Sungsoo},
  editor    = {Christodoulopoulos, Christos and Chakraborty, Tanmoy and Rose, Carolyn and Peng, Violet},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2025},
  month     = {nov},
  year      = {2025},
  address   = {Suzhou, China},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2025.findings-emnlp.806/},
  doi       = {10.18653/v1/2025.findings-emnlp.806},
  pages     = {14925--14939},
  isbn      = {979-8-89176-335-7}
}
