KickItLikeShika/llm-alignment

LLM Alignment

This repository demonstrates practical implementation of language model alignment techniques, including Supervised Fine-Tuning (SFT) and Odds Ratio Preference Optimization (ORPO). The code shows how to align Llama-3.2-1B models to better follow instructions and align with human preferences.
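To make the ORPO idea concrete: it augments the standard SFT negative log-likelihood on the chosen response with an odds-ratio penalty that pushes the model to assign higher odds to the chosen response than to the rejected one. Below is a minimal, illustrative sketch of that objective for a single preference pair, assuming `logp_chosen` and `logp_rejected` are average per-token log-probabilities under the policy model. This is a standalone toy helper for intuition, not the repository's actual training code (which presumably relies on a library implementation such as TRL's `ORPOTrainer`).

```python
import math

def orpo_loss(logp_chosen: float, logp_rejected: float, lam: float = 0.1) -> float:
    """Toy ORPO objective for one preference pair (illustrative sketch).

    logp_chosen / logp_rejected: average per-token log-probabilities of the
    chosen and rejected responses under the policy model (assumption).
    lam: weight on the odds-ratio term (hyperparameter).
    """
    def log_odds(logp: float) -> float:
        # odds(p) = p / (1 - p), computed in log space for stability
        p = math.exp(logp)
        return logp - math.log(1.0 - p)

    # Odds-ratio term: reward higher odds for the chosen response
    ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    l_or = -math.log(1.0 / (1.0 + math.exp(-ratio)))  # -log sigmoid(ratio)

    # Full loss = SFT NLL on the chosen response + lambda * odds-ratio loss
    l_sft = -logp_chosen
    return l_sft + lam * l_or
```

When the model already prefers the chosen response (`logp_chosen > logp_rejected`), the odds-ratio term shrinks toward zero and the loss reduces to ordinary SFT on the chosen response.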

For a detailed explanation of the methods, experiments, and findings, check out the blog post.

Also, both models are open-sourced here:

  1. SFTLlama-3.2-1B: https://huggingface.co/KickItLikeShika/SFTLlama-3.2-1B
  2. ORPOLlama-3.2-1B: https://huggingface.co/KickItLikeShika/ORPOLlama-3.2-1B
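Both checkpoints should be loadable with the standard Hugging Face `transformers` APIs. A hedged sketch is below; the model ids come from the README, but the prompt template is an assumption for illustration (the repo's actual format may differ), and generation requires downloading the model weights.

```python
# Model ids from the README; the prompt template below is an assumption.
SFT_MODEL = "KickItLikeShika/SFTLlama-3.2-1B"
ORPO_MODEL = "KickItLikeShika/ORPOLlama-3.2-1B"

def build_prompt(instruction: str) -> str:
    # Simple instruction-following template (illustrative, not necessarily
    # the exact format the models were trained with)
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

def generate(instruction: str, model_id: str = ORPO_MODEL) -> str:
    # Heavyweight part: requires `transformers` installed and a model download.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```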

About

LLM Alignment - SFT, RL, and ORPO
