grpo — AI Assistant Context

GRPO: Guided Reinforcement Policy Optimization for LLM Fine-tuning

A comprehensive guide and toolkit for fine-tuning language models using reinforcement learning techniques on the Hanzo AI platform.

Overview