A repo of R1-like reproductions

DeepSeek-R1

Official repo by deepseek-ai

open-r1

Reproduction of R1 by Hugging Face

TinyZero

Reproduction of R1-Zero by Jiayi-Pan

R1-V

R1 in VLMs.

Item counting & GeoQA.

simpleRL-reason

Reproduction of R1-Zero and R1 with small models and limited data.

Logic-RL

Reproduction of R1-Zero.

Logic puzzle

Multimodal Open R1

R1 paradigm in multimodal models.

Math reasoning (data created by GPT-4o, based on Math360k & Geo170k)

VLM-R1

R1-style LVLM. Referring Expression Comprehension (REC).

RefCOCO for in-domain and RefGTA for out-of-domain (OOD) evaluation.

RAGEN

R1(-Zero) methods for training agentic models.

LMM-R1

Reproduction of R1 in a multimodal setting. Based on OpenRLHF. Currently supports PPO/REINFORCE++/RLOO training for LMMs.

MATH dataset.

Open Reasoner Zero

Reproduction of DeepSeek-R1-Zero

57k curated training samples

Easy R1

Based on veRL; supports RL training for VLMs

kxfan2002/R1-Collection

Folders and files

Latest commit

History

Repository files navigation

A repo of R1-like reproductions

About

Resources

Uh oh!

Stars

Watchers

Forks