GRPO Summary #3

@paulkroe

Description

We need a clear and detailed explanation of GRPO.
This should be a markdown document in ./docs.
It should include the necessary math and be as intuitive as possible.
It will serve as an implementation guideline and later ground our discussions of the results.

A great start is the following video:
https://www.youtube.com/watch?v=xT4jxQUl0X8
and later maybe even the paper:
https://arxiv.org/pdf/2402.03300

Metadata

Assignees

No one assigned

    Labels

    documentation (Improvements or additions to documentation)
    good first issue (Good for newcomers)
