@LiqunMa (Collaborator) commented Aug 14, 2025

Checklist Before Starting

  • Search for similar PR(s).

What does this PR do?

Add one-line overview of what this PR aims to achieve or accomplish.

Add reward functions for SynLogic dataset.

High-Level Design

Demonstrate the high-level design if this PR is complex.

None

Specific Changes

List the specific changes.

Add files at verl/utils/reward_score
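
For readers unfamiliar with rule-based reward functions, the following is a minimal sketch of what such a scorer might look like. It is not the code added in this PR: the function names `extract_final_answer` and `compute_score`, the `<answer>...</answer>` output format, and the exact-match comparison are all assumptions made for illustration.

```python
import re
from typing import Optional


def extract_final_answer(solution_str: str) -> Optional[str]:
    """Return the text inside the last <answer>...</answer> pair, if any.

    The <answer> tag format is an assumption for this sketch.
    """
    matches = re.findall(r"<answer>(.*?)</answer>", solution_str, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


def compute_score(solution_str: str, ground_truth: str) -> float:
    """Hypothetical scorer: 1.0 on a case-insensitive exact match, else 0.0."""
    answer = extract_final_answer(solution_str)
    if answer is None:
        # No parseable answer in the model output: give no reward.
        return 0.0
    return 1.0 if answer.lower() == ground_truth.strip().lower() else 0.0
```

A real SynLogic verifier would likely need task-specific checking rather than plain string equality; the sketch only shows the general shape of mapping a model response and a reference answer to a scalar reward.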

API

Demonstrate how the API changes if any.

None

Usage Example

Provide usage example(s) for easier usage.

# bash scripts/train/example_singlenode_rl_qwen7b_synlogic.sh

Test

For changes that cannot be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

Additional Info.

  • Issue Number: Fixes issue # or discussion # if any. None
  • Training: [Note which backend this PR will affect: FSDP, Megatron, both, or none] None
  • Inference: [Note which backend this PR will affect: vLLM, SGLang, both, or none] None

Checklist Before Submitting

  • [Y] Read the Contribute Guide.
  • [Y] Apply pre-commit checks.
  • [Y] Add [BREAKING] to the PR title if it breaks any API.
  • [Y] Update the documentation about your changes in the docs.
  • [Y] New CI unit test(s) are added to cover the code path.
  • [Y] Rely on existing unit tests on CI that cover the code path.

@LiqunMa requested a review from haonan-li on August 14, 2025 16:46