@ccabcca06

Add ccabcca06's preliminary-round report, final-round report, and code

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @ccabcca06, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

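This pull request submits a complete solution for the OpenSeek large-model challenge, including the model's training code, the final model weights, and a comprehensive evaluation toolchain. This lets other participants or future maintainers reproduce the training process, verify model performance, and build on the framework for further development and evaluation.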

Highlights

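  • Solution submission: Submits the reports and related code from user ccabcca06 for the preliminary and final rounds of the "Chaoyue Cup (超越杯) Openseek Large Model Challenge".
  • Training pipeline: Provides detailed two-stage PPO training scripts (run_openseek_v1_ppo_step1.sh and run_openseek_v1_ppo_step2.sh), covering environment setup (Docker image, ModelScope model download), dataset preparation (gsm8k), and key parameter settings.
  • Evaluation framework: Introduces a complete evaluation framework, including the latex2sympy tool for parsing and judging mathematical expressions, plus evaluation scripts and data files that support multiple math datasets (such as AMC23, the Gaokao series, MAWPS, Minerva Math, and SAT Math).
  • Model inference and evaluation: Provides guidance for model inference and evaluation, including environment setup, prompt configuration (for the GSM8K and AMC23 datasets), and the steps for running the evaluation scripts.
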
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

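This submission adds the preliminary-round report, final-round report, and code for the "Chaoyue Cup (超越杯) Openseek Large Model Challenge". The code includes complete training and evaluation scripts, along with the related data and documentation. The work is complete overall, and the documentation is clear.

I found a few things in the code and documentation that could be improved, mainly:

  • Some typos and copy-paste errors in README.md, which may prevent others from reproducing the results.
  • Hard-coded absolute paths in the training scripts, which hurt the scripts' portability.
  • Some debugging print statements in the evaluation script, which should be removed or replaced with a logging library.

See the review comments on each file for the specific suggestions.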

# Configure the actor and critic model paths
actor_rollout_ref.model.path=/workspace/model/BAAI/OpenSeek-Small-v1-SFT # competition starting-point model
ritic.model.path=/workspace/models/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B # DeepSeek R1 8B model

high

The parameter ritic.model.path contains a typo; it should be critic.model.path. This error causes the training script to fail because the critic model cannot be loaded correctly.

Suggested change
ritic.model.path=/workspace/models/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B # DeepSeek R1 8B model
critic.model.path=/workspace/models/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B # DeepSeek R1 8B model

```
Evaluation on other datasets
```
bash run_eval_amc.sh ccabcca06/Openseek-small-V1-PPO-ccabcca06

high

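In the "Evaluation on other datasets" section, the run_eval_amc.sh script is used, which looks like a copy-paste error. A generic evaluation script (for example, run_eval.sh) should be used, or a separate script for each dataset. As written, the other datasets would also be evaluated with the amc23-specific prompt and configuration, producing incorrect evaluation results.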

Suggested change
bash run_eval_amc.sh ccabcca06/Openseek-small-V1-PPO-ccabcca06
bash run_eval.sh ccabcca06/Openseek-small-V1-PPO-ccabcca06
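
One way to keep the AMC23-specific prompt from leaking into other datasets is to pick the script from the dataset name. The following is only a sketch under stated assumptions: the helper select_eval_script is hypothetical, run_eval.sh is the generic script name suggested above and may not exist in the PR, and the loop prints the commands rather than running them.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: map a dataset name to the evaluation script to run.
select_eval_script() {
  case "$1" in
    amc23) echo "run_eval_amc.sh" ;;  # AMC23 keeps its dedicated prompt and config
    *)     echo "run_eval.sh" ;;      # assumed generic script for every other dataset
  esac
}

# Model identifier taken from the README snippet under review.
MODEL="ccabcca06/Openseek-small-V1-PPO-ccabcca06"

# Print the command each dataset would use instead of executing it.
for dataset in amc23 mawps; do
  echo "bash $(select_eval_script "$dataset") $MODEL"
done
```

With a dispatch like this, the README could document a single entry point and still route AMC23 to its own script.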

Comment on lines +7 to +15
data.train_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/train.parquet \
data.val_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/test.parquet \
data.train_batch_size=512 \
data.max_prompt_length=1024 \
data.max_response_length=512 \
data.filter_overlong_prompts=True \
data.truncation='error' \
data.trust_remote_code=True \
actor_rollout_ref.model.path=/root/autodl-tmp/Openseek_RL/Openseek-v1-PPO-300it \

high

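The file paths in the script (such as data.train_files, data.val_files, and actor_rollout_ref.model.path) are hard-coded as absolute paths. This reduces the script's portability: running it in another environment requires manual edits. Consider defining these paths as variables at the top of the script, or passing them in as command-line arguments, so they are easy to configure and reuse. Other paths in this file (for example, line 31) have the same problem.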
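
A minimal sketch of the "variables at the top of the script" option, assuming a POSIX shell launcher. The variable names (DATA_DIR, MODEL_PATH, TRAIN_FILES, VAL_FILES) and the relative default values are illustrative assumptions, not taken from the PR; only the override keys come from the quoted script. The sketch prints the overrides instead of launching training.

```shell
#!/bin/sh
set -eu

# Hypothetical variables: take each path from the environment when set,
# otherwise fall back to a relative default (defaults are illustrative).
DATA_DIR="${DATA_DIR:-./data/gsm8k}"
MODEL_PATH="${MODEL_PATH:-./Openseek-v1-PPO-300it}"

# Derive the dataset files from the single DATA_DIR setting.
TRAIN_FILES="${DATA_DIR}/train.parquet"
VAL_FILES="${DATA_DIR}/test.parquet"

# Print the path-related overrides; a real launcher would pass these
# strings to its training command instead of printing them.
printf '%s\n' \
  "data.train_files=${TRAIN_FILES}" \
  "data.val_files=${VAL_FILES}" \
  "actor_rollout_ref.model.path=${MODEL_PATH}"
```

Setting DATA_DIR or MODEL_PATH in the environment before invoking the script would then retarget it without editing the file.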

Comment on lines +7 to +15
data.train_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/train.parquet \
data.val_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/test.parquet \
data.train_batch_size=512 \
data.max_prompt_length=1024 \
data.max_response_length=512 \
data.filter_overlong_prompts=True \
data.truncation='error' \
data.trust_remote_code=True \
actor_rollout_ref.model.path=verl/checkpoints/verl_example/openseek_ppo_step1/actor/huggingface \

high

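As with the step1 script, the file paths in this script (such as data.train_files and actor_rollout_ref.model.path) are hard-coded as absolute paths or as relative paths tightly coupled to a specific directory layout. To improve the script's portability and maintainability, consider using environment variables or passing the paths in as command-line arguments. Other paths in this file (for example, line 31) have the same problem.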

docker pull docker.m.daocloud.io/verlai/verl:base-verl0.5-cu126-cudnn9.8-torch2.7.1-fa2.8.0
```

**step2. Pull the training model from Modelscpoe**

medium

There is a typo here: Modelscpoe should be ModelScope. It could cause problems for users who follow the documentation to set up their environment.

Suggested change
**step2. Pull the training model from Modelscpoe**
**step2. Pull the training model from ModelScope**


```
cd verl/examples/data_preprocess
python gem8k.py

medium

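The file name gem8k.py appears to be a typo; based on the context and the dataset name, it should be gsm8k.py. This error causes the data preprocessing step to fail.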

Suggested change
python gem8k.py
python gsm8k.py

```
Key parameter settings:
```
## run_openseek_v1_ppo_step1.sh

medium

The comment run_openseek_v1_ppo_step1.sh here is a copy-paste error; it should be run_openseek_v1_ppo_step2.sh, to stay consistent with the second-stage training context and avoid confusion.

Suggested change
## run_openseek_v1_ppo_step1.sh
## run_openseek_v1_ppo_step2.sh

# We now fallback to semantic verification
for gt in ground_truth:
    try:
        print(parse(f"\\boxed{{{gt}}}", parsing_timeout=5))

medium

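This is a print statement used for debugging. In the final submitted code, consider removing such statements or replacing them with a logging library (such as logging) to keep the output clean. There are several similar print statements elsewhere in this file (for example, lines 129, 151, 191, and 250).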

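@ftgreat ftgreat changed the title from "Submission note: Add ccabcca06's preliminary-round report, final-round report, and code" to "[PZ COMPETITION] Add ccabcca06's preliminary-round report, final-round report, and code" on Sep 13, 2025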