[PZ COMPETITION] Add ccabcca06's preliminary-round report, final-round report, and code #170

ccabcca06 · 2025-09-12T13:20:55Z

Add ccabcca06's preliminary-round report, final-round report, and code

gemini-code-assist

Summary of Changes

Hello @ccabcca06, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

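This pull request submits a complete solution for the OpenSeek large-model challenge, including the model's training code, the final model weights, and a comprehensive evaluation toolchain. This lets other participants or future maintainers reproduce the training process, verify model performance, and use the framework for further development and evaluation.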

Highlights

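Solution submission: Submits user ccabcca06's reports and related code for the preliminary and final rounds of the Chaoyue Cup (超越杯) Openseek Large Model Challenge.
Training pipeline: Provides detailed two-stage PPO training scripts (run_openseek_v1_ppo_step1.sh and run_openseek_v1_ppo_step2.sh), covering environment setup (Docker image, ModelScope model download), dataset preparation (gsm8k), and key parameter settings.
Evaluation framework: Introduces a complete evaluation framework, including the latex2sympy tool for parsing and judging mathematical expressions, plus evaluation scripts and data files for multiple math datasets (such as AMC23, the Gaokao series, MAWPS, Minerva Math, and SAT Math).
Model inference and evaluation: Provides guidance for model inference and evaluation, including environment setup, prompt configuration (for the GSM8K and AMC23 datasets), and the steps to run the evaluation scripts.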

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

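This submission adds the preliminary-round report, final-round report, and code for the Chaoyue Cup (超越杯) Openseek Large Model Challenge. The code includes complete training and evaluation scripts along with the related data and documentation. The work as a whole is complete and the documentation is clear.

I found a few things in the code and documentation that could be improved, mainly:

Some typos and copy-paste errors in README.md, which may make it harder for others to reproduce the results.
Hard-coded absolute paths in the training scripts, which hurt the scripts' portability.
Some debugging print statements in the evaluation script, which should be removed or replaced with a logging library.

See the review comments on each file for specific suggestions.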

gemini-code-assist · 2025-09-12T13:24:48Z

openseek/competition/pz/ccabcca06/README.md

+
+# Configure the actor and critic model paths
+actor_rollout_ref.model.path=/workspace/model/BAAI/OpenSeek-Small-v1-SFT  # competition starting-point model
+ritic.model.path=/workspace/models/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B # DeepSeek R1 8B model


The parameter ritic.model.path contains a typo; it should be critic.model.path. This error would cause the training script to fail because the critic model cannot be loaded correctly.

Suggested change

-ritic.model.path=/workspace/models/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B # DeepSeek R1 8B model
+critic.model.path=/workspace/models/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B # DeepSeek R1 8B model

gemini-code-assist · 2025-09-12T13:24:49Z

openseek/competition/pz/ccabcca06/README.md

+```
+Evaluation on other datasets
+```
+bash run_eval_amc.sh ccabcca06/Openseek-small-V1-PPO-ccabcca06


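The "Evaluation on other datasets" section uses the run_eval_amc.sh script, which looks like a copy-paste error. It should use a generic evaluation script (for example, run_eval.sh) or a separate script per dataset. As written, other datasets would also be evaluated with the amc23-specific prompt and configuration, producing incorrect evaluation results.

Suggested change

-bash run_eval_amc.sh ccabcca06/Openseek-small-V1-PPO-ccabcca06
+bash run_eval.sh ccabcca06/Openseek-small-V1-PPO-ccabcca06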

gemini-code-assist · 2025-09-12T13:24:49Z

openseek/competition/pz/ccabcca06/source_code/train/run_openseek_v1_ppo_step1.sh

+    data.train_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/train.parquet \
+    data.val_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/test.parquet \
+    data.train_batch_size=512 \
+    data.max_prompt_length=1024 \
+    data.max_response_length=512 \
+    data.filter_overlong_prompts=True \
+    data.truncation='error' \
+    data.trust_remote_code=True \
+    actor_rollout_ref.model.path=/root/autodl-tmp/Openseek_RL/Openseek-v1-PPO-300it \


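File paths in the script (such as data.train_files, data.val_files, and actor_rollout_ref.model.path) are hard-coded as absolute paths. This reduces the script's portability: running it in another environment requires manual edits. Consider defining these paths as variables at the top of the script, or passing them in as command-line arguments, so they are easy to configure and reuse. Other paths in this file (for example, L31) have the same problem.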
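
As a rough illustration of the idea, the paths could be resolved from the environment and turned into the same key=value overrides the script passes. This is only a sketch, written in Python to match the evaluation code: DATA_DIR, MODEL_PATH, build_path_overrides, and the relative default locations are hypothetical names that do not appear in the PR. The same approach works with shell variables declared at the top of the script.

```python
import os


def build_path_overrides(env=None):
    """Return key=value path overrides, reading locations from the environment."""
    env = os.environ if env is None else env
    # DATA_DIR and MODEL_PATH are hypothetical variable names; the defaults
    # below are placeholders, not paths taken from the PR.
    data_dir = env.get("DATA_DIR", "data/gsm8k").rstrip("/")
    model_path = env.get("MODEL_PATH", "Openseek-v1-PPO-300it")
    return [
        f"data.train_files={data_dir}/train.parquet",
        f"data.val_files={data_dir}/test.parquet",
        f"actor_rollout_ref.model.path={model_path}",
    ]


if __name__ == "__main__":
    # Print the overrides in the backslash-continued form used by the script.
    print(" \\\n    ".join(build_path_overrides()))
```

With this shape, moving to another machine only requires setting the two environment variables; the script body itself stays unchanged.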

gemini-code-assist · 2025-09-12T13:24:49Z

openseek/competition/pz/ccabcca06/source_code/train/run_openseek_v1_ppo_step2.sh

+    data.train_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/train.parquet \
+    data.val_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/test.parquet \
+    data.train_batch_size=512 \
+    data.max_prompt_length=1024 \
+    data.max_response_length=512 \
+    data.filter_overlong_prompts=True \
+    data.truncation='error' \
+    data.trust_remote_code=True \
+    actor_rollout_ref.model.path=verl/checkpoints/verl_example/openseek_ppo_step1/actor/huggingface  \


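As in the step1 script, file paths in this script (such as data.train_files and actor_rollout_ref.model.path) are hard-coded, either as absolute paths or as relative paths that depend heavily on a specific directory layout. To improve the script's portability and maintainability, consider using environment variables or passing the paths in as command-line arguments. Other paths in this file (for example, L31) have the same problem.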

gemini-code-assist · 2025-09-12T13:24:49Z

openseek/competition/pz/ccabcca06/README.md

+docker pull docker.m.daocloud.io/verlai/verl:base-verl0.5-cu126-cudnn9.8-torch2.7.1-fa2.8.0
+```
+
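+**step2. Pull the training model from Modelscpoe**


There is a typo here: Modelscpoe should be ModelScope. The error may cause problems for users trying to set up the environment from the documentation.

Suggested change

-**step2. Pull the training model from Modelscpoe**
+**step2. Pull the training model from ModelScope**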

gemini-code-assist · 2025-09-12T13:24:49Z

openseek/competition/pz/ccabcca06/README.md

+
+```
+cd verl/examples/data_preprocess
+python gem8k.py


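The file name gem8k.py appears to be a typo; based on the context and the dataset name, it should be gsm8k.py. This error would cause the data preprocessing step to fail.

Suggested change

-python gem8k.py
+python gsm8k.py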

gemini-code-assist · 2025-09-12T13:24:49Z

openseek/competition/pz/ccabcca06/README.md

+```
+Key parameter settings:
+```
+## run_openseek_v1_ppo_step1.sh


The comment run_openseek_v1_ppo_step1.sh here is a copy-paste error; it should be run_openseek_v1_ppo_step2.sh, to match the second-stage training context and avoid confusion.

Suggested change

-## run_openseek_v1_ppo_step1.sh
+## run_openseek_v1_ppo_step2.sh

gemini-code-assist · 2025-09-12T13:24:49Z

openseek/competition/pz/ccabcca06/source_code/evaluate/qwen_eval/evaluate_final.py

+        # We now fallback to semantic verification
+        for gt in ground_truth:
+            try:
+                print(parse(f"\\boxed{{{gt}}}", parsing_timeout=5))


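This is a debugging print statement. In the final submitted code, consider removing such statements or replacing them with a logging library (such as logging) to keep the output clean. There are several similar print statements elsewhere in this file (for example, L129, L151, L191, and L250).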
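
A minimal sketch of the logging alternative, using only the standard library. The logger name and the helper debug_parsed are hypothetical and not taken from evaluate_final.py. The helper accepts an already-parsed value rather than calling parse itself, so the example runs without the evaluation dependencies.

```python
import logging

# Hypothetical logger name; a module would normally use logging.getLogger(__name__).
logger = logging.getLogger("evaluate_final")


def debug_parsed(value):
    """Log a parsed value at DEBUG level instead of printing it, then return it."""
    logger.debug("parsed ground truth: %r", value)
    return value


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logger.setLevel(logging.INFO)
    debug_parsed("42")  # suppressed: DEBUG is below the INFO threshold
    logger.setLevel(logging.DEBUG)
    debug_parsed("42")  # emitted now that DEBUG is enabled
```

The diagnostic stays available when needed, and normal evaluation runs keep their output free of intermediate parse results.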

Commit message: Add ccabcca06's preliminary-round report, final-round report, and code

43627d2

gemini-code-assist bot reviewed Sep 12, 2025


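ftgreat changed the title ~~Commit message: Add ccabcca06's preliminary-round report, final-round report, and code~~ [PZ COMPETITION] Add ccabcca06's preliminary-round report, final-round report, and code Sep 13, 2025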
