[PZ COMPETITION] Add ccabcca06's preliminary-round report, final-round report, and code #170
base: main
Summary of Changes
Hello @ccabcca06, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
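This pull request submits a complete solution for the OpenSeek large-model challenge, including the model's training code, the final model weights, and a comprehensive evaluation toolchain. This allows other participants or future maintainers to reproduce the training process, verify model performance, and build on the framework for further development and evaluation.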
Highlights
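- Solution submission: Submits user ccabcca06's reports and related code for the preliminary and final rounds of the "超越杯Openseek大模型挑战赛" (the Openseek large-model challenge).
- Training pipeline: Provides detailed two-stage PPO training scripts (run_openseek_v1_ppo_step1.sh and run_openseek_v1_ppo_step2.sh), covering environment setup (Docker image, ModelScope model download), dataset preparation (gsm8k), and key parameter settings.
- Evaluation framework: Introduces a complete evaluation framework, including the latex2sympy tool for parsing and judging mathematical expressions, plus evaluation scripts and data files for several math datasets (such as AMC23, the Gaokao series, MAWPS, Minerva Math, and SAT Math).
- Model inference and evaluation: Provides guidance for model inference and evaluation, including environment setup, prompt configuration (for the GSM8K and AMC23 datasets), and the steps for running the evaluation scripts.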
Code Review
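This submission adds the preliminary-round report, final-round report, and code for the "超越杯Openseek大模型挑战赛". The code includes complete training and evaluation scripts, along with the related data and documentation. Overall the work is complete and the documentation is clear.
I found a few things in the code and documentation that could be improved, mainly:
- Some typos and copy-paste errors in README.md, which may make it harder for others to reproduce the results.
- Hard-coded absolute paths in the training scripts, which hurt the scripts' portability.
- Debugging print statements in the evaluation scripts; consider removing them or using a logging library instead.
See the review comments on each file for specific suggestions.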
| # Configure the actor and critic model paths | ||
| actor_rollout_ref.model.path=/workspace/model/BAAI/OpenSeek-Small-v1-SFT # competition starting-point model | ||
| ritic.model.path=/workspace/models/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B # DeepSeek R1 8B model |
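The last quoted line sets `ritic.model.path`, which looks like a misspelling of `critic.model.path`, so the critic model path would probably not be picked up as intended. Below is a minimal sketch of a pre-flight check that would catch this kind of typo before a long training run. The `KNOWN_ROOTS` allow-list and the `find_unknown_overrides` helper are assumptions made for illustration, not part of verl or of this pull request.

```python
# Hypothetical pre-flight check for Hydra-style "key=value" overrides.
# KNOWN_ROOTS is an assumed allow-list for illustration; extend it to match your config.
KNOWN_ROOTS = {"data", "actor_rollout_ref", "critic", "algorithm", "trainer"}


def find_unknown_overrides(overrides):
    """Return the overrides whose top-level key is not in KNOWN_ROOTS."""
    unknown = []
    for item in overrides:
        key = item.split("=", 1)[0].lstrip("+")  # drop the value and any "+" prefix
        root = key.split(".", 1)[0]              # top-level config group
        if root not in KNOWN_ROOTS:
            unknown.append(item)
    return unknown


overrides = [
    "actor_rollout_ref.model.path=/workspace/model/BAAI/OpenSeek-Small-v1-SFT",
    "ritic.model.path=/workspace/models/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
]
print(find_unknown_overrides(overrides))
```

Running a check like this at the top of a launcher script makes a misspelled key fail fast instead of surfacing later as an unexpected default.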
| ``` | ||
| Evaluating other datasets | ||
| ``` | ||
| bash run_eval_amc.sh ccabcca06/Openseek-small-V1-PPO-ccabcca06 |
| data.train_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/train.parquet \ | ||
| data.val_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/test.parquet \ | ||
| data.train_batch_size=512 \ | ||
| data.max_prompt_length=1024 \ | ||
| data.max_response_length=512 \ | ||
| data.filter_overlong_prompts=True \ | ||
| data.truncation='error' \ | ||
| data.trust_remote_code=True \ | ||
| actor_rollout_ref.model.path=/root/autodl-tmp/Openseek_RL/Openseek-v1-PPO-300it \ |
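The paths quoted above are absolute paths under `/root/autodl-tmp/...`, so the script only runs unchanged on a machine with that exact layout. One way to make it portable is sketched below, assuming the project root is supplied through an environment variable. The variable name `OPENSEEK_ROOT` and the helper `build_overrides` are hypothetical and do not appear in the pull request.

```python
import os
from pathlib import Path


def build_overrides(root=None):
    """Build data/model path overrides relative to a configurable project root."""
    # OPENSEEK_ROOT is a hypothetical variable name; fall back to the current directory.
    base = Path(root or os.environ.get("OPENSEEK_ROOT", "."))
    data_dir = base / "verl" / "data" / "gsm8k"
    return [
        f"data.train_files={data_dir / 'train.parquet'}",
        f"data.val_files={data_dir / 'test.parquet'}",
        f"actor_rollout_ref.model.path={base / 'Openseek-v1-PPO-300it'}",
    ]


# Passing the original root explicitly reproduces the layout used in the quoted script.
for line in build_overrides("/root/autodl-tmp/Openseek_RL"):
    print(line)
```

The same idea carries over to the shell scripts themselves: define the root once at the top and derive every other path from it.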
| data.train_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/train.parquet \ | ||
| data.val_files=/root/autodl-tmp/Openseek_RL/verl/data/gsm8k/test.parquet \ | ||
| data.train_batch_size=512 \ | ||
| data.max_prompt_length=1024 \ | ||
| data.max_response_length=512 \ | ||
| data.filter_overlong_prompts=True \ | ||
| data.truncation='error' \ | ||
| data.trust_remote_code=True \ | ||
| actor_rollout_ref.model.path=verl/checkpoints/verl_example/openseek_ppo_step1/actor/huggingface \ |
| docker pull docker.m.daocloud.io/verlai/verl:base-verl0.5-cu126-cudnn9.8-torch2.7.1-fa2.8.0 | ||
| ``` | ||
| **step2. Pull the training model from Modelscpoe** |
| ``` | ||
| cd verl/examples/data_preprocess | ||
| python gem8k.py |
| ``` | ||
| Key parameter settings: | ||
| ``` | ||
| ## run_openseek_v1_ppo_step1.sh |
| # We now fallback to semantic verification | ||
| for gt in ground_truth: | ||
| try: | ||
| print(parse(f"\\boxed{{{gt}}}", parsing_timeout=5)) |
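The `print` call in the quoted loop writes to stdout for every ground-truth entry. Here is a sketch of the same debug output routed through the standard `logging` module, so it can be silenced by log level. `parse_fn` is a stand-in parameter for whatever parser the script actually uses, and the logger name and the helper `log_parsed_ground_truth` are assumptions for illustration.

```python
import logging

logger = logging.getLogger("eval.verify")  # logger name is an assumption


def log_parsed_ground_truth(ground_truth, parse_fn):
    """Log the parsed form of each ground-truth answer at DEBUG level.

    parse_fn stands in for the script's real parser; it is passed in so the
    sketch does not depend on any particular library.
    """
    parsed = []
    for gt in ground_truth:
        try:
            result = parse_fn(f"\\boxed{{{gt}}}")
        except Exception:
            # Keep going on parse failures, but record the traceback.
            logger.exception("Failed to parse ground truth %r", gt)
            continue
        logger.debug("Parsed ground truth %r -> %r", gt, result)
        parsed.append(result)
    return parsed


logging.basicConfig(level=logging.INFO)  # DEBUG output stays hidden at this level
print(log_parsed_ground_truth(["42", "x+1"], parse_fn=lambda s: s))
```

Switching the level to `logging.DEBUG` restores the per-item output when it is needed for troubleshooting.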
Add ccabcca06's preliminary-round report, final-round report, and code