Hi, I ran the training script; it ran for about 2.6k steps and then ended with an OOM error. (Is this normal, by the way?)
Also, when I evaluated on LCBv5, the results didn't match: 27.5 vs. 29.5.
I thought mismatched templates caused it, so I added configs to the LCB code:
.../LiveCodeBench/lcb_runner/prompts/code_generation.py:
SYSTEM_MESSAGE_CODER1QWEN = (
f"""<|im_start|>system\nYou are a helpful programming assistant. \
The user will ask you a question and you as the assistant solve it. \
The assistant first thinks how to solve the task through reasoning and then provides the user with the final answer. \
The reasoning process and answer are enclosed within <think>...</think> and <answer>...</answer> tags, respectively.<|im_end|>\n<|im_start|>user"""
)
# adapted from the `SYSTEM_PROMPT` in `.../code-r1/examples/data_preprocess/coder1.py` and `SYSTEM_MESSAGE_CODEQWEN` in `code_generation.py`
def get_coder1qwen_question_template_answer(question: CodeGenerationProblem):
    prompt = "Please solve the programming task below using a self-contained code snippet in a markdown code block.\n\n"
    prompt += f"Question: {question.question_content}\n\n"
    if question.starter_code:
        prompt += f"{PromptConstants.FORMATTING_MESSAGE_WITH_STARTER_CODE}\n"
        prompt += f"```python\n{question.starter_code}\n```\n\n<|im_end|>\n"
    else:
        prompt += f"{PromptConstants.FORMATTING_WITHOUT_STARTER_CODE}\n"
        prompt += "```python\n# YOUR CODE HERE\n```\n\n<|im_end|>\n"
    prompt += "<|im_start|>assistant\n"
    return prompt
# adapted from the get_codeqwen_question_template_answer function
and registered a new LM template in .../LiveCodeBench/lcb_runner/lm_styles.py
LanguageModel(
    "ganler/CodeR1-Zero-Qwen2.5-7B-LC2k-1088",
    "CodeR1QwenInst",
    LMStyle.CodeR1QwenInst,
    datetime(2025, 4, 21),
    ""
)
then ran
python -m lcb_runner.runner.main \
    --model ganler/CodeR1-Zero-Qwen2.5-7B-LC2k-1088 \
    --local_model_path ganler/CodeR1-Zero-Qwen2.5-7B-LC2k-1088 \
    --scenario codegeneration \
    --evaluate \
    --tensor_parallel_size 1 \
    --release_version release_v5
But the result was still 29.17, compared to 29.5.
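One way to check whether a template mismatch remains is to print the fully assembled prompt with `repr()` and compare it character by character with the prompt used at training time. The sketch below only mirrors the structure of the template above: `Problem`, `FORMATTING`, `assemble`, and the `sep` join string are hypothetical stand-ins, not LiveCodeBench APIs, and the system message is shortened. How the runner actually joins the system and user parts is an assumption here and should be confirmed in the LCB code.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    """Hypothetical stand-in for lcb_runner's CodeGenerationProblem."""
    question_content: str
    starter_code: str = ""


# Shortened system turn; the real constant is SYSTEM_MESSAGE_CODER1QWEN above.
SYSTEM_MESSAGE = (
    "<|im_start|>system\nYou are a helpful programming assistant.<|im_end|>\n"
    "<|im_start|>user"
)
# Placeholder for PromptConstants.FORMATTING_* (real wording not reproduced).
FORMATTING = "<formatting instructions>"
FENCE = "`" * 3  # markdown code fence, built this way to keep the sketch tidy


def user_turn(q: Problem) -> str:
    """Mirror the structure of get_coder1qwen_question_template_answer."""
    body = q.starter_code if q.starter_code else "# YOUR CODE HERE"
    return (
        "Please solve the programming task below using a self-contained "
        "code snippet in a markdown code block.\n\n"
        f"Question: {q.question_content}\n\n"
        f"{FORMATTING}\n"
        f"{FENCE}python\n{body}\n{FENCE}\n\n<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def assemble(q: Problem, sep: str = "\n\n") -> str:
    """Join system and user parts; `sep` is an assumption about the runner."""
    return f"{SYSTEM_MESSAGE}{sep}{user_turn(q)}"


prompt = assemble(Problem("Add two integers."))
# repr() makes newlines and special tokens visible for a byte-level comparison.
print(repr(prompt))
```

If the join string differs from what the model saw during training (for example, two newlines after `<|im_start|>user` instead of one), the rendered prompts will differ even though each piece looks right on its own.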