compute reward raises an error and training hangs #49

@xiongxiaochu

Description

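Hello, and thank you very much for open-sourcing this project. I ran into a problem during training.
When training with the 7b+8k script, the following error is logged during training: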
(main_task pid=2992) ERROR:2025-06-19 16:11:03,873:Error during comparison
(main_task pid=2992) Traceback (most recent call last):
(main_task pid=2992) File "/usr/local/miniconda3/lib/python3.10/site-packages/math_verify/grader.py", line 809, in compare_single_extraction_wrapper
(main_task pid=2992) return compare_single_extraction(g, t)
(main_task pid=2992) File "/usr/local/miniconda3/lib/python3.10/site-packages/math_verify/utils.py", line 51, in wrapper
(main_task pid=2992) return func(*args, **kwargs)
(main_task pid=2992) File "/usr/local/miniconda3/lib/python3.10/site-packages/math_verify/grader.py", line 789, in compare_single_extraction
(main_task pid=2992) return sympy_expr_eq(
(main_task pid=2992) File "/usr/local/miniconda3/lib/python3.10/site-packages/math_verify/grader.py", line 667, in sympy_expr_eq
(main_task pid=2992) return sympy_compare_relational(gold, pred, float_rounding, numeric_precision)
(main_task pid=2992) File "/usr/local/miniconda3/lib/python3.10/site-packages/math_verify/grader.py", line 344, in sympy_compare_relational
(main_task pid=2992) if sympy_solve_and_compare(gold, pred, float_rounding, numeric_precision):
(main_task pid=2992) File "/usr/local/miniconda3/lib/python3.10/site-packages/math_verify/grader.py", line 275, in sympy_solve_and_compare
(main_task pid=2992) solved_pred = list(ordered(solve(pred, pred.free_symbols)))
(main_task pid=2992) File "/usr/local/miniconda3/lib/python3.10/site-packages/sympy/solvers/solvers.py", line 1170, in solve
(main_task pid=2992) solution = _solve(f[0], *symbols, **flags)
(main_task pid=2992) File "/usr/local/miniconda3/lib/python3.10/site-packages/sympy/solvers/solvers.py", line 1729, in _solve
(main_task pid=2992) raise NotImplementedError('\n'.join([msg, not_impl_msg % f]))
(main_task pid=2992) NotImplementedError: multiple generators [m, Q(m), S(m)]
(main_task pid=2992) No algorithms are implemented to solve equation -(m + 1)*S(m) + Q(m)

In addition, training hangs after about 2 hours.

[Image] What could be causing this?
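For reference, the `NotImplementedError` at the bottom of the traceback can probably be reproduced with sympy alone, outside of training. This is a minimal sketch that assumes `Q` and `S` were parsed as undefined functions of `m` (the model output that produced the expression is not shown). `try_solve` is a hypothetical helper written for this illustration; it is not part of math_verify or of this repository.

```python
import sympy as sp

# Assumption: Q and S are undefined (unknown) functions of the symbol m,
# which matches the "multiple generators [m, Q(m), S(m)]" message.
m = sp.Symbol("m")
Q = sp.Function("Q")
S = sp.Function("S")

# Same expression as in the log: -(m + 1)*S(m) + Q(m)
expr = Q(m) - (m + 1) * S(m)


def try_solve(expression):
    """Hypothetical helper: solve for the free symbols, or return None
    when sympy has no algorithm for the equation."""
    try:
        # Mirrors the call shown in the traceback:
        # solve(pred, pred.free_symbols)
        return sp.solve(expression, expression.free_symbols)
    except NotImplementedError:
        # sympy cannot solve for m while Q(m) and S(m) are unknown,
        # so report "no solution available" instead of propagating.
        return None


unsolvable = try_solve(expr)       # expected to hit the except branch
solvable = try_solve(m**2 - 4)     # an ordinary polynomial solves fine
```

If this reproduces, it would suggest that the error depends on the expression being compared rather than on the training setup. Whether it is related to the hang after about 2 hours cannot be determined from this log alone.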
