Mismatched results on VideoMMMU benchmark #14

@YanFangCS

Description

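Hi authors, when I tried to reproduce the onethinker-8B results on the VideoMMMU benchmark, the score I measured was slightly lower than the one reported in the paper (64.3 vs. 66.4). I used the default parameter settings provided in Evaluation/Eval/eval_bench_all.sh, unmodified.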

Image
