Answer-parsing logic is unfriendly to older non-reasoning models (e.g., llava-v1.5) #4

@wangsunyan777

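Older models often do not follow the required \boxed output format strictly, so the parser extracts an empty answer and the final score comes out abnormally low.
Would the authors consider having the judge model evaluate correctness directly from the model's raw output?
The image below shows the result file for llava-v1.5.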
Image
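
To make the request concrete, here is a minimal sketch of the kind of fallback I have in mind. It is not the repository's actual parser: the function names `extract_boxed` and `answer_for_judge` are hypothetical, and I am assuming the judge model can accept an arbitrary string as the candidate answer. The idea is to use the \boxed content when it exists and otherwise hand the raw output to the judge instead of an empty string.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    r"""Return the contents of the last \boxed{...} in text, or None if absent.

    Hypothetical helper: braces are matched manually so nested LaTeX such as
    \boxed{\frac{1}{2}} is returned whole.
    """
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars).strip()
        chars.append(ch)
        i += 1
    return None  # unbalanced braces: treat as "no boxed answer"


def answer_for_judge(raw_output: str) -> str:
    """Pick the string to send to the judge model (hypothetical helper).

    Prefer the boxed answer; if it is missing or empty, fall back to the
    raw model output so the judge still has something to evaluate.
    """
    boxed = extract_boxed(raw_output)
    if boxed:
        return boxed
    return raw_output.strip()


if __name__ == "__main__":
    print(answer_for_judge(r"So the result is \boxed{\frac{1}{2}}."))
    print(answer_for_judge("The answer is B."))
```

With something like this, a model such as llava-v1.5 that answers in plain text would still be judged on what it actually said, while models that follow the \boxed format are scored as before.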
