Should old evaluation metrics (CEAFe, B3, MUC) be considered inappropriate? #17

@CaoHoangTung

I came across a paper on the ACL Anthology (https://www.aclweb.org/anthology/P16-1060/) which argues that the traditional metrics computed by the CoNLL-2012 scripts are not well suited to evaluating coreference resolution, and which introduces the LEA scorer (implemented in this repository). However, recent publications on this task are still mainly evaluated with the old metrics, and I can't see why. I would be grateful for an explanation.
Thanks :)
