When using convert_result_to_excel to export results, the question, expected answer, model output, and error_reason columns are misaligned in the exported Excel file, so within a row the model output appears unrelated to its question and ground truth.
The root cause is that the code aligns the error information by line index instead of by sample id. The score file (*_score.json) stores the summary on its first line and then only the wrong samples afterward, so line index → dataset index is not a valid mapping. As a result, flag and error_reason are written to the wrong rows.
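For illustration, a hypothetical *_score.json for a 10-sample run with two wrong samples would look roughly like the following (the concrete values, and any fields beyond id and error_reason, are invented for this example):

```json
{"accuracy": 0.8, "correct_count": 8, "total_count": 10}
{"id": 3, "error_reason": "answer mismatch"}
{"id": 7, "error_reason": "answer mismatch"}
```

Walking this file with a line counter attaches the two error entries to the first two data rows of the Excel sheet rather than to the rows that hold samples 3 and 7.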
The evaluation metrics (accuracy, correct_count, total_count) are correct; the bug affects only the Excel visualization. A proper fix would be to build a mapping from id to row index in prompt_list and then, when reading score_file, use data["id"] to locate the correct row before setting flag and error_reason, as sketched below.
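A minimal sketch of that fix, assuming prompt_list is a list of dicts that carries the same id field the score file uses and that flag and error_reason are plain per-row columns (the helper name and surrounding structure are assumptions, not the project's actual code):

```python
import json

def apply_score_flags(prompt_list, score_file):
    """Mark wrong samples in prompt_list by sample id instead of by line index."""
    # Map each sample id to its row index in prompt_list.
    id_to_row = {sample["id"]: row for row, sample in enumerate(prompt_list)}

    with open(score_file, "r", encoding="utf-8") as f:
        lines = f.readlines()

    # The first line holds only the summary (accuracy, correct_count, total_count).
    for line in lines[1:]:
        data = json.loads(line)
        row = id_to_row.get(data["id"])
        if row is None:
            continue  # unknown id; leave the row untouched
        prompt_list[row]["flag"] = False
        prompt_list[row]["error_reason"] = data.get("error_reason", "")
```

With an id-based lookup like this, correct samples keep their default flag and an empty error_reason, and each wrong sample lands on the row that actually contains its question and expected answer.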