Problems about Audio-Reasoner-CoTA Dataset #20

@Radiant0726

Thanks for releasing this dataset (https://huggingface.co/datasets/zhifeixie/Audio-Reasoner-CoTA). I ran into some issues while using it:

  1. QA mismatch in complex_audio subset:
    Several samples in the complex_audio subset appear to have mismatched questions and answers: the provided answers do not seem to correspond to the audio content or to the questions themselves.
  2. Missing source dataset identifiers:
    Would it be possible to provide the original source dataset identifiers (e.g., YouTube IDs, file names, or other unique references) for all samples in the dataset? This would make it much easier for users to organize the data.
  3. Corrupted archive in audiocap subset:
    The compressed file for the audiocap subset appears to be corrupted, and I’m unable to extract its contents. Could you please check it and re-upload the file?
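
In case it helps with reproducing item 3, here is a minimal sketch of how an archive can be checked locally with the Python standard library. It assumes the file is a zip or a (possibly compressed) tar archive, which I have not confirmed for the audiocap subset; `check_archive` is a hypothetical helper name, and the path in the usage note is a placeholder rather than the actual file name.

```python
import lzma
import tarfile
import zipfile
import zlib
from pathlib import Path


def check_archive(path):
    """Return (ok, message) describing whether the archive reads cleanly.

    Hypothetical helper: assumes the archive is a zip or tar file.
    """
    path = Path(path)
    if not path.is_file():
        return False, "file not found"

    if zipfile.is_zipfile(path):
        try:
            with zipfile.ZipFile(path) as zf:
                # testzip() returns the first member that fails its CRC check, or None.
                bad = zf.testzip()
        except (zipfile.BadZipFile, zlib.error, OSError) as exc:
            return False, f"zip error: {exc}"
        if bad is None:
            return True, "zip ok"
        return False, f"corrupt member: {bad}"

    try:
        # Default mode "r" lets tarfile detect gzip/bz2/xz compression itself.
        with tarfile.open(path) as tf:
            for member in tf:
                if member.isfile():
                    # Read every file to the end so truncated data raises an error.
                    f = tf.extractfile(member)
                    while f.read(1 << 20):
                        pass
    except (tarfile.TarError, EOFError, OSError, zlib.error, lzma.LZMAError) as exc:
        return False, f"tar error: {exc}"
    return True, "tar ok"
```

Calling `check_archive("path/to/downloaded_archive")` returns a pair such as `(False, "tar error: ...")` when the file cannot be read to the end, which may help distinguish a broken upload from an incomplete download.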

Thanks again for your work, and looking forward to your response!
