Explore Models for Transcribing Intonation and Other Prosody Patterns (Speech2ToBI) #4

@SanderGi

The phonetic sequence (from Speech2IPA) is only one aspect of pronunciation. We also have tones and intonation, as well as different ways of accenting words using stress and/or pitch (see this blog). One attempt at a standardized notation for this is ToBI, although, just like IPA, there are variants (e.g., for Korean).

There are lots of different datasets and models for transcribing different aspects of this, e.g., English Lexical Stress (CNN, Transformer), English Intonation Mispronunciation, Pitch Accent Detection, Mandarin Pitch Accent, Prosodic Boundaries, and Wav2ToBI. An open question is whether to combine these with phoneme detection or keep them separate.

It would be great to list, compare, and evaluate a number of these approaches to assess where improvement is needed.
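
To compare approaches on equal footing, one option might be a tolerance-based precision/recall/F1 over time-stamped prosodic events. Below is a minimal sketch, assuming each model's output can be reduced to a list of (time, label) events. The `Event` type, the function names, and the 50 ms default tolerance are hypothetical choices for illustration and are not taken from any of the models listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: float  # seconds from the start of the utterance
    label: str   # e.g. a ToBI-style tone label such as "H*" or "L-L%"


def match_events(reference, predicted, tolerance=0.05):
    """Count predicted events that can be paired with a reference event.

    Greedy matching: each predicted event (in time order) takes the closest
    unused reference event with the same label within `tolerance` seconds.
    """
    used = set()
    hits = 0
    for p in sorted(predicted, key=lambda e: e.time):
        best, best_dist = None, tolerance
        for i, r in enumerate(reference):
            if i in used or r.label != p.label:
                continue
            dist = abs(r.time - p.time)
            if dist <= best_dist:
                best, best_dist = i, dist
        if best is not None:
            used.add(best)
            hits += 1
    return hits


def precision_recall_f1(reference, predicted, tolerance=0.05):
    """Return (precision, recall, F1) for one utterance."""
    hits = match_events(reference, predicted, tolerance)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example with made-up annotations (not real data).
reference = [Event(0.50, "H*"), Event(1.20, "L-L%")]
predicted = [Event(0.52, "H*"), Event(0.90, "H*"), Event(1.22, "L-L%")]
precision, recall, f1 = precision_recall_f1(reference, predicted)
```

Greedy matching is simple but not optimal when events are dense; an optimal bipartite matching could be swapped in if that turns out to matter. Frame-level or word-level tasks such as lexical stress would likely need a different metric.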
