fix(kokoro): misaki JA G2P phonemes silently dropped by Kokoro tokenizer #101

@crrow

Problem

When playing Japanese words via Kokoro TTS (e.g. kotoba play 今日), the pronunciation is completely wrong.

Root cause: misaki JA G2P (pyopenjtalk) outputs Unicode palatalized consonant characters (ᶄ, ᶃ, ᶀ, ᶆ, ᶈ, ᶉ) and ASCII g that are not in the Kokoro tokenizer vocabulary. These characters are silently dropped by tokenizer.tokenize(), causing critical phoneme loss.

Example: 今日 → phonemes ᶄoː^^_ → tokens [57, 158] (only "oː" survives; the "ky" is lost)
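
The drop can be reproduced in isolation with a stand-in tokenizer. This is a minimal sketch, not the Kokoro tokenizer itself: `TOY_VOCAB`, `toy_tokenize`, and `dropped_chars` are hypothetical names, and the sketch assumes a per-character lookup that skips unknown characters. The two IDs are the ones reported in the example above.

```python
# Hypothetical two-entry vocabulary; IDs taken from the example in this issue.
TOY_VOCAB = {"o": 57, "\u02d0": 158}  # "o" and the length mark "ː"


def toy_tokenize(phonemes: str) -> list[int]:
    """Map each character to its ID, skipping unknown characters without warning."""
    return [TOY_VOCAB[ch] for ch in phonemes if ch in TOY_VOCAB]


def dropped_chars(phonemes: str) -> list[str]:
    """Return the characters that the lookup above would skip."""
    return [ch for ch in phonemes if ch not in TOY_VOCAB]


phonemes = "\u1d84o\u02d0^^_"  # ᶄoː^^_ (misaki output for 今日, per this issue)
print(toy_tokenize(phonemes))   # IDs that survive
print(dropped_chars(phonemes))  # characters that were lost
```

Logging the result of a check like `dropped_chars` next to the real tokenizer call would turn a silent drop into a visible warning.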

Affected characters

| misaki output | meaning | should map to |
| --- | --- | --- |
| ᶄ (U+1D84) | palatalized k (きょ) | |
| ᶃ (U+1D83) | palatalized g (ぎょ) | ɡʲ |
| ᶀ (U+1D80) | palatalized b (びょ) | |
| ᶆ (U+1D86) | palatalized m (みゃ) | |
| ᶈ (U+1D88) | palatalized p (ぴょ) | |
| ᶉ (U+1D89) | palatalized r (りょ) | |
| g (U+0067) | ASCII g | ɡ (U+0261) |
| ^, _, - | prosody markers | strip |
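
Because these characters are easy to lose or misread in logs and terminals, it can help to print their code points and Unicode names when debugging. The sketch below uses only the standard-library `unicodedata` module. `AFFECTED` and `describe` are hypothetical names, and the code points are the ones listed in this issue.

```python
import unicodedata

# Code points listed in this issue, written as escapes so they survive copy/paste.
AFFECTED = "\u1d84\u1d83\u1d80\u1d86\u1d88\u1d89g"


def describe(ch: str) -> str:
    """Format one character as 'U+XXXX UNICODE NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in AFFECTED:
    print(describe(ch))
```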

Fix

Add a phoneme normalization step in KOKORO_TOKENIZER_SCRIPT, between the misaki G2P output and the Kokoro tokenizer, so that every character reaching the tokenizer is in its vocabulary.
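
One possible shape for that normalization step is a character translation table. This is a sketch under stated assumptions, not the actual patch: `PHONEME_MAP` and `normalize_phonemes` are hypothetical names. Only the ɡʲ and ɡ targets and the stripping of prosody markers come from this issue. The other palatalized targets are assumed by analogy with ɡʲ and should be checked against the Kokoro vocabulary before use.

```python
PHONEME_MAP = {
    # Stated in this issue:
    "\u1d83": "\u0261\u02b2",  # ᶃ -> ɡʲ
    "g": "\u0261",             # ASCII g -> ɡ (U+0261)
    "^": "",                   # prosody markers are stripped
    "_": "",
    "-": "",
    # ASSUMED by analogy with ɡʲ (base consonant + ʲ); not confirmed by the issue:
    "\u1d84": "k\u02b2",       # ᶄ -> kʲ
    "\u1d80": "b\u02b2",       # ᶀ -> bʲ
    "\u1d86": "m\u02b2",       # ᶆ -> mʲ
    "\u1d88": "p\u02b2",       # ᶈ -> pʲ
    "\u1d89": "\u027e\u02b2",  # ᶉ -> ɾʲ (rʲ may be the right target instead)
}

# str.maketrans accepts single-character keys mapped to replacement strings.
_TRANSLATION = str.maketrans(PHONEME_MAP)


def normalize_phonemes(phonemes: str) -> str:
    """Rewrite misaki JA output so it uses only the mapped replacement symbols."""
    return phonemes.translate(_TRANSLATION)


print(normalize_phonemes("\u1d84o\u02d0^^_"))  # misaki output for 今日
```

A single-pass `str.translate` avoids ordering problems that chained `str.replace` calls can introduce, for example rewriting ASCII g after ᶃ has already been expanded to ɡʲ.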

Labels: agent:claude, bug
