Problem
When playing Japanese words via Kokoro TTS (e.g. `kotoba play 今日`), the pronunciation is completely wrong.
Root cause: the misaki JA G2P (pyopenjtalk) emits Unicode palatalized consonant characters (ᶄ, ᶃ, ᶀ, ᶆ, ᶈ, ᶉ) and ASCII g, none of which are in the Kokoro tokenizer vocabulary. `tokenizer.tokenize()` silently drops these characters, so critical phonemes are lost.
Example: 今日 → phonemes ᶄoː^^_ → tokens [57, 158] (only "oː" survives; the "ky" is lost)
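The failure mode in the example can be reproduced with a toy stand-in for the tokenizer. This is only a sketch: `TOY_VOCAB`, `toy_tokenize` and `dropped_chars` are hypothetical names, the real vocabulary is far larger, and assigning 57 to "o" and 158 to "ː" is an assumption read off the token list above.

```python
# Toy vocabulary. The two IDs are taken from the example above and are assumed
# to belong to "o" and the length mark; this is NOT the real Kokoro vocabulary.
TOY_VOCAB = {
    "o": 57,
    "\u02d0": 158,  # ː (length mark)
}


def toy_tokenize(phonemes):
    """Look up each character, silently skipping any that are not in the vocabulary."""
    return [TOY_VOCAB[ch] for ch in phonemes if ch in TOY_VOCAB]


def dropped_chars(phonemes):
    """List the characters a silent-drop tokenizer would discard, in order."""
    return [ch for ch in phonemes if ch not in TOY_VOCAB]


phonemes = "\u1d84o\u02d0^^_"  # ᶄoː^^_ (misaki output for 今日)
print(toy_tokenize(phonemes))   # [57, 158]
print(dropped_chars(phonemes))  # ['ᶄ', '^', '^', '_']
```

A check like `dropped_chars` against the real vocabulary may be a cheap way to catch other unmapped characters before they reach synthesis.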
Affected characters
| misaki output | meaning | should map to |
|---|---|---|
| ᶄ (U+1D84) | palatalized k (きょ) | kʲ |
| ᶃ (U+1D83) | palatalized g (ぎょ) | ɡʲ |
| ᶀ (U+1D80) | palatalized b (びょ) | bʲ |
| ᶆ (U+1D86) | palatalized m (みゃ) | mʲ |
| ᶈ (U+1D88) | palatalized p (ぴょ) | pʲ |
| ᶉ (U+1D89) | palatalized r (りょ) | rʲ |
| g (U+0067) | ASCII g | ɡ (U+0261) |
| ^, _, - | prosody markers | strip |
Fix
Add a phoneme normalization step in `KOKORO_TOKENIZER_SCRIPT`, between the misaki G2P output and the Kokoro tokenizer.
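A minimal sketch of what such a normalization step could look like, using the mapping listed under "Affected characters". The function name `normalize_phonemes` and the module-level constants are hypothetical, and how the step is wired into `KOKORO_TOKENIZER_SCRIPT` is not shown. It also assumes that the replacement characters (ʲ, ɡ) are present in the Kokoro vocabulary, as the mapping implies.

```python
# Mapping from the "Affected characters" list. Target strings are assumed to be
# tokenizable by Kokoro; that is not verified here.
PALATALIZED = {
    "\u1d84": "k\u02b2",       # ᶄ -> kʲ
    "\u1d83": "\u0261\u02b2",  # ᶃ -> ɡʲ
    "\u1d80": "b\u02b2",       # ᶀ -> bʲ
    "\u1d86": "m\u02b2",       # ᶆ -> mʲ
    "\u1d88": "p\u02b2",       # ᶈ -> pʲ
    "\u1d89": "r\u02b2",       # ᶉ -> rʲ
}
ASCII_FIXES = {"g": "\u0261"}  # ASCII g (U+0067) -> IPA ɡ (U+0261)
PROSODY_MARKERS = "^_-"        # stripped entirely

# str.maketrans accepts single-character keys mapped to strings or to None (delete).
_TRANSLATION = str.maketrans(
    {**PALATALIZED, **ASCII_FIXES, **{ch: None for ch in PROSODY_MARKERS}}
)


def normalize_phonemes(phonemes):
    """Rewrite misaki JA output so every character survives tokenization."""
    return phonemes.translate(_TRANSLATION)


print(normalize_phonemes("\u1d84o\u02d0^^_"))  # kʲoː
```

A single `str.translate` pass keeps the step cheap and order-independent: each input character is mapped exactly once, so the ɡ produced for ᶃ is never re-processed by the ASCII g rule.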