A Persian grapheme-to-phoneme (G2P) model designed for homograph disambiguation, fine-tuned using the HomoRich dataset to improve pronunciation accuracy.
Updated Oct 30, 2025 - Jupyter Notebook
HomoRich: The first large-scale Persian homograph dataset for G2P conversion, featuring 528K annotated sentences with balanced pronunciation variants and dual phoneme representations.