Add ReorderPhonemeInventoryJob #641

Merged
curufinwe merged 4 commits into main from match-lexicon-phoneme-inventories-job
Feb 9, 2026

@Icemole
Collaborator

@Icemole Icemole commented Jan 21, 2026

Asserts that the phoneme inventory of ref_lex contains the same units as that of lex_to_modify, and reorders the phoneme inventory of lex_to_modify so that it matches the ordering in ref_lex exactly.

Useful when a pipeline depends on the order of the phonetic units inside the lexicon.

Co-authored-by: michelwi <michelwi@users.noreply.github.com>
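
The check-then-reorder behaviour described above can be sketched in plain Python. This is a minimal illustration and not the code merged in this PR: the function name `reorder_phoneme_inventory` and the representation of an inventory as a dict from phoneme symbol to its attributes are assumptions made for the example.

```python
def reorder_phoneme_inventory(ref_inventory, inventory_to_modify):
    """Return inventory_to_modify with its phonemes in ref_inventory's order.

    Both arguments are dicts mapping a phoneme symbol to its attributes
    (a hypothetical representation chosen for this sketch). Python dicts
    preserve insertion order, so the key order stands in for lexicon order.
    """
    ref_symbols = list(ref_inventory)
    # Symbols present in exactly one of the two inventories.
    differing = set(ref_symbols) ^ set(inventory_to_modify)
    if differing:
        raise ValueError(f"Phoneme inventories differ: {sorted(differing)}")
    # Keep the attributes of the lexicon being modified; take only the
    # ordering from the reference lexicon.
    return {symbol: inventory_to_modify[symbol] for symbol in ref_symbols}


ref = {"[SILENCE]": "none", "a": "context", "b": "context"}
to_modify = {"b": "context", "[SILENCE]": "none", "a": "context"}
reordered = reorder_phoneme_inventory(ref, to_modify)
print(list(reordered))
```

Raising on any mismatch, rather than silently dropping or appending phonemes, mirrors the "asserts" part of the description: reordering is only meaningful when both inventories contain exactly the same units.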
@Icemole Icemole requested a review from michelwi January 23, 2026 10:29
@Icemole Icemole requested a review from curufinwe January 26, 2026 12:25
@Icemole Icemole requested a review from sarahberanek February 3, 2026 07:37
@sarahberanek
Collaborator

What a dangerous thing to do :-) I mean re-ordering the phonemes. But PR looks good to me. I am sure you tested it already.

@sarahberanek sarahberanek reopened this Feb 9, 2026
@Icemole
Collaborator Author

Icemole commented Feb 9, 2026

> What a dangerous thing to do :-) I mean re-ordering the phonemes.

Indeed... However, this job has a purpose. If we don't use a state-tying file, the NN softmax output labels are derived from the phoneme inventory. As a consequence, the phonemes must appear in the same order in the training and recognition lexica, so that recognition outputs the same labels that were learned in training. I'm certain that I have a training/recognition lexicon pair whose phoneme ordering is inconsistent. I hope we don't have to use this job much, but in case it's useful to anyone or for some other use case, I decided to push it here.

> I am sure you tested it already.

Yes, I went through the phoneme set and it was ordered as expected!
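
To illustrate the label-ordering concern raised in this comment, here is a simplified sketch. It assumes one output label per phoneme, assigned by position in the inventory; real systems may derive labels from allophones and HMM states, so this shows only the principle. The phoneme lists and the helper `label_map` are made up for the example.

```python
def label_map(phonemes):
    """Assign each phoneme the index of its position in the inventory."""
    return {phoneme: index for index, phoneme in enumerate(phonemes)}


# Same phoneme set, different order (hypothetical inventories).
train_phonemes = ["[SILENCE]", "a", "b", "c"]
recog_phonemes = ["[SILENCE]", "b", "a", "c"]

train_labels = label_map(train_phonemes)
recog_labels = label_map(recog_phonemes)

# Phonemes whose softmax index differs between training and recognition.
mismatched = sorted(
    p for p in train_phonemes if train_labels[p] != recog_labels[p]
)
print(mismatched)  # ['a', 'b']
```

In this toy case the network's output at index 1 was trained as "a" but would be read as "b" at recognition time, which is the kind of silent mismatch the job is meant to rule out.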

@Icemole Icemole changed the title Add job to match two lexicon phoneme inventories Add ReorderPhonemeInventoryJob Feb 9, 2026
@curufinwe curufinwe merged commit 97bfb86 into main Feb 9, 2026
10 checks passed
@curufinwe curufinwe deleted the match-lexicon-phoneme-inventories-job branch February 9, 2026 13:10
4 participants