GMM classifier with multiple mixture components per accent #3

@sravanareddy

Description

This setup is a bit different from our previous experiment where we had a single GMM with each mixture component corresponding to an accent.

Instead, we'll build distinct GMM models for each accent. Rather than taking the average of the time frames for each speaker, keep all the frames.
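A minimal sketch of this setup using scikit-learn's `GaussianMixture` (the data here is synthetic, and names like `frames` and the choice of 8 components are illustrative assumptions, not part of the assignment):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical stand-in for real MFCC data: frames[accent] is an
# (n_frames, n_features) array pooling ALL time frames from every
# training speaker of that accent (no per-speaker averaging).
rng = np.random.default_rng(0)
frames = {
    "accent_a": rng.normal(0.0, 1.0, size=(500, 13)),
    "accent_b": rng.normal(2.0, 1.0, size=(500, 13)),
}

# One GMM per accent. Each of the n_components Gaussians loosely stands
# in for a phone-like cluster; EM (run inside .fit()) decides which
# frames belong to which component automatically.
models = {
    accent: GaussianMixture(n_components=8, covariance_type="diag",
                            random_state=0).fit(X)
    for accent, X in frames.items()
}
```

Diagonal covariances are a common choice for MFCC-style features since full covariances need far more data per component; either would fit this setup.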

Each GMM component is meant to (very approximately) represent a phone. Of course, we don't know what the phones are and which time slice is which phone -- the trick is to try to figure this out automatically with EM. Before Thursday, skim chapter 9 (9.1 and 9.2) in the Bishop PRML textbook to learn about clustering and EM.

We're not going to use the .predict() method of the GMM class at all, since that only tells us which component is the best fit. We don't care about this: in our new setup, the components are the phones, and the GMMs as a whole are the accents.

Instead, when it comes to testing, compute the log probability (remember Naive Bayes?) of each frame of the test sample under each of the GMM models. The winning model is the one with the greatest total log-likelihood summed across all frames.
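The scoring step could look roughly like the following sketch. It assumes `models` is a dict of fitted per-accent `GaussianMixture` objects as described above; `score_samples` returns the per-frame log-likelihood under the whole mixture, and summing over frames is the naive-Bayes-style independence assumption:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy per-accent models fitted on synthetic, well-separated data so the
# classification below is unambiguous (illustrative only).
rng = np.random.default_rng(1)
models = {
    "accent_a": GaussianMixture(n_components=4, random_state=0).fit(
        rng.normal(0.0, 1.0, size=(300, 13))),
    "accent_b": GaussianMixture(n_components=4, random_state=0).fit(
        rng.normal(3.0, 1.0, size=(300, 13))),
}

def classify(test_frames, models):
    """Sum per-frame log-likelihoods under each accent's GMM and
    return the accent with the highest total (treating frames as
    conditionally independent, as in Naive Bayes)."""
    scores = {accent: gmm.score_samples(test_frames).sum()
              for accent, gmm in models.items()}
    return max(scores, key=scores.get)

# Frames drawn near accent_b's distribution should pick accent_b.
test_frames = rng.normal(3.0, 1.0, size=(100, 13))
predicted = classify(test_frames, models)
print(predicted)  # -> accent_b
```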
