Voice

Experiments in voice conversion and speaker dimensions.

Demonstrate Voice Conversion driven by Acoustically Specified Targets

This demonstration shows how a voice conversion system can be trained to be driven by acoustic parameters or by principal components of those acoustic parameters.

In this demonstration, the FreeVC system is trained to perform voice conversion of speech audio using speaker embeddings computed by the Deep Speaker system.

Deep Speaker was trained using a balanced set of 1000 speakers from the Globe corpus. FreeVC was trained using 5000 male and 5000 female speakers from the Globe corpus and Deep Speaker embeddings for those speakers.

Acoustic parameters were extracted for each of the 10,000 speakers and an MLP Regression model was used to predict the Deep Speaker embeddings from the acoustic parameters.
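The regression step can be sketched as below. This is a minimal illustration, not the code used in this repository: it assumes scikit-learn's `MLPRegressor`, and the array sizes, hidden layer size, input standardization and random placeholder data are all hypothetical stand-ins for the real acoustic parameters and Deep Speaker embeddings.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Placeholder data: the real inputs would be one row of acoustic parameters
# per speaker and one speaker embedding per speaker. Sizes are assumptions.
n_speakers, n_params, emb_dim = 200, 8, 16
acoustic = rng.normal(size=(n_speakers, n_params))
embeddings = np.tanh(acoustic @ rng.normal(size=(n_params, emb_dim)))


def fit_embedding_regressor(X, Y):
    """Fit an MLP that maps acoustic parameters X to speaker embeddings Y."""
    model = make_pipeline(
        StandardScaler(),  # put parameters on a common scale (an assumption)
        MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
    )
    model.fit(X, Y)
    return model


regressor = fit_embedding_regressor(acoustic, embeddings)

# Predict an embedding for one set of acoustic parameters.
predicted = regressor.predict(acoustic[:1])
print(predicted.shape)
```

A predicted embedding of this kind is what would be passed to the voice conversion model in place of an embedding computed from a real target speaker.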

Finally, principal component analysis (PCA) of the acoustic parameters was performed; the resulting components are used in the demonstration interface.
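As a rough sketch of how principal components can act as indirect controls, the example below fits a PCA with NumPy and maps component scores back to acoustic parameters. The function names, the standardization step and the random placeholder data are assumptions made for illustration; the notebooks in this repository may do this differently.

```python
import numpy as np


def fit_pca(X, n_components):
    """Fit PCA on standardized data; return mean, std and principal axes."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    Z = (X - mean) / std
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components]


def to_components(X, mean, std, components):
    """Project acoustic parameters onto the principal axes."""
    return ((X - mean) / std) @ components.T


def from_components(scores, mean, std, components):
    """Map component scores (e.g. slider values) back to acoustic parameters."""
    return (scores @ components) * std + mean


rng = np.random.default_rng(0)
params = rng.normal(size=(100, 6))  # placeholder: 100 speakers, 6 parameters

mean, std, components = fit_pca(params, n_components=6)
scores = to_components(params, mean, std, components)
restored = from_components(scores, mean, std, components)

# Moving only the first component changes all correlated parameters together.
slider = np.zeros((1, 6))
slider[0, 0] = 2.0
target_params = from_components(slider, mean, std, components)
print(target_params.shape)
```

With all components kept, the round trip is lossless; keeping fewer components gives a smaller set of controls at the cost of an approximate reconstruction.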

The diagram shows how the PCA components, acoustic parameters and speaker embeddings are used with FreeVC:

The user interface controls allow you to set the required acoustic parameters, either directly using sliders or indirectly using principal components:

Click on Go PCA or Go VQ to synthesize utterances using the PCA components or the raw parameters, respectively. A few example audio files are provided and can be selected from the drop-down list.

Run the Globe PCA Demonstration notebook (Globe_PCA_Demonstration.ipynb) in Colab using a GPU runtime.

