Ju-usc/voice_chimera

Voice Chimera

Voice Chimera is a research project on synthesizing new voice identities by blending characteristics from multiple speakers. This repository contains the experimental data, the research paper, and the Jupyter notebook for our voice blending experiments with four methods: Linear Interpolation, Spherical Linear Interpolation, a Genetic Algorithm-inspired approach, and Principal Component Analysis.

Project Structure

This repository includes the following files:

  • Voice Chimera Experiment.xlsx - Contains the experimental data and results from our voice blending methods.
  • Voice Chimera: Redefining Voice Identity Through Diverse Synthesis Techniques.pdf - A detailed research paper explaining the methodologies, results, and implications of our experiments.
  • Voice_Chimera_Research.ipynb - A Jupyter notebook that includes the code for our experiments, data analysis, and visualization of results.

Introduction

The Voice Chimera project explores the creation of unique vocal identities by blending characteristics from multiple speakers. Using the VITS model (Variational Inference with adversarial learning for end-to-end Text-to-Speech), this research investigates four voice blending techniques for generating natural-sounding, diverse synthetic voices:

  • Linear Interpolation (LERP)
  • Spherical Linear Interpolation (SLERP)
  • Genetic Algorithm-Inspired Method
  • Principal Component Analysis (PCA)
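
To make the first two techniques concrete, here is a minimal sketch of LERP and SLERP applied to two speaker embedding vectors. It assumes each speaker is represented by a fixed-length NumPy vector. The function names `lerp` and `slerp` and the toy vectors are illustrative and are not taken from the notebook, which may implement these steps differently.

```python
import numpy as np


def lerp(a, b, t):
    """Linear interpolation: move along the straight line from a to b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (1.0 - t) * a + t * b


def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation: move along the arc between a and b.

    The angle is measured between the normalized vectors; the weights are
    then applied to the original vectors.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_unit = a / np.linalg.norm(a)
    b_unit = b / np.linalg.norm(b)
    # Clip to guard against rounding errors just outside [-1, 1].
    cos_omega = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)
    omega = np.arccos(cos_omega)
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Vectors are (anti)parallel, so the arc is ill-defined: use LERP.
        return lerp(a, b, t)
    w_a = np.sin((1.0 - t) * omega) / sin_omega
    w_b = np.sin(t * omega) / sin_omega
    return w_a * a + w_b * b


# Hypothetical 2-D "embeddings" for two speakers.
speaker_a = np.array([1.0, 0.0])
speaker_b = np.array([0.0, 1.0])

print(lerp(speaker_a, speaker_b, 0.5))
print(slerp(speaker_a, speaker_b, 0.5))
```

For unit-length inputs, the SLERP midpoint also has unit length, whereas the LERP midpoint is shorter than either input. That difference is the usual reason for preferring SLERP when the magnitude of an embedding carries meaning.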

The objective is to enhance the naturalness and diversity of synthetic voices, which could have significant applications in entertainment, assistive technologies, and personalized AI interfaces.
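
The Genetic Algorithm-inspired and PCA methods listed above can be sketched in a similar way. The example below is a rough illustration under stated assumptions, not the notebook's implementation. It assumes speakers are fixed-length NumPy vectors. The names `crossover_blend` and `pca_blend`, the uniform crossover mask, the Gaussian mutation, and the weighted average in PCA space are all hypothetical choices. PCA is computed here with a plain NumPy SVD to keep the sketch dependency-light.

```python
import numpy as np


def crossover_blend(parent_a, parent_b, rng, mutation_scale=0.01):
    """Genetic-style blend: uniform crossover followed by Gaussian mutation."""
    parent_a = np.asarray(parent_a, dtype=float)
    parent_b = np.asarray(parent_b, dtype=float)
    # Each dimension is inherited from parent A or parent B with equal chance.
    mask = rng.random(parent_a.shape) < 0.5
    child = np.where(mask, parent_a, parent_b)
    # Small random perturbation plays the role of mutation.
    return child + rng.normal(0.0, mutation_scale, size=child.shape)


def pca_blend(embeddings, weights, n_components=2):
    """Blend speakers by averaging their coordinates in PCA space."""
    X = np.asarray(embeddings, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to 1
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:n_components]
    coords = (X - mean) @ components.T  # project each speaker
    blended = w @ coords                # weighted average in PCA space
    return mean + blended @ components  # map back to embedding space


rng = np.random.default_rng(0)
speakers = rng.normal(size=(4, 8))  # 4 hypothetical speakers, 8-D embeddings

child = crossover_blend(speakers[0], speakers[1], rng)
mix = pca_blend(speakers, weights=[0.4, 0.3, 0.2, 0.1])
print(child.shape, mix.shape)
```

Keeping only a few principal components restricts the blend to the directions along which the speakers differ most, while the crossover approach mixes individual dimensions directly.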

Usage

To run the Jupyter notebook:

  1. Make sure Python and Jupyter are installed.
  2. Install the required libraries: pip install numpy matplotlib scikit-learn torch
  3. Launch Jupyter Notebook in the directory containing Voice_Chimera_Research.ipynb and run the cells in order from top to bottom.
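
Before opening the notebook, it can help to confirm that the libraries from step 2 are importable. The snippet below is an optional sketch and is not part of the repository. The helper name `missing_packages` is hypothetical, and the mapping pairs each pip package name with the module name it installs (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# pip package name -> importable module name
REQUIRED = {
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "torch": "torch",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```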

Contact

For any inquiries, please reach out to Ju Lee, Viterbi School of Engineering, University of Southern California.

Acknowledgments

This research builds on insights and frameworks developed by the researchers cited in the accompanying paper. Special thanks to the developers of the VITS model and to all participants in our voice synthesis studies.
