# [EMNLP 2025 Findings] O_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice Conversion
## Installation

```bash
git clone https://github.com/huutuongtu/OOVC
cd OOVC
conda create -n oovc python==3.10.12
conda activate oovc
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
```

## Pretrained Models

Download WavLM-Large and place it under the `wavlm/` directory.
Download the pretrained checkpoint and place it under your logs folder (e.g., `logs/oovc_w_f0/`).
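To sanity-check the WavLM download, you can try loading it and extracting content features. This is a minimal sketch: it assumes the official WavLM implementation (from microsoft/unilm) is importable from `wavlm/`, as in FreeVC-style repos, and the checkpoint filename `WavLM-Large.pt` is an assumption, so adjust both to your actual layout.

```python
import torch

# Import path is an assumption; point it at wherever WavLM.py
# lives inside the wavlm/ directory of this repo.
from wavlm import WavLM, WavLMConfig

# Hypothetical filename; use the checkpoint you actually downloaded.
ckpt = torch.load("wavlm/WavLM-Large.pt", map_location="cpu")
cfg = WavLMConfig(ckpt["cfg"])
model = WavLM(cfg)
model.load_state_dict(ckpt["model"])
model.eval()

# Extract features from one second of dummy 16 kHz audio.
wav = torch.randn(1, 16000)
with torch.no_grad():
    features = model.extract_features(wav)[0]
print(features.shape)  # (1, n_frames, 1024) for WavLM-Large
```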
## Inference

```bash
python convert.py \
  --source sample/8230-279154-0028.flac \
  --target sample/4970-29095-0008.flac \
  --checkpoint logs/oovc_w_f0/G_1470000.pth \
  --output sample/test_converted.wav
```
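To convert many utterances to the same target speaker, a thin wrapper can loop over source files and invoke `convert.py` per file. Only the CLI flags shown above come from the repo; the directory names here are hypothetical.

```python
import subprocess
from pathlib import Path

# Hypothetical paths; adjust to your data layout.
SOURCE_DIR = Path("sample/sources")
TARGET = "sample/4970-29095-0008.flac"
CHECKPOINT = "logs/oovc_w_f0/G_1470000.pth"
OUT_DIR = Path("sample/converted")
OUT_DIR.mkdir(parents=True, exist_ok=True)

for src in sorted(SOURCE_DIR.glob("*.flac")):
    out = OUT_DIR / f"{src.stem}_converted.wav"
    # Same flags as the single-file command above.
    subprocess.run(
        [
            "python", "convert.py",
            "--source", str(src),
            "--target", TARGET,
            "--checkpoint", CHECKPOINT,
            "--output", str(out),
        ],
        check=True,
    )
```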
## Project Structure

```
OOVC/
├── ...
├── convert.py          # Main inference script
├── models_f0.py        # Generator model definition
├── mel_processing.py   # Mel spectrogram utilities
├── utils.py            # Helper functions
├── wavlm/              # WavLM model files
├── speaker_encoder/    # Speaker encoder files
├── configs/
│   └── freevc_f0.json  # Configuration file
├── sample/
│   ├── source_audio.flac
│   ├── target_audio.flac
│   └── test_converted.wav
└── logs/
    └── oovc_w_f0/
        └── G_1470000.pth
```
## Citation

If you use this code, please cite our paper:

```bibtex
@inproceedings{tu-2025_oovc,
  author    = {Huu Tuong Tu and Huan Vu and Cuong Tien Nguyen and Dien Hy Ngo and Nguyen Thi Thu Trang},
  title     = {O\_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice Conversion},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2025},
  year      = {2025},
}
```