About • Installation • How To Use • Credits • License

## About

This repository contains an implementation of HiFi-GAN with PyTorch.
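HiFi-GAN is a neural vocoder: its generator upsamples a mel-spectrogram to a raw waveform through a stack of transposed convolutions. As a rough illustration of the arithmetic involved (the upsample rates below are the V1 configuration from the original HiFi-GAN paper; this repository's config may differ), the product of the upsample rates must equal the STFT hop length, so each mel frame expands into that many audio samples:

```python
# Sketch: each mel frame is upsampled to hop_length audio samples.
# Rates below are the HiFi-GAN V1 paper config, not necessarily this repo's.
upsample_rates = [8, 8, 2, 2]

hop_length = 1
for r in upsample_rates:
    hop_length *= r

print(hop_length)  # 256 samples of audio per mel frame

# e.g. 100 mel frames -> 100 * 256 = 25600 samples (~1.16 s at 22050 Hz)
n_frames = 100
print(n_frames * hop_length)  # 25600
```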
## Installation

Follow these steps to install the project:

0. (Optional) Create and activate a new environment using `conda` or `venv` (+`pyenv`).

   a. `conda` version:

   ```bash
   # create env
   conda create -n project_env python=PYTHON_VERSION

   # activate env
   conda activate project_env
   ```

   b. `venv` (+`pyenv`) version:

   ```bash
   # create env
   ~/.pyenv/versions/PYTHON_VERSION/bin/python3 -m venv project_env

   # alternatively, using default python version
   python3 -m venv project_env

   # activate env
   source project_env/bin/activate
   ```
1. Install all required packages:

   ```bash
   pip install -r requirements.txt
   ```
2. Install `pre-commit`:

   ```bash
   pre-commit install
   ```
## How To Use

To train a model, run the following command:

```bash
python3 train.py -cn=hifigan
```

To download the pretrained model, use:

```bash
gdown https://drive.google.com/uc?id=1n9DVZznWy49nKiSljAbqdQvNiZ_VcPOa
```

To synthesize audio from audio, your dataset should follow this structure:
```
NameOfTheDirectoryWithUtterances
└── transcriptions
    ├── UtteranceID1.wav
    ├── UtteranceID2.wav
    .
    .
    .
    └── UtteranceIDn.wav
```

To get predictions, run:

```bash
python3 synthesize.py -cn=from_audio '+datasets.test.audio_dir=<PATH-TO-DIR>' 'inferencer.from_pretrained=<PATH-TO-PRETRAINED-MODEL>'
```

To synthesize audio from text, your dataset should follow this structure:
```
NameOfTheDirectoryWithUtterances
└── transcriptions
    ├── UtteranceID1.txt
    ├── UtteranceID2.txt
    .
    .
    .
    └── UtteranceIDn.txt
```

To get predictions, run:

```bash
python3 synthesize.py -cn=from_text '+datasets.test.data_dir=<PATH-TO-DIR>' 'inferencer.from_pretrained=<PATH-TO-PRETRAINED-MODEL>'
```

If you want to pass text from the CLI, run:
```bash
python3 synthesize.py -cn=from_cli '+datasets.test.index=[{text: "<YOUR-TEXT>", path: "text.txt", audio_len: 0}]' 'inferencer.from_pretrained=<PATH-TO-PRETRAINED-MODEL>'
```
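The `from_text` layout described above can be prepared with a short script. This is a sketch, not part of the repository: the `my_utterances` directory name, the `make_text_dataset` helper, and the sample texts are all placeholders for illustration.

```python
# Sketch: build the directory layout expected by `synthesize.py -cn=from_text`.
# "my_utterances", make_text_dataset, and the sample texts are placeholders,
# not part of this repo.
from pathlib import Path


def make_text_dataset(root: str, texts: dict) -> Path:
    """Write one .txt file per utterance under <root>/transcriptions."""
    transcriptions = Path(root) / "transcriptions"
    transcriptions.mkdir(parents=True, exist_ok=True)
    for utt_id, text in texts.items():
        (transcriptions / f"{utt_id}.txt").write_text(text, encoding="utf-8")
    return transcriptions


data_dir = make_text_dataset(
    "my_utterances",
    {"UtteranceID1": "Hello world.", "UtteranceID2": "HiFi-GAN is a vocoder."},
)
print(sorted(p.name for p in data_dir.glob("*.txt")))
```

The resulting `my_utterances` folder can then be passed as `<PATH-TO-DIR>` in the command above.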
## Credits

Wandb report is available here.

This repository is based on a PyTorch Project Template.