krokode/Audio_SR

About

A PyTorch reproduction of the Temporal FiLM (Birnbaum, S., et al., 2019 [NeurIPS]) super-resolution method.

Environment

Setup

Local

Clone the project from the desired parent directory on the local device:

git clone https://github.com/krokode/Audio_SR.git
cd Audio_SR

Set up the software environment:

Linux

. setup.sh
. activate.sh

Windows

.\setup.ps1
.\activate.ps1

VSC Supercomputer

Clone the project onto the data partition:

cd $VSC_DATA
git clone https://github.com/krokode/Audio_SR.git
cd Audio_SR

Set up the software environment:

. vsc_setup_wice.sh
. vsc_activate_wise.sh

Datasets

Download

Local

cd data/vctk
python arc_load_unpack.py

VSC Supercomputer

Upload the data archive with WinSCP to the $VSC_DATA/Audio_SR/data/vctk directory.

Unpack the data archive:

cd $VSC_DATA/Audio_SR/data/vctk
tar -xvf VCTK-Corpus.tar.gz

Preprocess

Prepare the raw data for the planned experiment:

python prepare_dataset.py --sampling_rate 16000 --scale 4 --window_size 8192 --window_stride 4096 --batch_size 128 --interpolate --low_pass --out_dir 'datasets'
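As a rough illustration of what the flags above describe, the sketch below builds low-resolution/high-resolution training pairs from a single signal. It is not the code in prepare_dataset.py: the function names (`lowpass_fft`, `make_low_res`, `extract_patches`) are hypothetical, and the FFT brick-wall filter and linear interpolation are simple stand-ins chosen to keep the example NumPy-only. The script itself may use different filters and interpolation.

```python
import numpy as np

def lowpass_fft(x, keep_fraction):
    """Brick-wall low-pass: zero every frequency bin above keep_fraction of Nyquist."""
    spectrum = np.fft.rfft(x)
    cutoff = int(len(spectrum) * keep_fraction)
    spectrum[cutoff:] = 0
    return np.fft.irfft(spectrum, n=len(x))

def make_low_res(x, scale=4, low_pass=True, interpolate=True):
    """Degrade a high-resolution signal (assumed meaning of --scale, --low_pass, --interpolate)."""
    if low_pass:
        x = lowpass_fft(x, 1.0 / scale)   # remove content that would alias
    low = x[::scale]                      # keep every scale-th sample
    if interpolate:
        # Linearly interpolate back onto the original sample grid,
        # so input and target patches have the same length.
        t_low = np.arange(len(low)) * scale
        t_high = np.arange(len(x))
        low = np.interp(t_high, t_low, low)
    return low

def extract_patches(x, window_size=8192, window_stride=4096):
    """Slide a fixed-size window over x (assumed meaning of --window_size, --window_stride)."""
    starts = range(0, len(x) - window_size + 1, window_stride)
    return np.stack([x[s:s + window_size] for s in starts])

# Two seconds of a 440 Hz tone at the 16 kHz sampling rate used above.
sr = 16000
t = np.arange(2 * sr) / sr
hr = np.sin(2 * np.pi * 440 * t)
lr = make_low_res(hr, scale=4)

X = extract_patches(lr)   # model inputs
Y = extract_patches(hr)   # model targets
print(X.shape, Y.shape)   # (6, 8192) (6, 8192)
```

With a stride of half the window size, consecutive patches overlap by 50%, which roughly doubles the number of training examples taken from each recording.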

Train

Train the model on the h5 files for 150 epochs (to change the number of epochs, update NUM_EPOCHS in run.py):

cd ../../src
python run.py

Visualize

The visualization script takes a high-resolution WAV file, creates a version at 4 times lower resolution, passes it through the model, and writes the predicted WAV file.

Pass any high-resolution WAV file, for example p270_002.wav:

python visualize.py --model best_model_V1_6.pth --wav p270_002.wav --out output

Audio results and spectrograms will be available in /visualizations/output/.
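To help read the spectrograms, the sketch below shows what to expect from a 4x lower-resolution input: at a 16 kHz sampling rate, everything above 2 kHz is missing, and the model's job is to restore it. This is an assumption-laden illustration, not the code in visualize.py. `spectrogram_db` and its `n_fft` and `hop` parameters are hypothetical, and the low-resolution signal is simulated with a crude FFT low-pass.

```python
import numpy as np

def spectrogram_db(x, n_fft=2048, hop=512):
    """Log-magnitude STFT of a 1-D signal, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return 20.0 * np.log10(magnitude + 1e-10)  # small offset avoids log(0)

sr, scale = 16000, 4
t = np.arange(sr) / sr
# One second with a 1 kHz tone (kept after downsampling) and a 6 kHz tone (lost).
high_res = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 6000 * t)

# Stand-in for the low-resolution input: drop everything above sr / (2 * scale) = 2 kHz.
spectrum = np.fft.rfft(high_res)
spectrum[len(spectrum) // scale:] = 0
low_res = np.fft.irfft(spectrum, n=len(high_res))

S_hr = spectrogram_db(high_res)
S_lr = spectrogram_db(low_res)
print(S_hr.shape)  # (1025, 28)

# Compare the strongest component in the upper band (bins well above 2 kHz).
print(S_hr[300:].max(), S_lr[300:].max())
```

In the plots, the upper band of the low-resolution input should look empty, while a good prediction shows energy there that resembles the high-resolution original.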
