Multivariate Probabilistic Assessment of Speech Quality

This is the official implementation of "Multivariate Probabilistic Assessment of Speech Quality". It provides a model that takes a speech clip as input and outputs a multivariate Gaussian distribution over five speech quality dimensions: the overall speech quality (MOS), the intrusiveness of the noise (NOI), the coloration quality (COL), the discontinuity quality (DIS), and the loudness (LOUD). All notation follows the definitions given in the NISQA Corpus.
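To make the output concrete, here is a minimal sketch of how a five-dimensional Gaussian over these dimensions can be read. It assumes the predicted distribution is available as a mean vector and a covariance matrix; the function name `summarize_gaussian` and all numbers are hypothetical and do not come from the model, and the actual output format of example_inference.py may differ.

```python
import math

# Dimension order is an assumption made for this illustration.
DIMENSIONS = ("MOS", "NOI", "COL", "DIS", "LOUD")


def summarize_gaussian(mean, cov, z=1.96):
    """Return per-dimension (mean, std, lower, upper) and the correlation matrix.

    Each marginal of a multivariate Gaussian is itself Gaussian, with variance
    given by the matching diagonal entry of the covariance matrix.
    """
    n = len(mean)
    stds = [math.sqrt(cov[i][i]) for i in range(n)]
    summary = {
        name: (m, s, m - z * s, m + z * s)
        for name, m, s in zip(DIMENSIONS, mean, stds)
    }
    # Normalize covariances by the standard deviations to get correlations.
    corr = [[cov[i][j] / (stds[i] * stds[j]) for j in range(n)] for i in range(n)]
    return summary, corr


# Hypothetical values for illustration only; not produced by the model.
mean = [3.2, 3.5, 3.0, 3.8, 3.6]
cov = [
    [0.25, 0.08, 0.06, 0.03, 0.03],
    [0.08, 0.36, 0.03, 0.02, 0.02],
    [0.06, 0.03, 0.16, 0.02, 0.01],
    [0.03, 0.02, 0.02, 0.09, 0.01],
    [0.03, 0.02, 0.01, 0.01, 0.09],
]

summary, corr = summarize_gaussian(mean, cov)
for name, (m, s, lower, upper) in summary.items():
    print(f"{name}: mean={m:.2f} std={s:.2f} 95% interval=[{lower:.2f}, {upper:.2f}]")
```

The diagonal of the covariance gives the uncertainty of each dimension on its own, while the off-diagonal entries describe how the dimensions vary together, which a set of five independent predictions could not express.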

Authors: Fredrik Cumlin, Xinyu Liang
Emails: fcumlin@gmail.com, hopeliang990504@gmail.com

Inference

See example_inference.py for an example. The script can be run on a single wav file, e.g.:

python example_inference.py \
  --wav_path 'path/to/audio_to_be_processed.wav' \
  --model runs/multigauss/model.pt

Dependencies

See requirements.txt.

Dataset preparation

Download the NISQA dataset: NISQA Corpus
Run the preprocessing script preprocess/generate_ssl_features.sh to extract wav2vec 2.0 features from the NISQA datasets. Example:

./preprocess/generate_ssl_features.sh 'path/to/NISQA_Corpus'

Note that the default configuration saves features only from the 12th layer (index 11). A Gin configuration that specifies any other layer will therefore not work during training.

Training

The framework is configured with Gin: the model and dataset are specified in a Gin config. See examples in configs/*.gin.

Set dataset.NisqaFeatures.data_path to the folder of the NISQA Corpus, either in the Gin config or as the default in the dataset library. The path should look similar to path/to/datasets/NISQA_Corpus.
Launch training with the chosen Gin config.
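As a sketch of the first step, the binding could be written in a Gin config as follows. The parameter name and placeholder path are the ones given above; where the line belongs in a given file under configs/ is an assumption.

```
# Hypothetical excerpt from a Gin config; replace the placeholder path
# with the location of your local copy of the NISQA Corpus.
dataset.NisqaFeatures.data_path = 'path/to/datasets/NISQA_Corpus'
```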

Example launch:

python train.py --gin_path configs/probabilistic.gin --save_path runs/multigauss_probabilistic
