I built ViSQOL as described on GitHub. The results from that build and from the MATLAB implementation of ViSQOL are the same in audio mode, but different in speech mode. What is the problem, and which one is correct? The dataset used is the VoiceBank-DEMAND dataset.
The dataset is available at: https://datashare.ed.ac.uk/handle/10283/2791
I used the test set from it.
Results:
MATLAB: 3.59
GitHub: 2.09