Hi, could you please share the procedure for preparing the VOiCES dataset to evaluate the VAD models? Is there any existing documentation that explains how to generate the .wav files and the corresponding labels (0s and 1s)? At a high level, the folder structure of the VOiCES dataset looked like this.
