Using Audacity to play VERSE audio files

Audacity is an open source audio editor that can be used to inspect and play the audio files generated by VERSE. A VERSE dataset contains multiple folders, each dedicated to a single audio scene (a single use case). Each folder includes three files: the audio rendering in Matroska (.mkv) format, the YAML descriptor of the audio file, and the scene definition (also YAML).

Audio example

To start from a real example, make sure you have rendered the "simple_example" dataset by running:

./render_dataset.py -i ../resources/ds_recipes/simple_example/info/simple_example.yaml

The results will be available under:

[VERSE]/datasets/simple_example/train

Looking inside one of the rendered scenes we have:

cd 001301_dynamic_multivoice_0_1_1/
tree
.
├── 001301_dynamic_multivoice_0_1_1.yaml
├── dynamic_multivoice.mkv
└── dynamic_multivoice_mkv.yaml

The file dynamic_multivoice.mkv is the container for all the audio artifacts of this audio scene.

VERSE audio descriptor

VERSE's audio files have a companion descriptor that uses YAML syntax to describe the content of the file. In this case it is "dynamic_multivoice_mkv.yaml", which contains the following:

syntax:
  name: verse_audio_mkv

description: none
name: verse rendered audio scene

file: [VERSE]/datasets/simple_example/train/001301_dynamic_multivoice_0_1_1/dynamic_multivoice.mkv

sources_count: 3
sources:
  0:
    channels: 1
    file: 000056_gentlemenpreferblondes.wav
    track_id: 0
  1:
    channels: 1
    file: 000027_blackbuccaneer.wav
    track_id: 1
  2:
    channels: 1
    file: 000071_gianburrasca.wav
    track_id: 2

receivers_count: 8
receivers:
  0:
    channels: 2
    file: dynamic_multivoice_binaural_000.wav
    track_id: 3
  1:
    channels: 2
    file: dynamic_multivoice_array_six_front_001.wav
    track_id: 4
  2:
    channels: 2
    file: dynamic_multivoice_array_six_middle_002.wav
    track_id: 5
  3:
    channels: 2
    file: dynamic_multivoice_array_six_rear_003.wav
    track_id: 6

where [VERSE] is your local copy of the VERSE repo.

The descriptor shows that the rendered audio contains 3 "sources", meaning human voices. These are the original (mono) voices that were used to render the audio scene. There are also 8 receivers: one listener with eight receivers, arranged as 4 pairs of microphones, each pair stored as one stereo track.
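The counts in the descriptor can be cross-checked against the per-entry channel numbers. The following is a minimal sketch, assuming (as this example suggests, not as a documented rule) that receivers_count counts individual microphones and therefore equals the sum of the channels fields. The dict only mirrors the relevant descriptor fields, and total_channels is a hypothetical helper, not part of VERSE.

```python
# Relevant fields copied from dynamic_multivoice_mkv.yaml as a plain dict.
descriptor = {
    "sources_count": 3,
    "sources": {0: {"channels": 1}, 1: {"channels": 1}, 2: {"channels": 1}},
    "receivers_count": 8,
    "receivers": {i: {"channels": 2} for i in range(4)},
}


def total_channels(entries):
    """Sum the 'channels' field over all entries of a descriptor section."""
    return sum(entry["channels"] for entry in entries.values())


source_channels = total_channels(descriptor["sources"])      # 3 mono voices
receiver_channels = total_channels(descriptor["receivers"])  # 4 stereo pairs

# Assumption: each *_count matches the channel total of its section.
print(source_channels == descriptor["sources_count"])
print(receiver_channels == descriptor["receivers_count"])
```

With the values above, three mono sources give 3 channels and four stereo receiver tracks give 8 channels, which matches the two counts in the descriptor.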

The details of how the receivers are placed and how the sources move in space are described by the scene definition file, in this case the file: "001301_dynamic_multivoice_0_1_1.yaml".

For each receiver, the descriptor indicates the number of channels (2) and the track number, which will be useful when viewing the tracks in Audacity.

Note that this track number is the same one you get from the command line tool "play_scene.py" with the "-l" option (listing tracks).
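To see the track numbering at a glance, the descriptor can be flattened into a list ordered by track_id. This is a rough sketch rather than a VERSE tool: the dict below is copied by hand from the descriptor shown earlier, list_tracks is a hypothetical helper, and in practice the dict would probably be obtained by loading the .yaml file with a YAML parser such as PyYAML (an assumption, since the howto does not say which parser VERSE uses).

```python
# Descriptor entries copied from dynamic_multivoice_mkv.yaml.
# Assumption: a YAML parser would yield an equivalent nested dict.
descriptor = {
    "sources": {
        0: {"channels": 1, "file": "000056_gentlemenpreferblondes.wav", "track_id": 0},
        1: {"channels": 1, "file": "000027_blackbuccaneer.wav", "track_id": 1},
        2: {"channels": 1, "file": "000071_gianburrasca.wav", "track_id": 2},
    },
    "receivers": {
        0: {"channels": 2, "file": "dynamic_multivoice_binaural_000.wav", "track_id": 3},
        1: {"channels": 2, "file": "dynamic_multivoice_array_six_front_001.wav", "track_id": 4},
        2: {"channels": 2, "file": "dynamic_multivoice_array_six_middle_002.wav", "track_id": 5},
        3: {"channels": 2, "file": "dynamic_multivoice_array_six_rear_003.wav", "track_id": 6},
    },
}


def list_tracks(desc):
    """Return (track_id, kind, channels, file) tuples sorted by track_id."""
    rows = []
    for kind in ("sources", "receivers"):
        for entry in desc.get(kind, {}).values():
            rows.append((entry["track_id"], kind, entry["channels"], entry["file"]))
    return sorted(rows)


for track_id, kind, channels, name in list_tracks(descriptor):
    layout = "mono" if channels == 1 else "stereo"
    print(f"{track_id}: {kind[:-1]} ({layout}) {name}")
```

The printed order (tracks 0 to 2 for the mono sources, 3 to 6 for the stereo receiver pairs) is the order in which the tracks are expected to appear once the .mkv file is imported.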

Loading in Audacity

The syntax for "scene" is explained in scene_syntax_howto.

For this scene the listener "head" has one pair of receivers placed in the ears (binaural) and a six-mic array placed around the head, hence the front/middle/rear indication for the head mic array.

Open Audacity and load the .mkv file. You will be presented with a list of audio tracks to import; select all of them (use SHIFT and MOUSE-CLICK).

Next you will see the full list of audio tracks, in the same track numbering order as in the .yaml descriptor.

The first three tracks are the audio sources used in this scene (mono audio files of different lengths). The remaining tracks are stereo (mic pairs) and correspond to binaural or array-six[front|middle|rear].

You can use MUTE/SOLO buttons and all the features of Audacity to compare, filter and play the audio tracks.
