The links to the acoustic and visual features are here:

audio_embedding_6373.npy: the embedding table of 6373-dimensional acoustic features, one per utterance, extracted with openSMILE
video_embedding_4096.npy: the embedding table of 4096-dimensional visual features, one per utterance, extracted with a 3D-CNN
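
Once downloaded, the two files can be sanity-checked with a minimal sketch like the one below. It assumes each file is a plain 2-D NumPy array with one row per utterance and the feature dimension as the second axis; the README does not confirm this layout. The helper `load_embedding_table` and the local file paths are hypothetical and used only for illustration.

```python
from pathlib import Path

import numpy as np


def load_embedding_table(path, expected_dim):
    """Load an .npy embedding table and check its feature dimension.

    Assumption (not confirmed by the README): the table is a 2-D array
    of shape (num_utterances, expected_dim).
    """
    table = np.load(Path(path))
    if table.ndim != 2 or table.shape[1] != expected_dim:
        raise ValueError(
            f"{path}: expected shape (N, {expected_dim}), got {table.shape}"
        )
    return table


if __name__ == "__main__":
    # Hypothetical local paths: adjust to wherever the downloaded files live.
    files = [
        ("audio_embedding_6373.npy", 6373),
        ("video_embedding_4096.npy", 4096),
    ]
    for name, dim in files:
        if Path(name).exists():
            table = load_embedding_table(name, dim)
            print(name, table.shape, table.dtype)
```

If the row counts of the two tables differ, or a shape check fails, the assumed layout is probably wrong and the files should be inspected before use.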