w1018979952/P2VANet

P2VANet

Public-Private Attributes-Based Variational Adversarial Network for Audio-Visual Cross-Modal Matching

Requirements

  • Python 3.8

  • librosa 0.7.2

  • numpy 1.19.0

  • torch 1.4.0

  • torchvision 0.5.0
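Because the versions above are pinned fairly tightly, it can help to compare them against the active environment before running anything. The following is a minimal sketch, not part of the repository: the helper name `find_mismatches` is hypothetical, and it assumes the packages were installed under their usual distribution names so that the standard-library `importlib.metadata` (available from Python 3.8) can find them.

```python
from importlib import metadata

# Versions pinned in the requirements list above.
PINNED = {
    "librosa": "0.7.2",
    "numpy": "1.19.0",
    "torch": "1.4.0",
    "torchvision": "0.5.0",
}


def find_mismatches(pinned=PINNED, get_version=metadata.version):
    """Return {package: (expected, found)} for every package that differs.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        # Local build tags such as "1.4.0+cu100" still count as a match.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every listed package matches its pinned version; the Python interpreter version itself is not checked here.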

Get Dataset and Paper

VoxCeleb1

  • WAV audio data, 1,251 people in total, 39 GB after decompression.

  • Baidu Cloud link: VoxCeleb1

  • Decompression commands (merge the split archive into a single file, then extract it):

  • zip -s 0 split.zip --out unsplit.zip

  • unzip unsplit.zip

  • Vox1 official website: VoxCeleb1

VoxCeleb2

  • MP4 video data (the files include audio), 5,994 people in total, 255 GB after decompression.

  • Baidu Cloud link: VoxCeleb2

  • Decompression commands (merge the split archive into a single file, then extract it):

  • zip -s 0 vox2_mp4_dev.zip --out unsplit.zip

  • unzip unsplit.zip

  • Vox2 official website: VoxCeleb2
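After extraction, a quick sanity check is to count the speaker folders and compare the result with the totals quoted above (1,251 for VoxCeleb1, 5,994 for VoxCeleb2). The sketch below is an illustration rather than code from this repository: it assumes the extracted data contains one directory per speaker whose name starts with `id`, and both the helper name `count_speakers` and the example path are hypothetical.

```python
from pathlib import Path

# Speaker totals quoted in the dataset descriptions above.
EXPECTED_SPEAKERS = {"VoxCeleb1": 1251, "VoxCeleb2": 5994}


def count_speakers(root):
    """Count immediate subdirectories of `root` whose names start with 'id'.

    Assumes one folder per speaker; files and other folders are ignored.
    """
    return sum(
        1
        for entry in Path(root).iterdir()
        if entry.is_dir() and entry.name.startswith("id")
    )


# Example usage (the path is a placeholder for wherever the data was unpacked):
# found = count_speakers("path/to/extracted/speaker/folders")
# print(found == EXPECTED_SPEAKERS["VoxCeleb1"])
```

A count below the expected total usually points to an incomplete download or an interrupted extraction, so it is worth running before any preprocessing.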

Contact

If you have any questions, please feel free to contact me at Netizenwjx@foxmail.com.

Citation

@Article{Zheng2024Public,
  author={Zheng, Aihua and Yuan, Fan and Zhang, Haichuan and Wang, Jiaxiang and Tang, Chao and Li, Chenglong},
  journal={IEEE Transactions on Circuits and Systems for Video Technology}, 
  title={Public-Private Attributes-Based Variational Adversarial Network for Audio-Visual Cross-Modal Matching}, 
  volume={34},
  number={9},
  pages={8698-8709},
  note = {doi: 10.1109/TCSVT.2024.3390573},
  year={2024}}
