AICANet

Requirements

python 3.8
librosa 0.7.2
numpy 1.19.0
torch 1.4.0
torchvision 0.5.0
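
Before running anything, it can help to confirm that the installed packages match the pinned versions above. The following is a minimal sketch, not part of the original toolkit: the `PINNED` dictionary and the `check_versions` helper are hypothetical names introduced here, and the comparison assumes that a local build suffix such as `+cu101` on a torch wheel can be ignored.

```python
from importlib import metadata  # standard library since Python 3.8

# Pinned versions copied from the requirements list above.
PINNED = {
    "librosa": "0.7.2",
    "numpy": "1.19.0",
    "torch": "1.4.0",
    "torchvision": "0.5.0",
}


def check_versions(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for each missing or mismatched package."""
    problems = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        # Assumption: ignore a local build suffix such as "+cu101".
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pinned package is installed at the listed version.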

Get Dataset and Paper

VoxCeleb1

  • WAV audio data from 1,251 speakers; 39 GB after decompression.
    Baidu Cloud link: VoxCeleb1

  • Decompression command:
    zip -s 0 split.zip --out unsplit.zip
    unzip unsplit.zip

  • Vox1 official website: VoxCeleb1
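
The two-step decompression (merge the split archive, then extract it) can also be scripted. This is a minimal sketch that assumes the `zip` and `unzip` command-line tools are on the PATH; `build_commands` and `decompress` are hypothetical helper names introduced here, not part of this repository.

```python
import subprocess


def build_commands(split_zip, merged_zip="unsplit.zip"):
    """Return the merge and extract commands as argument lists."""
    return [
        # Merge the split parts into a single archive.
        ["zip", "-s", "0", split_zip, "--out", merged_zip],
        # Extract the merged archive.
        ["unzip", merged_zip],
    ]


def decompress(split_zip, merged_zip="unsplit.zip", cwd=None):
    """Run both commands in order, stopping at the first failure."""
    for cmd in build_commands(split_zip, merged_zip):
        subprocess.run(cmd, check=True, cwd=cwd)
```

Calling `decompress("split.zip")` in the download directory runs the same commands as above; for VoxCeleb2, pass `"vox2_mp4_dev.zip"` instead.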

VoxCeleb2

  • MP4 video data (audio track included) from 5,994 speakers; 255 GB after decompression.
    Baidu Cloud link: VoxCeleb2

  • Decompression command:
    zip -s 0 vox2_mp4_dev.zip --out unsplit.zip
    unzip unsplit.zip

  • Vox2 official website: VoxCeleb2

Contact

If this toolkit or its results are helpful to your research, please cite us!

If you are interested in our mission, you can contact us for data sharing.

@article{wang2025adaptive,
  title={Adaptive Interaction and Correction Attention Network for Audio-Visual Matching},
  author={Wang, Jiaxiang and Zheng, Aihua and Liu, Lei and Li, Chenglong and He, Ran and Tang, Jin},
  journal={IEEE Transactions on Information Forensics and Security},
  volume={20},
  number={},
  pages={7558--7571},
  year={2025},
  publisher={IEEE}
}
