AICANet

Requirements

python 3.8
librosa 0.7.2
numpy 1.19.0
torch 1.4.0
torchvision 0.5.0
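
To confirm that an environment matches the pinned versions above, a small check along these lines may help. It is a minimal sketch and not part of AICANet: `PINNED` and `check_pins` are names made up for illustration, and it assumes the installed distribution names are the same as the package names listed above.

```python
from importlib import metadata

# Pins copied from the requirements list above.
PINNED = {
    "librosa": "0.7.2",
    "numpy": "1.19.0",
    "torch": "1.4.0",
    "torchvision": "0.5.0",
}


def check_pins(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for each mismatch.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Some wheels carry a local suffix such as "+cu101";
        # compare only the part before the "+".
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every listed package is installed at the pinned version. The Python version itself is not checked here.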

Get Dataset and Paper

Download the VoxCeleb and VGGFace datasets.
Latest paper list: Audio-visual matching

VoxCeleb1

WAV audio data, 1,251 people in total, 39 GB after decompression.
Baidu Cloud link: VoxCeleb1
Decompression commands:
zip -s 0 split.zip --out unsplit.zip
unzip unsplit.zip
Vox1 official website: VoxCeleb1
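
The two decompression steps (merge the split archive, then extract it) can also be driven from Python. The following is a rough sketch under stated assumptions: the `zip` and `unzip` command-line tools are on `PATH`, the split archive is in the current directory, and the helper names `merge_and_unzip_commands` and `run_all` are hypothetical, not part of this repository.

```python
import shutil
import subprocess


def merge_and_unzip_commands(split_archive, merged_archive="unsplit.zip"):
    """Build the two README commands: merge the split archive, then extract it."""
    return [
        ["zip", "-s", "0", split_archive, "--out", merged_archive],
        ["unzip", merged_archive],
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
```

Calling `run_all(merge_and_unzip_commands("split.zip"))` would run the same two commands as above. Both the merged archive and the extracted data need disk space, so leave room for more than the 39 GB of decompressed data.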

VoxCeleb2

MP4 video data (the files include audio), 5,994 people in total, 255 GB after decompression.
Baidu Cloud link: VoxCeleb2
Decompression commands:
zip -s 0 vox2_mp4_dev.zip --out unsplit.zip
unzip unsplit.zip
Vox2 official website: VoxCeleb2
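
After extraction, one quick sanity check is to count the identities on disk and compare the result with the figure above (5,994). The sketch below assumes the extracted tree has one top-level folder per person under some root directory. That layout, the root path in the usage note, and the function name `count_speakers` are assumptions for illustration, not something this README specifies.

```python
from pathlib import Path


def count_speakers(root, suffix):
    """Count top-level folders under `root` holding at least one file ending in `suffix`."""
    root = Path(root)
    return sum(
        1
        for entry in root.iterdir()
        # any() stops at the first match, so each folder is only scanned
        # until one media file is found.
        if entry.is_dir() and any(entry.rglob(f"*{suffix}"))
    )
```

For example, `count_speakers("path/to/extracted/mp4", ".mp4")` would be expected to return 5994 if the assumed layout holds. The same function with `".wav"` could be pointed at the VoxCeleb1 audio.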

Citation

If you find this toolkit or the results helpful for your research, please cite us!

If you are interested in our work, you can contact us about data sharing.

@article{wang2025adaptive,
  title={Adaptive Interaction and Correction Attention Network for Audio-Visual Matching},
  author={Wang, Jiaxiang and Zheng, Aihua and Liu, Lei and Li, Chenglong and He, Ran and Tang, Jin},
  journal={IEEE Transactions on Information Forensics and Security},
  volume={20},
  number={},
  pages={7558-7571},
  year={2025},
  publisher={IEEE}
}

Contact

Jiaxiang Wang: Netizenwjx@foxmail.com
