This is the official repository for our state-of-the-art audio identification framework based on graph neural networks. We demonstrate code usage for training, audio fingerprint generation, and evaluation. For more details, refer to our paper at the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2025.
- Clone the repository:
git clone https://github.com/username/GraFP.git
cd GraFP
- Install the required Python packages:
pip install -r requirements.txt
Based on our experiments, we recommend using the fma-small subset of the Free Music Archive (FMA) dataset. For noise and room impulse response (RIR) data, we recommend the MUSAN dataset and the Aachen Impulse Response database, respectively.
- Set up the config files with paths to the datasets:
python setup_config.py --train_dir /PATH/TO/TRAIN/DATA --val_dir /PATH/TO/VALIDATION/DATA --noise_dir /PATH/TO/NOISE/DATA --ir_dir /PATH/TO/IR/DATA
- Run the training script:
python train.py
We provide helper code for generating audio fingerprints for a given audio dataset. The pre-trained models are available here. The primary evaluation benchmarks were computed using model_tc_29_best.pth.
python generate.py --test_dir /PATH/TO/TEST/DATA --ckp /PATH/TO/MODEL

For reproducibility, we have made a dummy fingerprint database available here. The fingerprint retrieval pipeline uses the FAISS library for approximate nearest-neighbour (ANN) search in the fingerprint embedding space. Further details about the ANN implementation are available in our pre-print. The icassp.sh script can be used to run the evaluation pipeline and reproduce the published results.
- Download and extract the test dataset. The script supports evaluation on both the fma-medium and fma-large datasets. Note that extracting the compressed fma_large.zip can take a while. For quicker evaluation runs, we recommend extracting fma_medium.zip.
- Download and extract the augmentation dataset. Queries are created using a subset of the background noise and impulse response datasets. They can be downloaded here.
- Run the evaluation script with the pre-trained model:
bash icassp.sh /PATH/TO/EVAL/DATASET /PATH/TO/AUG/DATASET
Note that the evaluation dataset path passed to the above script should be the absolute path to the directory named fma_medium or fma_large. Logs such as raw outputs and retrieval hit rates can be found in the logs/store directory. Each output run is organized according to the filename of the pre-trained model used. Support for running evaluation on private datasets will be made available soon.
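The query augmentation mentioned above (mixing in background noise and applying a room impulse response) can be sketched as follows. The function names, the exponentially decaying dummy RIR, and the 3 dB SNR are illustrative assumptions, not values taken from the repository.

```python
import numpy as np
from scipy.signal import fftconvolve

def add_noise(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix `noise` into `clean` at the requested signal-to-noise ratio."""
    noise = noise[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise

def apply_rir(clean: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Simulate room reverberation by convolving with an impulse response."""
    wet = fftconvolve(clean, rir)[: len(clean)]
    return wet / (np.max(np.abs(wet)) + 1e-12)  # peak-normalize

# Dummy signals stand in for real audio, noise, and RIR recordings.
rng = np.random.default_rng(0)
clean = rng.standard_normal(8000)   # 1 s of audio at 8 kHz (dummy)
noise = rng.standard_normal(8000)
rir = np.exp(-np.linspace(0, 8, 400)) * rng.standard_normal(400)

query = add_noise(apply_rir(clean, rir), noise, snr_db=3.0)
```

In practice the noise segments and RIRs would be drawn from the downloaded MUSAN and Aachen Impulse Response subsets rather than generated synthetically.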
If you use the code in this repository, please cite our paper:
@inproceedings{grafprint2025,
title={GraFPrint: A GNN-Based Approach for Audio Identification},
author={Bhattacharjee, Aditya and Singh, Shubhr and Benetos, Emmanouil},
booktitle={ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages={1--5},
year={2025},
organization={IEEE}
}