NeuralSampleID: A Framework for Automatic Sample Identification

NeuralSampleID is a lightweight and scalable framework for automatic sample identification (ASID), the task of detecting and retrieving music samples embedded within audio queries. This system:

Uses a self-supervised Graph Neural Network (GNN) encoder trained with contrastive learning.
Includes a cross-attention classifier that refines and ranks retrieval results.
Benchmarks performance using fine-grained annotations from an extended version of the Sample100 dataset.

Our method achieves state-of-the-art performance with only 9% of the parameters used by prior systems. For more details, please see the preprint and the documentation in this repository.

Our work has been accepted to ISMIR 2025!
Check out the preprint at https://www.arxiv.org/abs/2506.14684.

This repository contains the official implementation for the paper:

"Refining Music Sample Identification with a Self-Supervised Graph Neural Network"
A. Bhattacharjee, I. Meresman Higgs, M. Sandler, and E. Benetos
To appear in the Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR), 2025

Installation

# Clone the repository
git clone https://github.com/chymaera96/NeuralSampleID.git
cd NeuralSampleID

# Install dependencies
pip install -r requirements.txt

# Install FAISS GPU for faster evaluation
conda install faiss-gpu -c pytorch

# Install DGL (cuda version specific)
conda install -c dglteam/label/th24_cu121 dgl

Dataset Preparation

The models are trained on the fma_medium subset of the Free Music Archive (FMA) dataset. The audio files are first preprocessed into four source-separated stems (vocals, drums, bass, and other) using HTDemucs [cite]. For training, the source-separated audio files should follow this directory structure:

htdemucs/
├── 12345/
│   ├── vocals.mp3
│   ├── drums.mp3
│   ├── bass.mp3
│   └── other.mp3
├── 12346/
│   ├── vocals.mp3
│   ├── drums.mp3
│   ├── bass.mp3
│   └── other.mp3
├── ...

Each subfolder (e.g., 12345) corresponds to a unique FMA track ID and contains the separated stem files in .mp3 format.
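Before training, it can help to confirm that every track folder actually contains all four stems. The snippet below is a minimal sketch of such a check, based only on the layout shown above; the function name `find_incomplete_tracks` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Stem file names taken from the directory layout described above.
EXPECTED_STEMS = ("vocals.mp3", "drums.mp3", "bass.mp3", "other.mp3")


def find_incomplete_tracks(htdemucs_dir):
    """Return {track_id: [missing stem names]} for folders lacking any stem.

    Hypothetical helper: it only inspects file names, not audio content.
    """
    incomplete = {}
    for track_dir in sorted(Path(htdemucs_dir).iterdir()):
        if not track_dir.is_dir():
            continue  # ignore stray files at the top level
        missing = [s for s in EXPECTED_STEMS if not (track_dir / s).is_file()]
        if missing:
            incomplete[track_dir.name] = missing
    return incomplete
```

Calling `find_incomplete_tracks("htdemucs")` returns an empty dict when every track folder is complete.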

We use our extended annotations of the Sample100 dataset, sample100-ext, for retrieval evaluation. Details of the dataset can be found in the dataset README. Evaluation audio files are not shared as part of this work. Instead, we provide the fingerprints computed with our setup for the queries and the reference database.

Pretraining

The pretraining step uses contrastive learning of the Graph Neural Network backbone.
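To illustrate what a contrastive objective does, here is a minimal NumPy sketch of a generic SimCLR-style NT-Xent loss. It is an assumption for illustration only: the exact loss, temperature, and batching used by train.py are defined in the repository and may differ, and the name `nt_xent_loss` is hypothetical.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.1):
    """Generic NT-Xent loss for N positive pairs (z_a[i], z_b[i]).

    Every other embedding in the batch acts as a negative.
    The temperature value is an illustrative assumption.
    """
    z = np.concatenate([z_a, z_b], axis=0).astype(np.float64)
    z /= np.linalg.norm(z, axis=1, keepdims=True)  # unit vectors -> cosine similarity
    n = z_a.shape[0]
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # an embedding is never compared with itself
    # Row i's positive is its other view: i + n in the first half, i - n in the second.
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), pos_idx].mean())
```

The loss is low when the two views of the same item are closer to each other than to any other item in the batch, which is the property the learned fingerprints rely on.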

# Pre-training the proposed model
python train.py --config config/grafp.yaml --ckp CKP_NAME
# (Single-stage) training of the baseline model
cd baseline
python train.py --config config/resnet_ibn.yaml --ckp CKP_NAME

Key arguments:

--config: YAML config file path
--ckp: Name for the training run

Note: Update the paths (particularly htdemucs_dir and fma_dir) in the YAML file to point to the directories containing the source-separated and mixed audio data for training. To resume from a checkpoint, use --resume path/to/checkpoint.pth.
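A quick sanity check on those paths can save a failed training launch. The sketch below assumes the YAML file has already been loaded into a dict and that htdemucs_dir and fma_dir are top-level keys; both points are assumptions, and `missing_data_dirs` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def missing_data_dirs(cfg, keys=("htdemucs_dir", "fma_dir")):
    """Return {key: problem} for each path key that is unset or not a directory.

    Assumes the keys sit at the top level of the loaded config dict.
    """
    problems = {}
    for key in keys:
        value = cfg.get(key)
        if not value:
            problems[key] = "not set"
        elif not Path(value).expanduser().is_dir():
            problems[key] = f"not a directory: {value}"
    return problems
```

An empty result means both directories exist. The dict can come from any YAML parser, for example PyYAML's `yaml.safe_load`.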

Classifier Training

After pretraining, you can fine-tune the multi-head cross-attention (MHCA) classifier on the learned embeddings (fingerprints).

python downstream.py --enc_wts ENCODER_CHECKPOINT
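For readers unfamiliar with cross-attention, the following NumPy sketch shows the core operation in its simplest form: a single head with no learned projections, where query fingerprints attend over reference fingerprints. It is a generic illustration under those simplifying assumptions, not the classifier implemented in downstream.py, and the function name is hypothetical.

```python
import numpy as np


def cross_attention(query_emb, ref_emb):
    """Single-head scaled dot-product cross-attention without learned projections.

    query_emb: (Tq, d) query fingerprints; ref_emb: (Tr, d) reference fingerprints.
    Returns the attended output (Tq, d) and the attention weights (Tq, Tr).
    """
    d = query_emb.shape[1]
    scores = query_emb @ ref_emb.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ ref_emb, weights
```

Each row of the returned weights shows which reference segments a query segment aligns with most strongly. A trained classifier would add learned projections, multiple heads, and a scoring layer on top.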

Evaluation

Given a query set, the evaluation process reports retrieval rates and mean average precision (mAP).
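As a reference for how these metrics are commonly defined, here is a minimal pure-Python sketch. It interprets "retrieval rate" as a top-k hit rate and assumes each ranked list contains unique IDs; both are assumptions, the function names are hypothetical, and the repository's evaluation scripts may define the metrics differently.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked result list against the relevant IDs."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant)


def mean_average_precision(rankings, ground_truth):
    """Mean of per-query average precision; both arguments are dicts keyed by query ID."""
    total = sum(average_precision(rankings[q], ground_truth[q]) for q in ground_truth)
    return total / len(ground_truth)


def hit_rate_at_k(rankings, ground_truth, k):
    """Fraction of queries with at least one relevant item in the top k results."""
    hits = sum(1 for q in ground_truth if set(rankings[q][:k]) & set(ground_truth[q]))
    return hits / len(ground_truth)
```

Here `rankings` maps each query ID to its ranked list of retrieved reference IDs, and `ground_truth` maps each query ID to the reference IDs annotated as correct.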

# Usage: ./ismir25.sh [baseline|proposed]

# Example: Evaluate the proposed model
bash ismir25.sh proposed

# Example: Evaluate the baseline model
bash ismir25.sh baseline

The script ismir25.sh runs the evaluation with the appropriate model to reproduce the published benchmarks. A detailed demonstration of evaluation on custom datasets will be added soon!

Pretrained Models and Fingerprints

Citation

If you use this code or the dataset in your research, please cite our paper:

Bhattacharjee, A., Meresman Higgs, I., Sandler, M., & Benetos, E. (2025). Refining Music Sample Identification with a Self-Supervised Graph Neural Network. In Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR). Daejeon, South Korea.

@inproceedings{bhattacharjee2025refining,
  title={Refining Music Sample Identification with a Self-Supervised Graph Neural Network},
  author={Bhattacharjee, Aditya and Meresman Higgs, Ivan and Sandler, Mark and Benetos, Emmanouil},
  booktitle={Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR)},
  year={2025},
  address={Daejeon, South Korea},
  publisher={ISMIR},
  note={Preprint available at \url{https://www.arxiv.org/abs/2506.14684}}
}

For issues or questions, please open an Issue.
