This repository contains the source code for the Interspeech 2026 submission: "What do deepfake speech detectors actually learn?"
It provides a framework for training, evaluating, and interpreting speech deepfake detectors using Integrated Gradients (IG) to understand what features (and artefacts) self-supervised learning (SSL) models rely on to make their decisions.
- `models/`: Model architectures (`aasist.py`, `mhfa.py`, `sls.py`, `wavlm_aasist.py`, `wavlm_camhfa.py`, `wavlm_sls.py`).
- `utils/`: Utility scripts for data loading (`asvspoof5_dataset.py`), audio processing (`audio_utils.py`), metrics (`metrics.py`), and Integrated Gradients support and visualization (`ig_utils.py`, `ig_visualization.py`).
- `augmentation/`: Code for augmenting speech data (`RawBoost`, `Codec`, `NoiseFilter`, etc.).
- `artefacts_check/`: Tools to compute and analyze correlations and Equal Error Rates (EER) for specific audio artefacts (`compute_artefact_correlations.py`, `compute_artefact_stats.py`).
- `labelling_app/`: A PHP-based web application for manually annotating audio features and artefacts.
- `scores/`: Scripts for error-rate calculation and model fusion scoring.
- Root scripts:
  - `train.py` & `eval.py`: Main scripts for training and evaluating the deepfake detectors.
  - `compute_ssl_means.py`: Computes mean SSL representations.
  - `generate_ig_plots.py` & `plot_combined_ig.py`: Generate Integrated Gradients visualizations.
  - `download_model.py`: Optionally pulls base models/weights.
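As context for the `artefacts_check/` and `scores/` tools, the Equal Error Rate is the operating point where the false-acceptance rate on spoofed audio equals the false-rejection rate on bona fide speech. A minimal, self-contained sketch of the idea (illustrative only; the repository's `metrics.py` may compute it differently, e.g. via ROC interpolation):

```python
import numpy as np

def compute_eer(bonafide_scores, spoof_scores):
    """Equal Error Rate: the threshold where the false-acceptance rate
    (spoof scored as bona fide) meets the false-rejection rate."""
    # Sweep every observed score as a candidate threshold;
    # higher score = more bona fide.
    thresholds = np.sort(np.concatenate([bonafide_scores, spoof_scores]))
    far = np.array([(spoof_scores >= t).mean() for t in thresholds])
    frr = np.array([(bonafide_scores < t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))  # closest FAR/FRR crossing
    return (far[idx] + frr[idx]) / 2.0

bona = np.array([0.9, 0.8, 0.7, 0.6])
spoof = np.array([0.4, 0.3, 0.2, 0.1])
print(compute_eer(bona, spoof))  # perfectly separable scores -> 0.0
```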
- Environment Setup: Initialize the required dependencies using the provided environment scripts:

  ```bash
  # Linux/macOS
  source env/setup_env.sh

  # Windows
  env\setup_env.bat
  ```

  (Alternatively, install the packages directly from `env/requirements.txt`.)

- Training: Configure your paths in `config.py` and run:

  ```bash
  python train.py --model camhfa
  ```
  Arguments for `train.py`:

  - `--model`: Model architecture to train (`sls`, `camhfa`, or `aasist`). [Required]
  - `--data_dir`: Path to the ASVspoof5 root directory (defaults to `config.DATA_DIR`).
  - `--train_protocol`: Train protocol filename (defaults to `config.TRAIN_PROTOCOL`).
  - `--dev_protocol`: Dev protocol filename (defaults to `config.DEV_PROTOCOL`).
  - `--output_dir`: Directory to save checkpoints (defaults to `config.OUTPUT_DIR`).
  - `--epochs`: Number of training epochs (default: `10`).
  - `--batch_size`: Batch size per GPU (default: `4`).
  - `--device`: Computation device (`cuda` or `cpu`).
  - `--augment`: Flag to apply data augmentation.
  - `--freeze_wavlm`: Flag to freeze the WavLM backbone entirely.
  - `--warmup_epochs`: Number of epochs to keep WavLM frozen before switching to end-to-end training (default: `0`).
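The flag list above could be wired up with `argparse` roughly as follows. This is an illustrative sketch only: the literal defaults stand in for the `config.py` values (`config.DATA_DIR`, etc.), and the real `train.py` may differ.

```python
import argparse

def build_parser():
    """Sketch of a parser matching the documented train.py flags.
    Defaults are placeholders; the real script reads them from config.py."""
    p = argparse.ArgumentParser(description="Train a speech deepfake detector")
    p.add_argument("--model", required=True, choices=["sls", "camhfa", "aasist"])
    p.add_argument("--data_dir", default="ASVspoof5")        # config.DATA_DIR
    p.add_argument("--train_protocol", default="train.txt")  # config.TRAIN_PROTOCOL
    p.add_argument("--dev_protocol", default="dev.txt")      # config.DEV_PROTOCOL
    p.add_argument("--output_dir", default="checkpoints")    # config.OUTPUT_DIR
    p.add_argument("--epochs", type=int, default=10)
    p.add_argument("--batch_size", type=int, default=4)
    p.add_argument("--device", default="cuda", choices=["cuda", "cpu"])
    p.add_argument("--augment", action="store_true")
    p.add_argument("--freeze_wavlm", action="store_true")
    p.add_argument("--warmup_epochs", type=int, default=0)
    return p

# Example: override only the model and epoch count.
args = build_parser().parse_args(["--model", "camhfa", "--epochs", "3"])
```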
- Generating Integrated Gradients (IG) Explanations: Visualize the learned representations using IG plots:

  ```bash
  python generate_ig_plots.py --input_csv selections.csv --audio_dir /path/to/flac/
  ```

  Arguments for `generate_ig_plots.py`:

  - `--input_csv`: Path to a CSV or text list containing `FileID` strings to process. [Required]
  - `--audio_dir`: Root directory containing the `.flac` audio files. [Required]
  - `--output_dir`: Directory to save the output `.png` plots and interactive `.json` data (default: `outputs/plots`).
  - `--models_dir`: Directory containing the `.pt` model checkpoints (default: `models`).
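Integrated Gradients attributes a model's score to its input features by accumulating gradients along a straight-line path from a baseline to the input. The repository's `ig_utils.py` handles this for the actual detectors; the sketch below is framework-agnostic, using a toy function with an analytic gradient purely to show the mechanics:

```python
import numpy as np

def integrated_gradients(x, baseline, grad_fn, steps=64):
    """Approximate IG attributions along the straight-line path from
    `baseline` to `x` using a midpoint Riemann sum over `steps` points."""
    alphas = (np.arange(steps) + 0.5) / steps      # midpoints in (0, 1)
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        point = baseline + a * (x - baseline)      # interpolated input
        total += grad_fn(point)                    # accumulate gradients
    return (x - baseline) * total / steps          # scale by input delta

# Toy "model": f(x) = sum(x**2), whose gradient is 2*x.
x = np.array([1.0, -2.0, 3.0])
baseline = np.zeros_like(x)
attr = integrated_gradients(x, baseline, grad_fn=lambda p: 2.0 * p)

# Completeness axiom: attributions sum to f(x) - f(baseline).
f = lambda v: float(np.sum(v ** 2))
assert abs(attr.sum() - (f(x) - f(baseline))) < 1e-6
```

For the detectors themselves, `grad_fn` would be the gradient of the spoof/bona fide score with respect to the waveform or SSL features, and the baseline is a design choice (silence, noise, or a mean representation such as those produced by `compute_ssl_means.py`).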
This repository provides the experimental framework behind the core question of our Interspeech 2026 submission: establishing exactly which cues state-of-the-art self-supervised and end-to-end deepfake detection systems rely on when distinguishing bona fide speech from spoofed audio.