ASD Harmonization

Overcoming Site Variability in Multisite fMRI Studies: An Autoencoder Framework for Enhanced Generalizability of Machine Learning Models

Abstract

Multisite functional magnetic resonance imaging (fMRI) studies are vital for advancing our understanding of brain disorders. However, site variability poses a significant challenge to the generalizability of machine learning models. In this work, we propose an autoencoder (AE) framework to mitigate site-specific variations and improve model performance across diverse datasets. By leveraging AEs, we demonstrate enhanced generalizability and robustness in ASD classification tasks. This framework is evaluated on multisite fMRI data, highlighting its ability to overcome site variability and set a new standard for harmonization techniques in neuroimaging studies.

Citation

If you use this code or models in your research, please cite our paper:

Almuqhim, F., Saeed, F. Overcoming Site Variability in Multisite fMRI Studies: an Autoencoder Framework for Enhanced Generalizability of Machine Learning Models. Neuroinform 23, 46 (2025). https://doi.org/10.1007/s12021-025-09746-1

@article{almuqhim_overcoming_2025,
	title   = {Overcoming Site Variability in Multisite fMRI Studies: An Autoencoder Framework for Enhanced Generalizability of Machine Learning Models},
	volume = {23},
	issn = {1559-0089},
	url = {https://doi.org/10.1007/s12021-025-09746-1},
	doi = {10.1007/s12021-025-09746-1},
	number = {3},
	journal = {Neuroinformatics},
	author = {Almuqhim, Fahad and Saeed, Fahad},
	month = sep,
	year = {2025},
	pages = {46},
}

System Requirements

A computer with Ubuntu 16.04 (or later) or CentOS 8.1 (or later).
CUDA-enabled GPU with at least 12 GB of memory.

Installation Guide

Install Anaconda

Step by Step Guide to Install Anaconda

Fork the Repository

Fork this repository to your own account.
Clone your fork to your machine.

Create a Conda Environment

cd <repository_directory>
conda env create --file environment.yml

Activate the Environment

conda activate ae_env

Install PyTorch and pyprind

pip install pyprind
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
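
After installation, it can help to confirm that PyTorch sees a CUDA device and to compare its memory against the 12 GB listed under System Requirements. The following is a minimal sketch and is not part of the repository; the function name gpu_report and the use of device index 0 are assumptions made for illustration.

```python
import importlib.util


def gpu_report() -> str:
    """Return a one-line summary of the CUDA device PyTorch can see, if any."""
    # Avoid an ImportError when run outside the conda environment.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment."

    import torch

    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} is installed, but no CUDA device is visible."

    # Device index 0 is an assumption; adjust on multi-GPU machines.
    props = torch.cuda.get_device_properties(0)
    total_gib = props.total_memory / 1024**3
    return f"{props.name}: {total_gib:.1f} GiB of device memory (at least 12 GB is required)."


if __name__ == "__main__":
    print(gpu_report())
```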

Data Preparation

The script asd_harmonization_AEs.py reads the paths to the required data from a configuration file. Update config.ini with the following paths:

DATA_PATH: Path to the raw fMRI data.
PHENO_PATH: Path to the phenotype (pheno) data.
SAMPLE_PATH: Path to the input sample file.
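
As an illustration of how these three entries might be read and sanity-checked, the sketch below parses an INI file with Python's standard configparser module and reports any path that does not exist on disk. The section name PATHS and the helpers load_paths and missing_on_disk are assumptions; check the configuration file shipped with the repository for the actual layout.

```python
import configparser
from pathlib import Path

REQUIRED_KEYS = ("DATA_PATH", "PHENO_PATH", "SAMPLE_PATH")


def load_paths(config_file, section="PATHS"):
    """Read the three documented path entries from an INI file.

    The section name is hypothetical; change it to match the real file.
    """
    # interpolation=None keeps literal '%' characters in paths intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the upper-case key names
    if not parser.read(config_file):
        raise FileNotFoundError(f"Could not read {config_file}")
    if section not in parser:
        raise KeyError(f"Section [{section}] not found in {config_file}")
    missing = [key for key in REQUIRED_KEYS if key not in parser[section]]
    if missing:
        raise KeyError(f"Missing entries in [{section}]: {', '.join(missing)}")
    return {key: Path(parser[section][key]).expanduser() for key in REQUIRED_KEYS}


def missing_on_disk(paths):
    """Return the keys whose configured path does not exist."""
    return [key for key, path in paths.items() if not path.exists()]
```

Calling missing_on_disk(load_paths("config.ini")) before a long run is a cheap way to catch a mistyped path early.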

Usage

python asd_harmonization_AEs.py --p_method <p_method_value> [--ml_method <ml_method_value>] [--ae_type <ae_type_value>] [--run_combat] [--centers <center1,center2>] [--fold <fold_value>]
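
When sweeping over several settings, the command line above can be assembled programmatically. The sketch below uses only the flags shown in the usage line; the helper name build_command and its keyword arguments are assumptions introduced for illustration.

```python
def build_command(p_method, ml_method=None, ae_type=None, run_combat=False,
                  centers=None, fold=None, script="asd_harmonization_AEs.py"):
    """Return the argument list for one run of the documented command line."""
    cmd = ["python", script, "--p_method", p_method]
    if ml_method is not None:
        cmd += ["--ml_method", ml_method]
    if ae_type is not None:
        cmd += ["--ae_type", ae_type]
    if run_combat:
        cmd.append("--run_combat")
    if centers:
        # The CLI expects a single comma-separated value, e.g. "UCLA,KKI".
        cmd += ["--centers", ",".join(centers)]
    if fold is not None:
        cmd += ["--fold", str(fold)]
    return cmd


if __name__ == "__main__":
    # Print one command per ML method listed under Options.
    for ml in ("RF", "SVM", "NB"):
        print(" ".join(build_command("ASD-ml", ml_method=ml, run_combat=True)))
```

Each returned list can be passed directly to subprocess.run(cmd, check=True) to launch the run.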

Run trained models

You can download the trained models and data splits from the latest release.

Then run the following command to run the trained TAE model on the proportional split:

python run_trained_harmonization_AEs.py --p_method ASD-Standalone-AE-10-fold --model_dir models/ --ae_type TAE

Options

--p_method (Required) Specifies the processing method for the pipeline.
Example values: ASD-Standalone-AE-10-fold, ASD-ml-combine-two-centers, ASD-ml-combine-two-centers-asd-vs-hc, ASD-ml, ASD-Standalone-AE
--ml_method (Optional) Specifies the machine learning method.
Default: RF.
Example values: RF, SVM, NB
--ae_type (Optional) Specifies the autoencoder type to use.
Default: AE.
Example values: AE, TAE, SAE, DAE
--run_combat (Optional) Enables ComBat harmonization in the pipeline.
When included, the pipeline applies ComBat; otherwise ComBat is not used. This flag is only relevant to experiments that involve ComBat.
--centers (Optional) Comma-separated list of two centers to include in the analysis.
Default: None.
Example: Pitt,Yale. Used for center classification experiments.
--fold (Optional) Specifies the number of folds for cross-validation experiments. It is only used in the proportional split experiments, not in the leave-one-site-out (LOSO) experiments.
Default: 10.
Example values: 5, 10
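
The options above map naturally onto Python's argparse module. The parser below is a sketch reconstructed from this list, not the script's actual source; the integer type of --fold and the helper split_centers are assumptions.

```python
import argparse


def make_parser():
    """Build a parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(description="ASD harmonization experiments (sketch)")
    parser.add_argument("--p_method", required=True,
                        help="processing method, e.g. ASD-Standalone-AE-10-fold")
    parser.add_argument("--ml_method", default="RF",
                        help="machine learning method, e.g. RF, SVM, NB")
    parser.add_argument("--ae_type", default="AE",
                        help="autoencoder type, e.g. AE, TAE, SAE, DAE")
    parser.add_argument("--run_combat", action="store_true",
                        help="apply ComBat harmonization")
    parser.add_argument("--centers", default=None,
                        help="two comma-separated centers, e.g. Pitt,Yale")
    parser.add_argument("--fold", type=int, default=10,
                        help="number of cross-validation folds")
    return parser


def split_centers(value):
    """Turn 'Pitt,Yale' into ['Pitt', 'Yale']; None is passed through."""
    if value is None:
        return None
    centers = [name.strip() for name in value.split(",") if name.strip()]
    if len(centers) != 2:
        raise ValueError("--centers expects exactly two comma-separated names")
    return centers


if __name__ == "__main__":
    args = make_parser().parse_args()
    print(args, split_centers(args.centers))
```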

Running the Experiments

Below are example commands for running the paper experiments:

Autoencoder harmonization using proportional split

python asd_harmonization_AEs.py --p_method ASD-Standalone-AE-10-fold --ae_type AE --ml_method RF

ML classification with and without ComBat (proportional split)

python asd_harmonization_AEs.py --p_method ASD-ml --ml_method RF --run_combat

Center classification using an ML method such as RF, NB, or SVM. It takes the ASD subjects of the first center and the NT subjects of the second center, and vice versa.

python asd_harmonization_AEs.py --p_method ASD-ml-combine-two-centers-asd-vs-hc --ml_method RF --run_combat --fold 5 --centers UCLA,KKI
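
One reading of the subject selection described for this experiment is sketched below: ASD subjects from one center are paired with NT subjects from the other, in both directions. This is an interpretation for illustration only and not the repository's code; the Subject record, the function cross_center_groups, and the toy subject IDs are all hypothetical.

```python
from typing import NamedTuple


class Subject(NamedTuple):
    subject_id: str
    center: str
    diagnosis: str  # "ASD" or "NT"


def cross_center_groups(subjects, first, second):
    """Return both ASD/NT pairings across the two given centers."""
    def pick(center, diagnosis):
        return [s for s in subjects if s.center == center and s.diagnosis == diagnosis]

    return {
        "asd_first_nt_second": pick(first, "ASD") + pick(second, "NT"),
        "asd_second_nt_first": pick(second, "ASD") + pick(first, "NT"),
    }


if __name__ == "__main__":
    # Made-up records, only to show the shape of the selection.
    toy = [
        Subject("s1", "UCLA", "ASD"), Subject("s2", "UCLA", "NT"),
        Subject("s3", "KKI", "ASD"), Subject("s4", "KKI", "NT"),
        Subject("s5", "Pitt", "ASD"),
    ]
    for name, group in cross_center_groups(toy, "UCLA", "KKI").items():
        print(name, [s.subject_id for s in group])
```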

Center classification using only NT subjects from the two given centers; the model classifies the center with and without ComBat. There is no need to pass --run_combat, because the method iterates over both settings.

python asd_harmonization_AEs.py --p_method ASD-ml-combine-two-centers --ml_method RF --run_combat --fold 5 --centers UCLA,KKI

Leave-one-site-out (LOSO) experiment using AEs as the harmonization method

python asd_harmonization_AEs.py --p_method ASD-Standalone-AE --ae_type AE --ml_method RF

Leave-one-site-out experiment with and without ComBat

python asd_harmonization_AEs.py --p_method ASD-ml --ml_method RF --run_combat
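
The leave-one-site-out scheme named in the last two experiments holds out all subjects from one site for testing and trains on the remaining sites, repeating this once per site. The sketch below illustrates the splitting logic only, in plain Python; it is not the repository's implementation, and the function name leave_one_site_out and the example site list are assumptions.

```python
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_indices, test_indices), one split per site.

    `sites` holds the acquisition site of each subject, in subject order.
    """
    by_site = defaultdict(list)
    for index, site in enumerate(sites):
        by_site[site].append(index)

    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = sorted(
            i for site, indices in by_site.items() if site != held_out for i in indices
        )
        yield held_out, train_idx, test_idx


if __name__ == "__main__":
    example_sites = ["Pitt", "Yale", "UCLA", "Pitt", "KKI", "Yale"]
    for held_out, train_idx, test_idx in leave_one_site_out(example_sites):
        print(f"test on {held_out}: train={train_idx} test={test_idx}")
```

If scikit-learn is available, its LeaveOneGroupOut splitter produces the same kind of partition when the site labels are passed as groups.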

Results and Analysis

Once the experiments are completed, Jupyter notebooks for result visualization and analysis can be found in the notebooks folder.
Open the relevant notebook and follow the instructions for plotting and evaluating results.

Additional Notes

This project uses the ComBat harmonization method, available at https://github.com/Jfortin1/ComBatHarmonization.

Contact

If you have any questions or encounter any issues, please reach out at falmu027@fiu.edu.
