This repository provides a collection of methods for Out-of-Domain (OOD) detection implemented in Python. The library includes implementations of various algorithms for identifying data points that differ significantly from the training distribution.
- Multiple OOD detection methods:
  - One-Class SVM
  - P-Sphere Hull
- Standardized pipeline for comparing methods
- Batch processing of multiple datasets
- Comprehensive metrics reporting
# Clone the repository
git clone https://github.com/hrueda25/ood-lib.git
cd ood-lib
# Install dependencies
pip install -r requirements.txt

Below are various ways to run the OOD detection pipeline with different configurations.
# Run One-Class SVM on the abalone dataset
python src/ood_pipeline.py --method ocsvm --dataset abalone
# Run P-Sphere Hull on the iris dataset
python src/ood_pipeline.py --method psphere --dataset iris
# Run all implemented methods on the abalone dataset
python src/ood_pipeline.py --method all --dataset abalone
# Run One-Class SVM on all datasets in the data folder
python src/ood_pipeline.py --method ocsvm --dataset all
# Customize One-Class SVM parameters
python src/ood_pipeline.py --method ocsvm --dataset abalone --nu 0.05 --kernel sigmoid
# Customize P-Sphere Hull parameters
python src/ood_pipeline.py --method psphere --dataset iris --n_clusters 50 --ps_vratio_filt
# Change the test set size to 30%
python src/ood_pipeline.py --method ocsvm --dataset abalone --test_size 0.3
# Save results to a specific file
python src/ood_pipeline.py --method all --dataset abalone --output results/my_experiment.csv
# Run all methods on all datasets with custom parameters
python src/ood_pipeline.py --method all --dataset all --nu 0.05 --kernel rbf --n_clusters 50

The pipeline expects datasets in ARFF format, which can be placed in the data/raw folder. For datasets with missing values, the pipeline will handle them by replacing NaN values with mean values.
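The missing-value handling described above can be illustrated with a minimal sketch. It assumes the mean is taken per feature column over the observed values; the function name `impute_column_means` and the fallback for fully empty columns are hypothetical and are not taken from the pipeline's code.

```python
import math


def impute_column_means(rows):
    """Return a copy of rows with NaN entries replaced by their column mean.

    Assumption: the mean is computed per column over non-NaN values.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        # A column with no observed values has no mean; fall back to 0.0 here.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [means[j] if math.isnan(value) else value for j, value in enumerate(row)]
        for row in rows
    ]


rows = [
    [1.0, 10.0],
    [float("nan"), 20.0],
    [3.0, float("nan")],
]
filled = impute_column_means(rows)
print(filled)  # [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0]]
```

Whether the pipeline computes these means before or after the train/test split is not stated here, so treat the sketch as an illustration of the idea only.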
Results are saved as CSV files in the results folder. The filenames include the method and dataset information, along with key parameters used in the experiment.
Example output filename:
ood_results_dataset-iris_method-ocsvm_nu-0.1_kernel-rbf_test-0.2.csv
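As a rough illustration of this naming scheme, the sketch below rebuilds the example filename from its parts. The helper `build_results_filename` is hypothetical; the pipeline's actual naming code may differ, for instance in which parameters it includes for each method.

```python
def build_results_filename(dataset, method, params, test_size):
    """Join dataset, method, and key parameters into a results filename.

    Hypothetical helper; it mirrors the pattern of the example filename only.
    """
    parts = [f"dataset-{dataset}", f"method-{method}"]
    # Method parameters appear as name-value pairs, in insertion order.
    parts += [f"{name}-{value}" for name, value in params.items()]
    parts.append(f"test-{test_size}")
    return "ood_results_" + "_".join(parts) + ".csv"


name = build_results_filename("iris", "ocsvm", {"nu": 0.1, "kernel": "rbf"}, 0.2)
print(name)  # ood_results_dataset-iris_method-ocsvm_nu-0.1_kernel-rbf_test-0.2.csv
```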
The standardized data is also saved in HDF5 format in the data/input folder for future use.
To add a new OOD detection method:
- Create a new file in the src/methods directory
- Implement your method following the same API as the existing methods
- Update the main pipeline, src/ood_pipeline.py, to include your new method
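The steps above might look roughly like the following sketch. The library's actual method API is not shown in this README, so everything here is an assumption: the `fit`/`predict` interface, the `ZScoreDetector` class, and the `METHODS` registry are hypothetical. The 1/-1 label convention follows scikit-learn's `OneClassSVM.predict`.

```python
import statistics


class ZScoreDetector:
    """Toy detector: flags a point as OOD if any feature lies far from the training mean.

    Hypothetical example; not one of the library's methods.
    """

    def __init__(self, threshold=3.0):
        self.threshold = threshold

    def fit(self, X):
        columns = list(zip(*X))
        self.means_ = [statistics.fmean(col) for col in columns]
        # Guard against zero spread so predict never divides by zero.
        self.stds_ = [statistics.pstdev(col) or 1.0 for col in columns]
        return self

    def predict(self, X):
        """Return 1 for in-domain points and -1 for OOD points."""
        labels = []
        for row in X:
            z = max(abs(v - m) / s for v, m, s in zip(row, self.means_, self.stds_))
            labels.append(-1 if z > self.threshold else 1)
        return labels


# Hypothetical registry mapping a --method value to a detector class.
METHODS = {"zscore": ZScoreDetector}

train = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [1.0, 2.0]]
detector = METHODS["zscore"](threshold=3.0).fit(train)
labels = detector.predict([[1.0, 2.0], [50.0, 2.0]])
print(labels)  # [1, -1]
```

Check the existing files in src/methods for the interface the pipeline actually expects before following this pattern.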
This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.
If you use this library in your research, please cite:
@software{ood_library,
author = {Héctor Rueda and Ben Mathiensen},
title = {Out of Domain Detection Library},
year = {2025},
url = {https://github.com/hrueda25/ood-lib}
}