eegFloss: A Python package to floss out artifacts from sleep EEG recordings

Overview

EEG data often contains artifacts caused by both internal factors (such as device issues) and external influences (such as movement or environmental noise). In sleep research, these artifacts frequently go unnoticed or unaddressed, which can impair the performance and reliability of data-driven models or analyses, especially automatic sleep-stage scoring models, and reduce the credibility of study outcomes.

Most existing artifact detection methods rely on threshold-based techniques. While easy to implement, these methods often struggle to detect complex or unfamiliar artifacts and typically lack generalizability across datasets.

eegFloss addresses this challenge with eegUsability, a machine learning (ML) model designed to detect artifact-contaminated EEG segments in sleep recordings. The model was trained and evaluated on manually artifact-labeled EEG data from 15 participants, collected over 127 nights using the Zmax wearable sleep EEG headband. Although it has so far been validated only on Zmax recordings, it can be applied to sleep EEG data from other devices to detect common artifacts and assess data usability, provided the sleep data adhere to the American Academy of Sleep Medicine (AASM) standards and definitions.

The package also includes eegMobility, an ML model that detects the degree of movement throughout the night from Zmax accelerometer data. Its output is used to automatically detect Time-in-Bed (TIB). For further details, please refer to the associated paper.

Installation

It is recommended to use eegFloss within a dedicated Anaconda or Miniconda environment. Follow these steps:

1. Download and install Anaconda or Miniconda for your operating system.
2. Download eegFloss, extract the compressed file, and place it in a suitable, accessible directory (with write permission).
3. On Linux, ensure that the appropriate graphics driver is installed and hardware acceleration is enabled.
4. Launch the Anaconda Prompt:
   - Windows: Search for "Anaconda Prompt" in the Start menu.
   - Linux: Open a terminal and run `source ~/anaconda3/bin/activate` or `source ~/miniconda3/bin/activate`.
   - macOS: Open a terminal.
5. In the prompt, navigate to the extracted eegFloss directory (the one containing the .py files). Example: `cd D:\Folder1\eegFloss-base\`
6. Create a new environment named eegFloss with all the necessary packages: `conda env create --name eegFloss --file eegFloss_dependencies.yml`
   Confirm the installation by entering y when prompted (you may later also need to provide an administrative password).
7. Once the environment is created, activate it: `conda activate eegFloss`
8. To start coding:
   - Launch Spyder and open the scripts manually: `spyder`
   - Or use your preferred code editor and run the script from the Anaconda Prompt. Example: `python 1.eegFloss_check_usability_mobility.py`

Script Descriptions

`1.eegFloss_check_usability_mobility.py`

- Detects artifacts in sleep EEG data using a chosen eegUsability model.
- Combines the provided sleep scores with the usability scores (the outcomes of artifact detection) using a majority rule (an epoch is marked unusable if more than half of its constituent segments are unusable) to generate artifact-rejected sleep scores.
- Automatically identifies the Lights Out and Lights On moments using a chosen eegMobility model and computes Time-in-Bed (TIB).
- Computes common sleep statistics based on the artifact-rejected sleep scores and TIB.
- Generates visualizations, such as usability graphs (showing channel-wise data usability) and hypnograms (showing the overall outcomes), to illustrate the model's outputs.
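
The majority rule described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name `reject_epochs`, the `UNUSABLE` label of -1, and the number of segments per epoch are assumptions made for the example, and combining usability across EEG channels is not covered.

```python
from typing import List

UNUSABLE = -1  # hypothetical label for an artifact-rejected epoch


def reject_epochs(sleep_scores: List[int],
                  segment_usable: List[bool],
                  segments_per_epoch: int) -> List[int]:
    """Return sleep scores with unusable epochs replaced by UNUSABLE.

    An epoch is unusable when more than half of its segments are unusable.
    """
    expected = len(sleep_scores) * segments_per_epoch
    if len(segment_usable) != expected:
        raise ValueError(
            f"expected {expected} segment labels, got {len(segment_usable)}")

    rejected = []
    for i, stage in enumerate(sleep_scores):
        start = i * segments_per_epoch
        segments = segment_usable[start:start + segments_per_epoch]
        n_bad = sum(1 for usable in segments if not usable)
        # Strict majority: an epoch with exactly half bad segments is kept.
        rejected.append(UNUSABLE if n_bad > segments_per_epoch / 2 else stage)
    return rejected


scores = [0, 2, 3]                 # Wake, N2, N3
usable = [True, True, True,        # epoch 0: no bad segments
          True, False, False,      # epoch 1: two of three bad
          False, True, True]       # epoch 2: one of three bad
print(reject_epochs(scores, usable, segments_per_epoch=3))  # [0, -1, 3]
```

Because the comparison is strict, an epoch in which exactly half of the segments are unusable keeps its original sleep stage.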

Here is a streamlined overview of the script’s internal workflow:

Figure 1: A simplified workflow of 1.eegFloss_check_usability_mobility.py.

`2.eegFloss_spiky_noise_filter.py`

- Identifies the presence of Spiky artifacts in EEG recordings.
- If detected, applies a custom filter to remove the artifact.
- Saves the cleaned data to a new file (which can then be sleep-scored).

`3.eegFloss_file_cleanup.py`

Deletes intermediate or unnecessary files generated during processing to reduce clutter.

`eegFloss_functions.py`

Contains all imports and helper functions required by the main scripts.
Modify this file only if you need to customize core functionalities.

`read_outputs_in_Matlab.m`

Shows how to read various eegFloss output files in MATLAB.

Please read the comments in the input-output cells of each script carefully before running the script.

Read before Execution

File Type: EDF & BDF

eegFloss currently supports only EDF and BDF files. Therefore, raw EEG signal(s) must be stored in the EDF/BDF format.
If your data is in a different format, check whether the software that accompanies your recording device can export or convert the data to EDF.
If not, you can manually convert data using Python libraries such as PyEDFlib or MNE.

Sleep-Stage Scoring

eegFloss does not include a built-in automatic sleep-stage scorer and cannot infer sleep stages from EEG or other signals.
However, if you provide sleep scores alongside your data, it can generate artifact-rejected sleep scores by combining the provided sleep scores and the detected data usability.
If your data is not manually scored, consider using an open-source automatic sleep scorer, such as U-Sleep, YASA, SomnoBot, or another similar model using SleepyLand (for PSG data), or Dreamento or ezscore-f (for Zmax data).

Sleep Score Format

Sleep stages are expected to be labeled as 0 = Wake, 1 = N1, 2 = N2, 3 = N3, and 4 or 5 = REM. Deviating from this convention will result in incorrect visualizations and sleep statistics.
The sleep scores must be stored in the first column of a TXT/CSV file located in the same directory as the corresponding EDF file. The number of epochs must match the recording duration.
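
A quick pre-flight check of a score file might look like the sketch below. It is not part of eegFloss: the function name `check_sleep_scores` is hypothetical, and it assumes 30-second epochs, no header row, and that a trailing partial epoch is not scored.

```python
import csv

VALID_STAGES = {0, 1, 2, 3, 4, 5}  # 0=Wake, 1=N1, 2=N2, 3=N3, 4 or 5=REM


def check_sleep_scores(lines, recording_seconds, epoch_seconds=30):
    """Parse stage labels from the first column and validate them.

    `lines` is any iterable of text lines, e.g. an open TXT/CSV file.
    """
    scores = []
    for row in csv.reader(lines):
        # Also tolerate tab- or space-separated TXT files.
        fields = row[0].split() if row else []
        if not fields:
            continue  # skip blank lines
        scores.append(int(float(fields[0])))

    invalid = sorted(set(scores) - VALID_STAGES)
    if invalid:
        raise ValueError(f"unexpected stage labels: {invalid}")

    # Assumption: a trailing partial epoch is not scored.
    expected = int(recording_seconds // epoch_seconds)
    if len(scores) != expected:
        raise ValueError(f"expected {expected} epochs, found {len(scores)}")
    return scores


demo = ["0,0.98", "2,0.91", "", "5,0.87"]
print(check_sleep_scores(demo, recording_seconds=90))  # [0, 2, 5]
```

For a real file, pass an open file handle (for example `open(path, newline="")`) as `lines`, with the recording duration in seconds taken from the EDF header.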

Data Organization

Each recording should reside in a separate directory. Placing multiple recordings in the same directory will result in only one file being processed by eegFloss, with the rest ignored. Provide the parent directory as the Raw_Data_Dir.
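
Before processing, it can help to preview which directories under Raw_Data_Dir contain recordings. The sketch below is a hypothetical helper, not an eegFloss function; it assumes recordings are found by their .edf/.bdf extension and that sub-directories are searched recursively. Whether several files in one directory belong to the same recording depends on the device's export format, so the result is only a listing to review by hand.

```python
from pathlib import Path

RECORDING_SUFFIXES = {".edf", ".bdf"}


def list_recording_dirs(raw_data_dir):
    """Map each directory under raw_data_dir to the EDF/BDF files it holds."""
    by_dir = {}
    for path in Path(raw_data_dir).rglob("*"):
        if path.is_file() and path.suffix.lower() in RECORDING_SUFFIXES:
            by_dir.setdefault(path.parent, []).append(path.name)
    # Sort file names so the listing is stable across runs.
    return {folder: sorted(names) for folder, names in by_dir.items()}


# Usage (the path is a placeholder):
# for folder, files in list_recording_dirs("path/to/Raw_Data_Dir").items():
#     print(folder, files)
```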

Output Directory Management

While it is possible to save eegFloss outputs in the same directory as the recordings by setting Output_Dir = Raw_Data_Dir, this is not recommended.
eegFloss checks for prior outputs to skip redundant processing. So, consider saving the additional outputs, even if they are not needed for your analysis.
If you later modify key settings (e.g., change the usability model or adjust TIB thresholds), the tool may incorrectly skip reprocessing due to existing outputs. To prevent this, use a separate output directory for each round of processing.
Feature extraction is typically the most time-consuming step (unless using a ‘lite’ model). To avoid recomputation, copy the eegFloss_stat_features.npz files to the data directory alongside the EDF files after processing the data once.
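
Copying the feature files back can be scripted. The following is a minimal sketch under one loudly stated assumption: that the output directory mirrors the sub-directory layout of the raw-data directory (for example, `sub01` under both). The helper name `copy_feature_files` is hypothetical and not part of eegFloss.

```python
import shutil
from pathlib import Path

FEATURE_SUFFIX = "eegFloss_stat_features.npz"


def copy_feature_files(output_dir, raw_data_dir):
    """Copy feature files from output_dir into the matching data directories.

    Assumes both trees share the same sub-directory layout. Returns the
    list of files that were newly created.
    """
    output_dir, raw_data_dir = Path(output_dir), Path(raw_data_dir)
    copied = []
    for src in sorted(output_dir.rglob("*" + FEATURE_SUFFIX)):
        dest_dir = raw_data_dir / src.parent.relative_to(output_dir)
        if not dest_dir.is_dir():
            continue  # no matching recording directory; leave untouched
        dest = dest_dir / src.name
        if not dest.exists():  # never overwrite an existing feature file
            shutil.copy2(src, dest)
            copied.append(dest)
    return copied


# Usage (paths are placeholders):
# copy_feature_files("path/to/Output_Dir", "path/to/Raw_Data_Dir")
```

Existing files are never overwritten, so rerunning the helper is harmless.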

TIB Detection and Accelerometer Requirements

Automatic TIB detection using the eegMobility model is validated only for Zmax data. The model's training data is publicly available at Kaggle.com/datasets/niloy333/eegmobility-dataset.
If you want to test it for another device, ensure that the tri-axial accelerometer data is measured in units of g, falls within a range of ±2g (clip extreme values if needed), and includes gravitational acceleration (meaning the normalized data should center around 1g).
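
The accelerometer requirements above can be checked with a few lines of plain Python. This is a sketch, not eegFloss code: the function names and the 0.2 g tolerance are assumptions, and "normalized data" is interpreted here as the magnitude (Euclidean norm) of the three axes.

```python
import math
from statistics import median


def clip_to_range(samples, limit_g=2.0):
    """Clip one accelerometer axis (in g) to the range [-limit_g, +limit_g]."""
    return [max(-limit_g, min(limit_g, s)) for s in samples]


def acceleration_magnitude(x, y, z):
    """Per-sample Euclidean norm of the three axes (in g)."""
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]


def centered_on_one_g(magnitude, tolerance=0.2):
    """True if the median magnitude lies within `tolerance` of 1 g."""
    return abs(median(magnitude) - 1.0) <= tolerance


# Toy samples in g: a mostly still device with one out-of-range reading.
x = clip_to_range([0.0, 0.1, 2.5])
y = clip_to_range([0.0, 0.0, 0.0])
z = clip_to_range([1.0, 0.99, -3.0])

print(x)  # [0.0, 0.1, 2.0]
print(centered_on_one_g(acceleration_magnitude(x, y, z)))  # True
```

If a device reports acceleration in m/s², dividing each sample by 9.80665 converts it to g before these checks.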
Analyzing EEG data without accompanying accelerometer signals may lead to the removal of some arousals due to a lack of motion information.

Miscellaneous

Make sure your device is connected to the internet before executing the scripts.
For non-Zmax devices, verify that sampling rates are correct. If initial results are suboptimal, consider applying normalization techniques.
Thoroughly read and update all the fields in the input-output cells according to your dataset and desired outputs before running the script.
The package has been tested on Windows 10 and 11, Ubuntu 24.04.2, and macOS Sequoia 15.3.1 (MacBook Air, 2018).

Known issue on Linux: Could not initialize GLX. To solve this, ensure that step 3 of Installation was done correctly. Then try (one by one):

 pip install PyQtWebEngine
 QT_XCB_GL_INTEGRATION=none
 QT_DEBUG_PLUGINS=1
 QT_QPA_PLATFORM=wayland spyder
 QT_QPA_PLATFORM=xcb spyder
 QT_QPA_PLATFORM=offscreen spyder

In non-Conda environments, if you run into missing packages, try installing the packages listed in eegFloss_dependencies.yml manually, in the given order, using pip install [package_name]==[version].

Primary Artifacts

eegUsability detects the following artifacts in raw sleep EEG data:

Figure 2: (a) A windowed spectrogram (blue: low power, red: high power) of a sample Zmax EEG channel, highlighting segments containing different artifacts. The corresponding time-domain representations of these segments are shown for (b) Good Data, (c) No Data, (d) High Noise, (e) Spiky Noise, and (f) M-shaped Noise.

eegUsability Models

| eegUsability version | Feature set(s) | Specialty | When to use | F1-score (%) | Processing time (8-hr night)† |
|---|---|---|---|---|---|
| “v1.0” or “default” | Spectrogram and statistical | Combines two feature sets for consistent outputs. Tested across datasets and the most dependable. | Best for general tasks requiring maximum data retention. | 84.87 | ≈35 sec |
| “v0.8” or “weighted-M” | Spectrogram and statistical | Better at identifying M-shaped Noise but sacrifices a bit more usable data. | Ideal when M-shaped Noise detection is crucial and slight data loss is acceptable. | 86.35 | ≈36 sec |
| “v0.6” or “binary” | Spectrogram and statistical | Only identifies whether the data is usable or not; does not differentiate noise types. | If noise differentiation is entirely unnecessary or processing simplicity is prioritized. | 89.4 | ≈35 sec |
| “v0.7” or “lite” | Spectrogram | Uses only one feature set; similar to v1.0, but 12 times faster with comparable results. | Suitable for quick results where minor inconsistencies are tolerable. | 84.94 | ≈3 sec |
| “v0.7.2” or “lite weighted-M” | Spectrogram | Similar to v0.8, but works on only spectrogram features, hence is faster. | Optimal for quick, precise outputs. | 86.37 | ≈3 sec |
| “v0.7.3” or “lite binary” | Spectrogram | Similar to v0.6 but works on only spectrogram features, hence is faster. | Handy when fast results are needed without noise type differentiation. | 89.29 | ≈3 sec |
| “v0.9” or “full” | Spectrogram and statistical | Similar to v1.0 but is trained on the entire available dataset. | Can be used if a more hypertuned model is needed. | 90.11^ | ≈37 sec |

^ Results are from a test set that is a subset of the training data.
† Tested on a Core i7, 8C/16T, 2.5–4.8 GHz processor with no resource-intensive processes running in parallel.

Sample Outputs

Usability Graph

Figure 3: The usability graph of a sample Zmax recording showing (a) a windowed spectrogram of the EEG Left channel, (b) its usability scores, (c) the normalized acceleration calculated from tri-axial ACC data, (d) a windowed spectrogram of the EEG Right channel, and (e) its usability scores.

Hypnogram

Figure 4: eegFloss outputs of a sample Zmax recording showing spectrograms of (a) EEG Left and (b) EEG Right channels, (c) the normalized acceleration, (d) hypnogram based on the artifact-rejected autoscores, and (e) the mobility labels with TIB bounded by Lights Out and Lights On moments.

The sample_output folder contains a sample of all output files.

Reference Paper

More information on this package and the underlying models can be found in:

Sikder, N., Zerr, P., Jafarzadeh Esfahani, M., Dresler, M., & Krauledat, M. (2025). eegFloss: A Python package for refining sleep EEG recordings using machine learning models. arXiv. https://doi.org/10.48550/arXiv.2507.06433.


Cite

If you find this package helpful and use it in your work, please cite the reference paper as:

@article{sikder2025eegfloss,
  title     = {eegFloss: A Python package for refining sleep EEG recordings using machine learning models},
  author    = {Sikder, Niloy and Zerr, Paul and Jafarzadeh Esfahani, Mahdad and Dresler, Martin and Krauledat, Matthias},
  journal   = {arXiv preprint arXiv:2507.06433},
  year      = {2025},
  doi       = {10.48550/arXiv.2507.06433},
  url       = {https://arxiv.org/abs/2507.06433},
}

And cite the package as:

@software{sikder2025eegflossv1,
  author    = {Niloy Sikder},
  title     = {eegFloss},
  year      = {2025},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.15823969},
  url       = {https://doi.org/10.5281/zenodo.15823969},
}

People

© Niloy Sikder¹,²,#, Paul Zerr¹,$, Martin Dresler¹,$, & Matthias Krauledat²,$
¹ Radboud University Medical Center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands.
² Faculty of Technology and Bionics, Rhine-Waal University of Applied Sciences, Kleve, Germany.
# Developer
$ Supervisor

Opportunities for Collaboration

eegFloss can currently process only sleep EEG data, and so far, it's been validated only on recordings from the Zmax headband. However, the package offers many exciting opportunities for improvement and expansion, including:

- Improving the artifact detection model with additional training data
- Integrating an automatic sleep-stage scorer to create a complete sleep analysis pipeline
- Validating performance on non-Zmax datasets
- Extending support to artifact detection in wake EEG
- Developing a user-friendly Graphical User Interface (GUI)

We are exploring some of these work packages, but we can achieve much more through active collaboration. If this sounds interesting, and you would like to get involved, please feel free to reach out.

This package is provided as is, without any warranties, express or implied. eegFloss is released under the MIT License and is free to use, modify, and integrate with other software, provided that appropriate credit is given.

For questions, assistance, suggestions, or further information: contact the developer.

