Train Once, Transfer Anywhere: Toward Device-Homogeneous MI-EEG Decoding

This repository is the official implementation of EEGMixer, written in PyTorch Lightning style:

D.-H. Lee, S.-J. Kim, H. Kong, and S.-W. Lee, "Train Once, Transfer Anywhere: Toward Device-Homogeneous MI-EEG Decoding," 2025. (Under Review)

Architecture

(Figures: the EEGMixer architecture and the EEGMixer block)

Abstract

Electroencephalogram (EEG) has emerged as a key modality for the development of brain-computer interfaces (BCIs). Motor imagery (MI), one of the BCI paradigms, has garnered significant attention due to its dual role in motor rehabilitation and daily-activity augmentation. Generalizing the decoding of MI-based EEG signals is essential for deploying BCI systems in real-world environments. While transfer learning facilitates generalization by bridging structural differences across datasets, its deployment is hindered by device heterogeneity. Recent studies have attempted to address this limitation, but they often require additional preprocessing or dataset-specific architectural modifications. To address these limitations, we propose EEGMixer, which eliminates the need for dataset-specific adaptation. EEGMixer comprises three key innovations: i) a dynamic spatial hypernetwork (DSH) that addresses device heterogeneity by generating temporally conditioned spatial weights; ii) a mosaic positional encoding that applies absolute and relative encodings along the spatial and temporal domains to focus on domain-relevant information; and iii) an orchestration of domain information (ODI) that extracts informative features by orchestrating EEG representations in the spatial and temporal domains and then integrating them into a unified representation. EEGMixer achieved competitive performance on each dataset and was extensively validated under six cross-dataset transfer settings across multiple datasets. These results show that EEGMixer is the first model to enable effective cross-dataset generalization without dataset-specific architectural modifications; notably, this is the first validation that a unified architecture can achieve consistent transferability without dataset-specific adaptation. Hence, we demonstrate the potential of EEGMixer to address device heterogeneity and enable generalizable decoding across multiple datasets.
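As a rough illustration of component (i), the sketch below shows one way a hypernetwork could generate temporally conditioned spatial weights that map a device-specific set of electrodes onto a fixed number of virtual channels. All module names, kernel sizes, and dimensions here are illustrative assumptions, not the repository's actual implementation.

```python
import torch
import torch.nn as nn


class DSHSketch(nn.Module):
    """Illustrative sketch of a dynamic spatial hypernetwork (NOT the
    authors' implementation): each real channel's time course is summarized
    into a descriptor, and a small head predicts that channel's mixing
    weight toward each of `n_virtual` device-agnostic virtual channels."""

    def __init__(self, t_feat: int = 32, n_virtual: int = 22):
        super().__init__()
        # Per-channel temporal descriptor: 1-D conv over time + mean pooling.
        self.descriptor = nn.Conv1d(1, t_feat, kernel_size=25, padding=12)
        # Hypernetwork head: descriptor -> weights over virtual channels.
        self.head = nn.Linear(t_feat, n_virtual)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time); `channels` may differ across devices.
        b, c, t = x.shape
        d = self.descriptor(x.reshape(b * c, 1, t)).mean(dim=-1)  # (B*C, t_feat)
        w = self.head(d).reshape(b, c, -1)                        # (B, C, V)
        w = torch.softmax(w, dim=1)   # normalize over the real channels
        return torch.einsum("bcv,bct->bvt", w, x)                 # (B, V, T)
```

Because the weights are produced per channel, the same module can accept, say, a 22-channel trial and a 3-channel trial without any architectural change, which is the property the abstract refers to as device homogeneity.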

1. Performance Evaluations

| Model | Acc | Kappa | F1-score | Acc | Kappa | F1-score | Acc | Kappa | F1-score |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| | **BCIC IV-2a** (Brunner et al. 2008) | | | **BCIC IV-2b** (Leeb et al. 2008) | | | **Zhou** (Zhou et al. 2016) | | |
| ShallowConvNet | 0.5976 | 0.4630 | 0.5752 | 0.7558 | 0.5167 | 0.7475 | 0.6660 | 0.4998 | 0.6510 |
| DeepConvNet | 0.5756 | 0.4338 | 0.5640 | 0.7657 | 0.5235 | 0.7656 | 0.5135 | 0.2710 | 0.4911 |
| EEGNet | 0.6069 | 0.4755 | 0.5912 | 0.7457 | 0.5098 | 0.7457 | 0.6532 | 0.4806 | 0.6314 |
| EEGConformer | 0.5532 | 0.4039 | 0.5375 | 0.7391 | 0.4766 | 0.7333 | 0.7162 | 0.5910 | 0.7162 |
| DFformer | 0.5841 | 0.4455 | 0.5837 | 0.7618 | 0.5208 | 0.7552 | 0.7546 | 0.6323 | 0.7433 |
| Proposed | 0.6231 | 0.4971 | 0.6143 | 0.7467 | 0.4925 | 0.7416 | 0.7561 | 0.6343 | 0.7443 |
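The table reports accuracy, Cohen's kappa, and F1-score. Assuming the standard scikit-learn definitions (with macro-averaged F1 for the multi-class case, which is our assumption rather than something stated in the table), the metrics can be computed as:

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score

# Toy 4-class MI labels (e.g., left / right / feet / tongue); not real results.
y_true = [0, 1, 2, 3, 0, 1, 2, 3]
y_pred = [0, 1, 2, 2, 0, 1, 3, 3]

acc = accuracy_score(y_true, y_pred)            # fraction of correct predictions
kappa = cohen_kappa_score(y_true, y_pred)       # chance-corrected agreement
f1 = f1_score(y_true, y_pred, average="macro")  # unweighted mean of per-class F1
print(f"Acc {acc:.4f}  Kappa {kappa:.4f}  F1-score {f1:.4f}")
```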

2. Experimental Results under Three Fine-tuning Strategies

| Model / Strategy | Acc | Kappa | F1-score | Acc | Kappa | F1-score | Acc | Kappa | F1-score |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| | **BCIC IV-2a** (Brunner et al. 2008) | | | **BCIC IV-2b** (Leeb et al. 2008) | | | **Zhou** (Zhou et al. 2016) | | |
| Baseline | 0.6231 | 0.4971 | 0.6143 | 0.7467 | 0.4933 | 0.7418 | 0.7561 | 0.6343 | 0.7443 |
| | **BCIC IV-2b → BCIC IV-2a** | | | **BCIC IV-2a → BCIC IV-2b** | | | **BCIC IV-2a → Zhou** | | |
| + DSH | 0.2635 | 0.0183 | 0.1463 | 0.7358 | 0.4718 | 0.7165 | 0.6882 | 0.5339 | 0.6315 |
| + DSH + Classification head | 0.5214 | 0.3613 | 0.5062 | 0.7609 | 0.5213 | 0.7567 | 0.6952 | 0.5428 | 0.6815 |
| Full fine-tuning | 0.6175 | 0.4896 | 0.6098 | 0.7540 | 0.5072 | 0.7488 | 0.7538 | 0.6308 | 0.7416 |
| | **Zhou → BCIC IV-2a** | | | **Zhou → BCIC IV-2b** | | | **BCIC IV-2b → Zhou** | | |
| + DSH | 0.3532 | 0.1370 | 0.3061 | 0.6890 | 0.3770 | 0.6838 | 0.6472 | 0.4826 | 0.4960 |
| + DSH + Classification head | 0.4959 | 0.3274 | 0.4809 | 0.7222 | 0.4432 | 0.7155 | 0.6970 | 0.5455 | 0.6883 |
| Full fine-tuning | 0.6055 | 0.4736 | 0.5979 | 0.7496 | 0.4984 | 0.7437 | 0.7191 | 0.5789 | 0.7066 |
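The three strategies correspond to progressively larger sets of trainable parameters: only the DSH, the DSH plus the classification head, or the whole network. A minimal sketch of such freezing logic is below; the attribute names `dsh` and `classifier` are hypothetical placeholders, not the repository's actual module names.

```python
import torch.nn as nn


def apply_finetune_strategy(model: nn.Module, strategy: str) -> None:
    """Freeze/unfreeze parameters for the three strategies in the table.
    Assumes the model exposes `dsh` and `classifier` submodules (hypothetical)."""
    for p in model.parameters():          # start fully frozen
        p.requires_grad = False
    if strategy == "full":                # 'Full fine-tuning': train everything
        for p in model.parameters():
            p.requires_grad = True
        return
    for p in model.dsh.parameters():      # '+ DSH': adapt only the hypernetwork
        p.requires_grad = True
    if strategy == "dsh+head":            # '+ DSH + Classification head'
        for p in model.classifier.parameters():
            p.requires_grad = True


# Toy model, just to exercise the function.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.dsh = nn.Linear(8, 8)
        self.backbone = nn.Linear(8, 8)
        self.classifier = nn.Linear(8, 4)


model = ToyModel()
apply_finetune_strategy(model, "dsh+head")
```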

3. Experiments

3.1 Visualization of spatial representations learned by the proposed DSH

  • (a) Class-wise spatial weights for four MI tasks across four representative indices of the DSS (#7, #19, #39, and #40)
  • (b) Temporal dynamics of the applied spatial weights for a randomly selected DSS index (#19)


3.2 Comparison of the attention entropy analysis and the attention maps between the EEGMixer and the DFformer

  • (a, b) Distribution of the attention entropy across the spatial and temporal domains, respectively
  • (c, d) Visualization of the attention maps from the EEGMixer and the DFformer, respectively
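Attention entropy quantifies how diffuse each query's attention distribution is: low entropy means focused attention, high entropy means attention spread across many positions. A plausible computation is sketched below, assuming the entropy is taken per attention row and then averaged (our assumption; the paper's exact formulation is not shown here).

```python
import torch


def attention_entropy(attn: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """attn: (..., n_queries, n_keys), each row summing to 1.
    Returns the mean Shannon entropy of the per-query distributions;
    `eps` guards against log(0) for exact zeros."""
    h = -(attn * (attn + eps).log()).sum(dim=-1)  # entropy of each row
    return h.mean()
```

As a sanity check, uniform attention over K keys gives entropy log K, while a one-hot attention map gives an entropy of approximately zero.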


3.3 Visualization of the domain-wise specialization in the proposed ODI

  • (a) Class-conditional temporal contribution of the temporal experts in the MOTE across different MI tasks
  • (b) Class-wise spatial contribution of the spatial experts in the MOSE across EEG channels
  • All visualizations are extracted from the output of Block #1 in the EEGMixer


3.4 Expected calibration error (ECE) across different MI tasks and fine-tuning strategies under two cross-dataset settings

| Model / Strategy | Left | Right | Feet | Tongue | Avg. |
|---|---:|---:|---:|---:|---:|
| Baseline | 0.0800 | 0.1230 | 0.0900 | 0.0540 | 0.0868 |
| | **BCIC IV-2b → BCIC IV-2a** | | | | |
| + DSH | 0.0780 | 0.2480 | 0.1470 | 0.1040 | 0.1444 |
| + DSH + Classification head | 0.0780 | 0.0750 | 0.0630 | 0.0400 | 0.0640 |
| Full fine-tuning | 0.0800 | 0.1140 | 0.0710 | 0.0430 | 0.0771 |
| | **Zhou → BCIC IV-2a** | | | | |
| + DSH | 0.0770 | 0.0750 | 0.1570 | 0.1390 | 0.1119 |
| + DSH + Classification head | 0.0630 | 0.0440 | 0.0930 | 0.1660 | 0.0916 |
| Full fine-tuning | 0.0820 | 0.1150 | 0.0670 | 0.0280 | 0.0728 |
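ECE measures the gap between a model's confidence and its actual accuracy. Assuming the standard equal-width binning formulation (an assumption on our part; the paper's exact binning scheme is not shown in the table), it can be computed as:

```python
import numpy as np


def expected_calibration_error(confidence, correct, n_bins: int = 10) -> float:
    """ECE = sum_b (n_b / N) * |accuracy_b - mean confidence_b| over
    equal-width confidence bins (the common 10-bin formulation)."""
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)  # 1.0 if prediction correct
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidence > lo) & (confidence <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidence[mask].mean())
    return float(ece)
```

A perfectly calibrated model (e.g., 80% confidence with 80% of those predictions correct) has an ECE of 0; a model that is confident and wrong accumulates the full confidence-accuracy gap.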