This repository is the official implementation of EEGMixer in PyTorch Lightning style:
D.-H. Lee, S.-J. Kim, H. Kong, and S.-W. Lee, "Train Once, Transfer Anywhere: Toward Device-Homogeneous MI-EEG Decoding," 2025. (Under Review)
| EEGMixer | EEGMixer Block |
|---|---|
| ![]() | ![]() |
Electroencephalogram (EEG) has emerged as a key modality for developing brain-computer interfaces (BCIs). Motor imagery (MI), one of the BCI paradigms, has garnered significant attention due to its dual role in motor rehabilitation and daily activity augmentation. Generalizing the decoding of MI-based EEG signals is essential for deploying BCI systems in real-world environments. While transfer learning facilitates generalization by bridging structural differences across datasets, its deployment is hindered by device heterogeneity. Recent studies have attempted to address this limitation, but they often require additional preprocessing or dataset-specific architectural modifications.

To address these limitations, we propose EEGMixer, which eliminates the need for dataset-specific adaptation. EEGMixer comprises three key innovations: i) a dynamic spatial hypernetwork (DSH) that addresses device heterogeneity by generating temporally conditioned spatial weights, ii) a mosaic positional encoding that applies absolute and relative encodings along the spatial and temporal domains to focus on domain-relevant information, and iii) an orchestration of domain information that extracts informative features by orchestrating EEG representations in the spatial and temporal domains and subsequently integrates them into a unified representation.

EEGMixer achieved competitive performance on each dataset and was extensively validated under six cross-dataset transfer settings. These results demonstrate that EEGMixer is the first model to enable effective cross-dataset generalization without dataset-specific architectural modifications. Notably, this is the first attempt to validate that a unified architecture can achieve consistent transferability without dataset-specific adaptation. Hence, we demonstrate that EEGMixer can address the challenge of device heterogeneity and enable generalizable decoding across multiple datasets.
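The dynamic spatial hypernetwork idea above can be sketched as follows: a small network generates a channels-to-embedding projection from temporal statistics of the input, so recordings with different electrode montages map into one fixed embedding dimension. This is an illustrative reading only, not the paper's implementation; the class name, the mean/std conditioning statistics, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn


class DynamicSpatialHypernetwork(nn.Module):
    """Sketch of a DSH: generate a (channels -> embed_dim) spatial projection
    from per-channel temporal statistics, so any electrode montage maps into
    a fixed embedding dimension. Names and dimensions are hypothetical."""

    def __init__(self, embed_dim: int = 40, hidden_dim: int = 64):
        super().__init__()
        # Assumed conditioning: two temporal statistics per channel
        # (mean and standard deviation) are mapped to one spatial weight row.
        self.weight_generator = nn.Sequential(
            nn.Linear(2, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time); the channel count may differ per dataset.
        stats = torch.stack([x.mean(dim=-1), x.std(dim=-1)], dim=-1)  # (B, C, 2)
        w = self.weight_generator(stats)                              # (B, C, D)
        # Temporally conditioned spatial mixing: sum over the channel axis.
        return torch.einsum("bcd,bct->bdt", w, x)                     # (B, D, T)


dsh = DynamicSpatialHypernetwork(embed_dim=40)
out_a = dsh(torch.randn(8, 22, 1000))  # 22-channel input (BCIC IV-2a style)
out_b = dsh(torch.randn(8, 3, 1000))   # 3-channel input (BCIC IV-2b style)
print(out_a.shape, out_b.shape)        # both torch.Size([8, 40, 1000])
```

Because the projection is produced at run time from the input itself, the same module accepts 22-channel and 3-channel recordings without any architectural change, which is the property the cross-dataset experiments below rely on.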
| Model | BCIC IV-2a (Brunner et al. 2008) | | | BCIC IV-2b (Leeb et al. 2008) | | | Zhou (Zhou et al. 2016) | | |
|---|---|---|---|---|---|---|---|---|---|
| | Acc | Kappa | F1-score | Acc | Kappa | F1-score | Acc | Kappa | F1-score |
| ShallowConvNet | 0.5976 | 0.4630 | 0.5752 | 0.7558 | 0.5167 | 0.7475 | 0.6660 | 0.4998 | 0.6510 |
| DeepConvNet | 0.5756 | 0.4338 | 0.5640 | 0.7657 | 0.5235 | 0.7656 | 0.5135 | 0.2710 | 0.4911 |
| EEGNet | 0.6069 | 0.4755 | 0.5912 | 0.7457 | 0.5098 | 0.7457 | 0.6532 | 0.4806 | 0.6314 |
| EEGConformer | 0.5532 | 0.4039 | 0.5375 | 0.7391 | 0.4766 | 0.7333 | 0.7162 | 0.5910 | 0.7162 |
| DFformer | 0.5841 | 0.4455 | 0.5837 | 0.7618 | 0.5208 | 0.7552 | 0.7546 | 0.6323 | 0.7433 |
| Proposed | 0.6231 | 0.4971 | 0.6143 | 0.7467 | 0.4925 | 0.7416 | 0.7561 | 0.6343 | 0.7443 |
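The Acc, Kappa, and F1-score columns follow their standard definitions (overall accuracy, Cohen's kappa, and macro-averaged F1). A minimal pure-Python sketch of these metrics, independent of this repository's evaluation code:

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of correctly predicted labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Expected agreement from the marginal label distributions.
    p_e = sum(true_counts[c] * pred_counts[c]
              for c in set(y_true) | set(y_pred)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for c in set(y_true) | set(y_pred):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(accuracy(y_true, y_pred), cohen_kappa(y_true, y_pred), macro_f1(y_true, y_pred))
```

Equivalent results come from `sklearn.metrics` (`accuracy_score`, `cohen_kappa_score`, `f1_score(average="macro")`).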
| Model | BCIC IV-2a (Brunner et al. 2008) | | | BCIC IV-2b (Leeb et al. 2008) | | | Zhou (Zhou et al. 2016) | | |
|---|---|---|---|---|---|---|---|---|---|
| | Acc | Kappa | F1-score | Acc | Kappa | F1-score | Acc | Kappa | F1-score |
| Baseline | 0.6231 | 0.4971 | 0.6143 | 0.7467 | 0.4933 | 0.7418 | 0.7561 | 0.6343 | 0.7443 |
| Fine-tuning Strategy | BCIC IV-2b → BCIC IV-2a | | | BCIC IV-2a → BCIC IV-2b | | | BCIC IV-2a → Zhou | | |
| + DSH | 0.2635 | 0.0183 | 0.1463 | 0.7358 | 0.4718 | 0.7165 | 0.6882 | 0.5339 | 0.6315 |
| + DSH + Classification head | 0.5214 | 0.3613 | 0.5062 | 0.7609 | 0.5213 | 0.7567 | 0.6952 | 0.5428 | 0.6815 |
| Full fine-tuning | 0.6175 | 0.4896 | 0.6098 | 0.7540 | 0.5072 | 0.7488 | 0.7538 | 0.6308 | 0.7416 |
| Fine-tuning Strategy | Zhou → BCIC IV-2a | | | Zhou → BCIC IV-2b | | | BCIC IV-2b → Zhou | | |
| + DSH | 0.3532 | 0.1370 | 0.3061 | 0.6890 | 0.3770 | 0.6838 | 0.6472 | 0.4826 | 0.4960 |
| + DSH + Classification head | 0.4959 | 0.3274 | 0.4809 | 0.7222 | 0.4432 | 0.7155 | 0.6970 | 0.5455 | 0.6883 |
| Full fine-tuning | 0.6055 | 0.4736 | 0.5979 | 0.7496 | 0.4984 | 0.7437 | 0.7191 | 0.5789 | 0.7066 |
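The three transfer strategies compared above reduce to parameter-freezing rules: adapt only the DSH, adapt the DSH plus the classification head, or unfreeze everything. A hedged sketch in PyTorch, assuming a model exposing `dsh` and `classifier` submodules (these attribute names are hypothetical, not the repository's actual API):

```python
import torch.nn as nn


def configure_finetuning(model: nn.Module, strategy: str) -> None:
    """Freeze/unfreeze parameters according to the transfer strategy."""
    for p in model.parameters():
        p.requires_grad = False          # start fully frozen
    if strategy == "dsh":                # "+ DSH": adapt spatial weights only
        for p in model.dsh.parameters():
            p.requires_grad = True
    elif strategy == "dsh_head":         # "+ DSH + Classification head"
        for p in model.dsh.parameters():
            p.requires_grad = True
        for p in model.classifier.parameters():
            p.requires_grad = True
    elif strategy == "full":             # "Full fine-tuning"
        for p in model.parameters():
            p.requires_grad = True
    else:
        raise ValueError(f"unknown strategy: {strategy}")


class Dummy(nn.Module):
    """Stand-in model with the assumed submodule layout."""

    def __init__(self):
        super().__init__()
        self.dsh = nn.Linear(3, 8)
        self.backbone = nn.Linear(8, 8)
        self.classifier = nn.Linear(8, 4)


model = Dummy()
configure_finetuning(model, "dsh_head")
print([n for n, p in model.named_parameters() if p.requires_grad])
```

Passing only the trainable parameters (`filter(lambda p: p.requires_grad, model.parameters())`) to the optimizer then realizes each row of the table.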
- (a) Class-wise spatial weights for four MI tasks across four representative indices of the DSS, #7, #19, #39, and #40
- (b) Temporal dynamics of applying spatial weights across a randomly selected index of the DSS, #19
3.2 Comparison of the attention entropy-based analysis and the attention map visualizations between the EEGMixer and the DFformer
- Distribution of the attention entropy across the (a) spatial and (b) temporal domains. Visualization of the attention maps from (c) the EEGMixer and (d) the DFformer
- (a) Class-conditional temporal contribution of the temporal experts in the MOTE across different MI tasks
- (b) Class-wise spatial contribution of the spatial experts in the MOSE across EEG channels
- All visualizations are extracted from the output of Block #1 in the EEGMixer
3.4 Expected calibration error (ECE) across different MI tasks and fine-tuning strategies under two cross-dataset settings
| Model | Left | Right | Feet | Tongue | Avg. |
|---|---|---|---|---|---|
| Baseline | 0.0800 | 0.1230 | 0.0900 | 0.0540 | 0.0868 |
| Fine-tuning Strategy | BCIC IV-2b → BCIC IV-2a | | | | |
| + DSH | 0.0780 | 0.2480 | 0.1470 | 0.1040 | 0.1444 |
| + DSH + Classification head | 0.0780 | 0.0750 | 0.0630 | 0.0400 | 0.0640 |
| Full fine-tuning | 0.0800 | 0.1140 | 0.0710 | 0.0430 | 0.0771 |
| Fine-tuning Strategy | Zhou → BCIC IV-2a | | | | |
| + DSH | 0.0770 | 0.0750 | 0.1570 | 0.1390 | 0.1119 |
| + DSH + Classification head | 0.0630 | 0.0440 | 0.0930 | 0.1660 | 0.0916 |
| Full fine-tuning | 0.0820 | 0.1150 | 0.0670 | 0.0280 | 0.0728 |
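The ECE values above come from the standard binned estimator: predictions are grouped by confidence, and the gap between each bin's accuracy and its mean confidence is averaged, weighted by bin size. A minimal sketch (the bin count is an assumption; the paper's binning is not stated here):

```python
def expected_calibration_error(confidences, correct, n_bins=15):
    """Binned ECE: sum over bins of (bin size / N) * |bin accuracy - bin confidence|.

    confidences: per-sample max softmax probability in (0, 1].
    correct: per-sample 0/1 indicator of a correct prediction.
    """
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        in_bin = [i for i, c in enumerate(confidences) if lo < c <= hi]
        if not in_bin:
            continue
        bin_conf = sum(confidences[i] for i in in_bin) / len(in_bin)
        bin_acc = sum(correct[i] for i in in_bin) / len(in_bin)
        ece += (len(in_bin) / n) * abs(bin_acc - bin_conf)
    return ece


# Well calibrated: 80% confidence, 80% of predictions correct -> ECE near 0.
print(expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2))
# Overconfident: 80% confidence but always correct -> ECE near 0.2.
print(expected_calibration_error([0.8] * 10, [1] * 10))
```

Lower ECE means the model's confidence tracks its actual accuracy, which is how the fine-tuning strategies in the table are compared.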