EEG-based authentication project for UCSD COGS 189. We train a convolutional model on the PhysioNet EEG Motor Movement/Imagery dataset and evaluate a cosine-similarity authenticator on unseen subjects.
- PhysioNet EEG Motor Movement/Imagery: 109 participants, 64 channels, 160 Hz sampling (EDF files in `files/` as `S###/S###R##.edf`).
- Channels used: `Oz`, `T7`, `Cz` (as `Oz..`, `T7..`, `Cz..` in the EDF headers) to keep the pipeline lightweight while retaining discriminative power identified in prior work.
- Sliding-window augmentation: 1 s windows (`T=160`) with stride 4 samples; groups of 30 windows (`Gamma=30`) taken every 8 windows (`Delta=8`) to form training samples.
- Preprocessing: Load EDF with `mne`, pick the 3 channels, MinMax-scale per channel, create overlapping windows and grouped segments as model inputs.
- Training: CNN with three conv blocks → flatten → 1024-d dense → dropout (0.5) → 90-way classifier. Trained on subjects 1–90 with an 80/20 stratified split, batch size 32, Adam (`lr=1e-4`), cross-entropy for 20 epochs.
- Embeddings: After training, register a forward hook on the dropout layer to extract L2-normalized 1024-d embeddings for each sample.
- Enrollment/Authentication: Subjects 91–109 are treated as unseen. For each subject, 77% of samples build an averaged enrollment fingerprint; the remaining 23% form authentication queries. Classification uses cosine distance with a decision threshold of 0.01 (earlier tests used 0.275 from Bidgoly et al. 2022).
- Visualization: Accuracy/F1 plus confusion matrix via `matplotlib`/`seaborn`.
- Classification (subjects 1–90): ~96.85% train accuracy, ~94.61% test accuracy with only 3 channels.
- Authentication (subjects 91–109): Precision/recall/F1 of 1.00 in the current split, but results may be optimistic because enrollment and authentication use segments from the same recordings; evaluating on additional sessions would better test generalization.
- The threshold is tunable; lowering it from 0.275 to 0.01 preserved accuracy while tightening acceptance.
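
The sliding-window augmentation described above (1 s windows of 160 samples at stride 4, grouped 30 at a time every 8 windows) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names, the `(channels, samples)` input layout, the output axis order, and the random stand-in signal are all hypothetical.

```python
import numpy as np

def minmax_scale(x):
    """Scale each channel (row) of x to [0, 1]; constant channels map to 0."""
    lo = x.min(axis=1, keepdims=True)
    span = x.max(axis=1, keepdims=True) - lo
    span[span == 0] = 1.0  # avoid division by zero on flat channels
    return (x - lo) / span

def sliding_windows(x, T=160, stride=4):
    """Cut x of shape (channels, samples) into windows of shape (n_windows, channels, T)."""
    starts = range(0, x.shape[1] - T + 1, stride)
    return np.stack([x[:, s:s + T] for s in starts])

def group_windows(windows, gamma=30, delta=8):
    """Take gamma consecutive windows every delta windows: (n_groups, gamma, channels, T)."""
    starts = range(0, len(windows) - gamma + 1, delta)
    return np.stack([windows[s:s + gamma] for s in starts])

# Stand-in for a 3-channel recording: 10 s at 160 Hz (assumed data, not real EEG).
rng = np.random.default_rng(0)
signal = rng.standard_normal((3, 1600))

windows = sliding_windows(minmax_scale(signal))
samples = group_windows(windows)
print(windows.shape, samples.shape)  # (361, 3, 160) (42, 30, 3, 160)
```

With these settings consecutive windows overlap by 156 samples and consecutive groups share 22 windows, which is how a short recording yields many training samples.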
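
The enrollment and authentication step can be illustrated in the same way. The sketch below assumes that a query is accepted when its cosine distance to the averaged fingerprint is at or below the threshold, and that the fingerprint is re-normalized after averaging; neither detail is confirmed by the description above. The helper names and the synthetic 1024-d embeddings are hypothetical stand-ins for the model's outputs.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale vectors along the last axis to unit L2 norm."""
    return v / np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)

def enroll(embeddings):
    """Average a subject's enrollment embeddings into one unit-length fingerprint."""
    return l2_normalize(l2_normalize(embeddings).mean(axis=0))

def cosine_distance(a, b):
    """1 - cosine similarity between two 1-D vectors."""
    return 1.0 - float(np.dot(l2_normalize(a), l2_normalize(b)))

def authenticate(query, fingerprint, threshold=0.01):
    """Accept the query if it is close enough to the fingerprint (assumed rule)."""
    return cosine_distance(query, fingerprint) <= threshold

# Synthetic embeddings: two subjects clustered around different directions.
rng = np.random.default_rng(0)
dim, n = 1024, 100
base_a, base_b = l2_normalize(rng.standard_normal((2, dim)))
emb_a = base_a + 0.001 * rng.standard_normal((n, dim))
emb_b = base_b + 0.001 * rng.standard_normal((n, dim))

n_enroll = round(0.77 * n)  # 77% enrollment, 23% authentication queries
fingerprint = enroll(emb_a[:n_enroll])

genuine = [authenticate(q, fingerprint) for q in emb_a[n_enroll:]]
impostor = [authenticate(q, fingerprint) for q in emb_b]
print(f"genuine accepted: {sum(genuine)}/{len(genuine)}")
print(f"impostors accepted: {sum(impostor)}/{len(impostor)}")
```

Under this rule, lowering `threshold` makes acceptance stricter: fewer impostors pass, at the risk of rejecting genuine queries whose embeddings drift from the fingerprint.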