Public repository for the Gmail interface papers using uhd-EEG
Author: Motoshige Sato1, Yasuo Kabe1, Sensho Nobe1, Akito Yoshida1, Masakazu Inoue1, Mayumi Shimizu1, Kenichi Tomeoka1, Shuntaro Sasai1*
1Araya Inc.
The dataset is hosted on OpenNeuro (ds007591). 128-channel EEG recorded during overt, minimally overt, and covert speech production of 5 color words (green, magenta, orange, violet, yellow).
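Run files in the dataset follow the BIDS naming scheme (e.g. sub-1_ses-20230511_task-minimallyovert_acq-calibration_run-01_eeg.edf). As a small illustration (the helper below is not part of the repository), the key-value entities can be parsed out of such a filename with a regex:

```python
import re

def parse_bids_name(filename):
    """Parse BIDS key-value entities (sub, ses, task, ...) from a filename."""
    entities = dict(re.findall(r"([a-z]+)-([A-Za-z0-9]+)", filename))
    # The suffix (e.g. "eeg") sits after the last underscore, before the extension
    suffix = filename.rsplit("_", 1)[-1].split(".")[0]
    return entities, suffix

name = "sub-1_ses-20230511_task-minimallyovert_acq-calibration_run-01_eeg.edf"
entities, suffix = parse_bids_name(name)
print(entities)  # {'sub': '1', 'ses': '20230511', 'task': 'minimallyovert', 'acq': 'calibration', 'run': '01'}
print(suffix)    # eeg
```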
- Install the OpenNeuro CLI:
npm install -g @openneuro/cli
Or using Deno (recommended):
deno install -Agf jsr:@openneuro/cli
- Download the dataset into data/:
openneuro download ds007591 data
- Extract per-trial epochs from the BIDS EDF files:
uv run python bids/extract_from_bids.py
This creates the per-trial npy files, word lists, and metadata that the preprocessing and training pipelines expect. The snippet below shows how a single run can be loaded and a trial epoch extracted manually:
import mne
import numpy as np
import pandas as pd
# --- Load a single run from BIDS ---
edf_path = "data/sub-1/ses-20230511/eeg/sub-1_ses-20230511_task-minimallyovert_acq-calibration_run-01_eeg.edf"
events_path = edf_path.replace("_eeg.edf", "_events.tsv")
raw = mne.io.read_raw_edf(edf_path, preload=True, verbose=False)
events_df = pd.read_csv(events_path, sep="\t")
data = raw.get_data() # (139 channels, n_samples) in Volts
print(f"Channels: {data.shape[0]}, Samples: {data.shape[1]}, Sfreq: {raw.info['sfreq']} Hz")
print(f"Trials: {len(events_df)}")
print(events_df.head())
# --- Extract trial 0 ---
SFREQ = 256
N_CH_TOTAL = 139
EPOCH_SAMPLES = 2880 # 6.25 sec * 256 Hz + margin (packets of 8)
trigger = data[N_CH_TOTAL - 1]
onsets = np.where(np.diff(trigger) > 0.5)[0] + 1
onset_sample = onsets[0]
start = (onset_sample // 8 - 359) * 8
epoch = data[:, start:start + EPOCH_SAMPLES] # (139, 2880) in Volts
# --- Split into 5 repetitions and average ---
DURA_UNIT = 1.25 # seconds per repetition
samples_per_rep = int(DURA_UNIT * SFREQ) # 320 samples
eeg_128 = epoch[:128] # EEG channels only
# Extract the last 1600 samples (5 reps x 320 samples)
eeg_trial = eeg_128[:, -samples_per_rep * 5:] # (128, 1600)
reps = eeg_trial.reshape(128, 5, samples_per_rep) # (128, 5, 320)
trial_avg = reps.mean(axis=1) # (128, 320) - trial-averaged EEG
# --- Show label info ---
WORD_LABELS = {0: "green", 1: "magenta", 2: "orange", 3: "violet", 4: "yellow"}
label = events_df.iloc[0]["value"]
print(f"\nTrial 0: label={label}, color={WORD_LABELS[label]}")
print(f"Trial-averaged EEG shape: {trial_avg.shape}") # (128, 320)
print(f" Mean amplitude: {trial_avg.mean():.6e} V")
print(f" Std amplitude: {trial_avg.std():.6e} V")

- Install requirements:
uv sync
- Save preprocessed EEG/EMG:
uv run python plot_figures/make_preproc_files.py
- Visualization of preprocessing pipeline (Fig. 1):
uv run python plot_figures/plot_preprocesssing.py
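The exact preprocessing steps live in plot_figures/plot_preprocesssing.py; as a generic sketch only, a typical EEG cleanup applies a bandpass plus a power-line notch (the cutoff and notch frequencies below are assumptions, not the paper's values):

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

SFREQ = 256  # Hz, matches the dataset

def bandpass(x, lo=1.0, hi=40.0, fs=SFREQ, order=4):
    """Zero-phase Butterworth bandpass along the last axis."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x, axis=-1)

def notch(x, freq=50.0, fs=SFREQ, q=30.0):
    """Zero-phase notch to suppress power-line interference."""
    b, a = iirnotch(freq, q, fs)
    return filtfilt(b, a, x, axis=-1)

# Demo: a 10 Hz signal buried in 50 Hz line noise
t = np.arange(2 * SFREQ) / SFREQ
sig = np.sin(2 * np.pi * 10 * t) + 2.0 * np.sin(2 * np.pi * 50 * t)
clean = notch(bandpass(sig))  # line noise strongly attenuated
```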
- Visualization of speech volume (Fig. 1) and RMS of the EMG channels (Fig. 2):
uv run python plot_figures/plot_rms.py
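The figures come from plot_rms.py; conceptually, the RMS envelope of an EMG channel can be sketched with a windowed RMS (the window length here is an arbitrary choice for illustration):

```python
import numpy as np

def sliding_rms(x, win):
    """RMS of x over non-overlapping windows of `win` samples."""
    n = (len(x) // win) * win
    frames = x[:n].reshape(-1, win)
    return np.sqrt((frames ** 2).mean(axis=1))

SFREQ = 256
t = np.arange(4 * SFREQ) / SFREQ
# Simulated EMG: low-amplitude rest, then muscle activity after t = 2 s
emg = np.where(t < 2, 0.1, 1.0) * np.random.default_rng(0).standard_normal(t.size)
rms = sliding_rms(emg, win=SFREQ // 4)  # 0.25 s windows
# The RMS envelope rises once the simulated muscle becomes active
```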
- Quantify the level of EMG contamination in the EEG (mutual information, Fig. 2):
uv run python plot_figures/plot_mis.py
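plot_mis.py implements the paper's analysis; as a minimal sketch of the underlying idea, a histogram-based mutual-information estimate scores how much an EEG channel shares with an EMG channel (the bin count and synthetic mixing below are illustrative assumptions):

```python
import numpy as np

def mutual_info(x, y, bins=16):
    """Histogram estimate of mutual information (in nats) between two 1-D signals."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0  # avoid log(0) on empty bins
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
emg = rng.standard_normal(5000)
eeg_contaminated = 0.8 * emg + 0.2 * rng.standard_normal(5000)
eeg_clean = rng.standard_normal(5000)
mi_hi = mutual_info(emg, eeg_contaminated)
mi_lo = mutual_info(emg, eeg_clean)
# The contaminated channel shares far more information with the EMG
```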
- Train decoders. You can specify in parallel_sets which subjects' and which sessions' data to train on:
uv run python uhd_eeg/trainers/trainer.py -m hydra/launcher=joblib parallel_sets=subject1-1,subject1-2,subject1-3
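The real decoders are trained by uhd_eeg/trainers/trainer.py under Hydra; purely as a dependency-light illustration of the task shape (not the paper's model), a nearest-centroid classifier over trial-averaged (128, 320) epochs for the 5 color words:

```python
import numpy as np

# Synthetic stand-in for trial-averaged epochs: (n_trials, 128 ch, 320 samples)
rng = np.random.default_rng(0)
class_means = rng.standard_normal((5, 128, 320))  # one template per color word
labels = np.repeat(np.arange(5), 10)              # 10 trials per word
X = class_means[labels] + 0.5 * rng.standard_normal((50, 128, 320))

def fit_centroids(X, y, n_classes=5):
    """Mean epoch per class."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict(X, centroids):
    """Assign each trial to the nearest class centroid (Euclidean distance)."""
    d = np.stack([((X - c) ** 2).sum(axis=(1, 2)) for c in centroids], axis=1)
    return d.argmin(axis=1)

centroids = fit_centroids(X, labels)
acc = (predict(X, centroids) == labels).mean()
```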
- Copy the trained models and metrics to data/
- Run inference on the online data and evaluate the metrics (Table 1, 2, Fig. S1):
uv run python plot_figures/evaluate_accs.py
- Visualization of electrodes used when hypothetically reducing electrode density (Fig. S1):
uv run python plot_figures/show_montage_decimation.py
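show_montage_decimation.py draws the actual montages; the index selection for hypothetically thinning 128 electrodes can be sketched as uniform subsampling (the paper's spatial selection scheme may differ):

```python
import numpy as np

def decimate_channels(n_total=128, keep=32):
    """Evenly spaced channel indices for a reduced-density montage."""
    return np.linspace(0, n_total - 1, keep).round().astype(int)

for keep in (128, 64, 32, 16):
    idx = decimate_channels(keep=keep)
    # e.g. slice a (128, 320) epoch down to `keep` channels: epoch[idx]
    print(keep, idx[:5])
```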
- Analysis of decoding contributions (integrated gradients, Fig. 3-5, Fig. S2):
uv run python plot_figures/plot_contribution.py
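plot_contribution.py computes the paper's attribution maps for the trained decoders; the integrated-gradients attribution itself reduces to a Riemann-sum path integral, sketched here in NumPy with finite-difference gradients on a toy model (not the repository's implementation):

```python
import numpy as np

def integrated_gradients(f, x, baseline, steps=64, eps=1e-5):
    """IG via midpoint Riemann sum: (x - b) * mean_k grad f(b + a_k (x - b))."""
    alphas = (np.arange(steps) + 0.5) / steps
    grad_sum = np.zeros_like(x)
    for a in alphas:
        p = baseline + a * (x - baseline)
        # Central finite-difference gradient of f at p
        g = np.array([(f(p + eps * e) - f(p - eps * e)) / (2 * eps)
                      for e in np.eye(x.size)])
        grad_sum += g
    return (x - baseline) * grad_sum / steps

f = lambda z: float(z @ np.array([1.0, -2.0, 3.0]))  # toy linear "decoder"
x = np.array([1.0, 1.0, 1.0])
b = np.zeros(3)
attr = integrated_gradients(f, x, b)
# Completeness axiom: attributions sum to f(x) - f(baseline)
```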
Internal scripts (BIDS conversion, integration tests) require access to raw data on a NAS.
Configure paths by creating a .env file in the project root:
# .env
PARTICIPAT_MAPPING_PATH=/path/to/participant_mapping_gmail.json
RAW_ROOT=/path/to/nas/raw_data
BIDS_ROOT=/path/to/nas/bids_output
OPENNEURO_API_KEY=your_api_key

RAW_ROOT: Root directory of the raw EEG data on NAS (contains subject directories)
BIDS_ROOT: Output directory for BIDS conversion on NAS
PARTICIPAT_MAPPING_PATH: Path to the participant name→BIDS ID mapping JSON
OPENNEURO_API_KEY: API key for uploading to OpenNeuro
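These variables can be read with python-dotenv, or with a few lines of stdlib Python; a hedged sketch (the repository's scripts may load them differently):

```python
import os

def load_env(text):
    """Parse KEY=value lines (comments and blanks ignored) into os.environ."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        # setdefault keeps any value already exported in the shell
        os.environ.setdefault(key.strip(), value.strip())

load_env("""
# .env
RAW_ROOT=/path/to/nas/raw_data
BIDS_ROOT=/path/to/nas/bids_output
""")
print(os.environ["RAW_ROOT"])  # /path/to/nas/raw_data
```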
