1 change: 1 addition & 0 deletions .gitignore
Expand Up @@ -9,6 +9,7 @@ empty_log_process_temp.py

tests/**/bids/
tests/test_main_functionality/data/projects/test-project/sub-100
tests/data
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
12 changes: 6 additions & 6 deletions docs/bids_convert_and_upload.md
Expand Up @@ -91,15 +91,15 @@ If otherFilesUsed=True in project config file:

1. Behavioral files are copied via `_copy_behavioral_files()`.

- Validates required files against TOML config (`OtherFilesInfo`). In this config we add the extensions of the expected other files. For example, in our test project we use the EyeLink 1000 Plus eye tracker, which generates .edf and .csv files, so we add these extensions as required other files. We also have mandatory labnotebook and participant info files in .tsv format.
- Renames files to include sub-XXX_ses-YYY_ prefix if missing.
- Deletes the other files in the project_other directory that are not listed in `OtherFilesInfo` in the project config file. It doesn't delete from the source directory, only from our BIDS dataset.
- Validates required files against TOML config (`OtherFilesInfo`). In this config we add the extensions of the expected other files. For example, in our test project we use the EyeLink 1000 Plus eye tracker, which generates .edf and .csv files, so we add these extensions as required other files. We also typically have mandatory labnotebook and participant info files in .tsv format.
- The `"*.src"="beh/{prefix}_target"` syntax allows users to easily add BIDS-compatible custom data from their experiments. Note that `json` sidecars are not generated automatically yet.


2. Experimental files are copied via `_copy_experiment_files()`.

- Gathers files from the experiment folder.
- Gathers files from the `<PROJECTS_OTHER>/experiment/` folder.
- Copies into BIDS `misc/` directory i.e. `<BIDS_ROOT>/misc/`
- Compresses into experiment.tar.gz.
- Compresses into `experiment.tar.gz`.
- Removes the uncompressed folder.

There is a flag in the `lslautobids run` command called `--redo_other_pc` which when specified, forces overwriting of existing other and experiment files in the BIDS dataset. This is useful if there are updates or corrections to the other/behavioral data that need to be reflected in the BIDS dataset.
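
The pattern-to-target mapping described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function name `resolve_targets`, the example file names, and the example mapping are assumptions made for this sketch. It follows the rule stated in the PR that each pattern must match exactly one source file.

```python
import re


def resolve_targets(source_files, expected_other_files, prefix):
    """Map each regex pattern to exactly one source file and its target path.

    `resolve_targets` is a hypothetical helper written for this sketch.
    """
    resolved = {}
    for pattern, target_template in expected_other_files.items():
        regex = re.compile(pattern)
        # re.match only anchors at the start of the file name
        matches = [name for name in source_files if regex.match(name)]
        if not matches:
            raise FileNotFoundError(f"No files matched pattern '{pattern}'")
        if len(matches) > 1:
            raise ValueError(f"Multiple files matched pattern '{pattern}': {matches}")
        # Fill the {prefix} placeholder, e.g. with "sub-003_ses-002"
        resolved[matches[0]] = target_template.format(prefix=prefix)
    return resolved


# Hypothetical mapping and file names, for illustration only
mapping = {
    r".*\.edf": "beh/{prefix}_physio.edf",
    r".*_labnotebook\.tsv": "misc/{prefix}_labnotebook.tsv",
}
files = ["eyetracking.edf", "sub-003_ses-002_labnotebook.tsv"]
result = resolve_targets(files, mapping, "sub-003_ses-002")
print(result)
```

The returned dictionary maps each source file name to a path relative to the subject/session folder of the BIDS dataset. Escaping the dot (`\.`) in the patterns is a choice made in this sketch; the sample configuration in the PR uses unescaped patterns such as `".*.edf"`.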
Expand All @@ -121,7 +121,7 @@ This produces a clean, memory-efficient Raw object ready for BIDS conversion.
#### BIDS Validation (`validate_bids()`)
This function validates the generated BIDS files using the `bids-validator` package. It performs the following steps:
- Walks through the BIDS directory.
- Skips irrelevant files: (`.xdf`, `.tar.gz`, behavioral files, hidden/system files.)
- Skips irrelevant files already ignored in `.bidsignore` (`misc` folder, some hidden files).
- Uses `BIDSValidator` to validate relative paths.
- If any file fails validation, logs an error and returns 0; otherwise, logs success and returns 1.
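
The directory walk and skip logic listed above could look roughly like the sketch below. It is an illustration under stated assumptions: `collect_paths_to_validate` is a hypothetical helper, the demo file names are made up, and the skip rule checks path components, whereas the PR's code tests for the substring `'misc'` in the full path.

```python
import os
import tempfile


def collect_paths_to_validate(bids_root):
    """Return root-relative paths (with a leading '/') that should be validated.

    Hypothetical helper for this sketch; it only gathers paths and does not
    call the validator itself.
    """
    paths = []
    for root, _dirs, files in os.walk(bids_root):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), bids_root)
            parts = rel.split(os.sep)
            # Skip anything under a misc folder and hidden files or folders
            if "misc" in parts or any(part.startswith(".") for part in parts):
                continue
            paths.append("/" + "/".join(parts))
    return sorted(paths)


# Build a tiny throwaway tree with made-up file names to exercise the helper
with tempfile.TemporaryDirectory() as tmp:
    for rel in [
        "dataset_description.json",
        "sub-001/ses-001/eeg/sub-001_ses-001_task-x_eeg.vhdr",
        "sub-001/ses-001/misc/experiment.tar.gz",
        ".bidsignore",
    ]:
        target = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        open(target, "w").close()
    demo_paths = collect_paths_to_validate(tmp)

print(demo_paths)
```

Each returned path can then be passed to `BIDSValidator().is_bids(path)` from the `bids_validator` package, which is the call the PR uses; the leading `/` matches the form used there.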

6 changes: 4 additions & 2 deletions docs/data_organization.md
Expand Up @@ -54,7 +54,7 @@ Filename Convention for the raw data files :

## Project Other Folder

This folder contains the experimental and behavioral files which we also store in the dataverse. The folder structure should be as follows:
This folder contains the experimental and behavioral files which we also store in the dataverse. The folder structure has to be as follows:

projectname/
└── experiment
Expand All @@ -65,6 +65,7 @@ This folder contains the experimental and behavioral files which we also store i
└── beh
└── behavioral_files((lab notebook, CSV, EDF file, etc))

It is possible to use the `src=target` syntax to "skip" folders via `..`. Allowing `{prefix}` in the `src` pattern as well might be a simpler option, but this is not yet implemented.

- **projectname** - any descriptive name for the project
- **experiment** - contains the experimental files for the project. Eg: showOther.m, showOther.py
- **data** - contains the behavioral files for the corresponding subject. Eg: experimentalParameters.csv, eyetrackingdata.edf, results.tsv.
Expand All @@ -91,8 +92,9 @@ This folder contains the converted BIDS data files and other files we want to ve
.........
└── beh
└──behavioral files (other files)
└── misc
└── misc (added to .bidsignore)
└── experimental files (This needs to be stored in zip format)
└── labnotebook, subjectform etc.
└── sourcedata
└── raw xdf files
└── dataset_description.json
7 changes: 3 additions & 4 deletions docs/developers_documentation.md
Expand Up @@ -270,9 +270,8 @@ If otherFilesUsed=True in project config file:

1. Behavioral files are copied via `_copy_behavioral_files()`.

- Validates required files against TOML config (`OtherFilesInfo`). In this config we add the extensions of the expected other files. For example, in our test project we use the EyeLink 1000 Plus eye tracker, which generates .edf and .csv files, so we add these extensions as required other files. We also have mandatory labnotebook and participant info files in .tsv format.
- Renames files to include sub-XXX_ses-YYY_ prefix if missing.
- Deletes the other files in the project_other directory that are not listed in `OtherFilesInfo` in the project config file. It doesn't delete from the source directory, only from our BIDS dataset.
- Validates required files against TOML config (`OtherFilesInfo`). In this config we add the extensions of the expected other files. For example, in our test project we use the EyeLink 1000 Plus eye tracker, which generates .edf and .csv files, so we add these extensions as required other files. We also typically use a mandatory labnotebook and participant info files in .tsv format. Currently it is not possible to convert files in this step, but this should maybe become possible, e.g. for `EDF` files and `CSV=>TSV` conversion.
- Follows the `src=target` regexp syntax to copy files over.

2. Experimental files are copied via `_copy_experiment_files()`.

Expand Down Expand Up @@ -300,7 +299,7 @@ This produces a clean, memory-efficient Raw object ready for BIDS conversion.
#### 5. BIDS Validation (`validate_bids()`)
This function validates the generated BIDS files using the `bids-validator` package. It performs the following steps:
- Walks through the BIDS directory.
- Skips irrelevant files: (`.xdf`, `.tar.gz`, behavioral files, hidden/system files.)
- Skips irrelevant files: (`misc`-folder, hidden/system files.)
- Uses `BIDSValidator` to validate relative paths.
- If any file fails validation, logs an error and returns 0; otherwise, logs success and returns 1.

2 changes: 1 addition & 1 deletion docs/tutorial.md
Expand Up @@ -103,7 +103,7 @@ In this example, we will see how to use the LSLAutoBIDS package to:
otherFilesUsed = true

[OtherFilesInfo]
expectedOtherFiles = [".edf", ".csv", "_labnotebook.tsv", "_participantform.tsv"]
expectedOtherFiles = { ".*.edf" = "misc/{prefix}_et.edf", ".*.csv" = "misc/{prefix}_beh.csv", ".*_labnotebook.tsv" = "misc/{prefix}_labnotebook.tsv", ".*_participantform.tsv" = "{prefix}_participantform.tsv" }
```
2. Run the conversion and upload command to convert the `xdf` files to BIDS format and upload the data to the dataverse.
```
133 changes: 60 additions & 73 deletions lslautobids/convert_to_bids_and_upload.py
Expand Up @@ -2,6 +2,7 @@
import os
import shutil
import sys
import re

from pyxdf import match_streaminfos, resolve_streams
from mnelab.io.xdf import read_raw_xdf
Expand Down Expand Up @@ -92,101 +93,89 @@ def copy_source_files_to_bids(self,xdf_file,subject_id,session_id,other, logger)

def _copy_behavioral_files(self, file_base, subject_id, session_id, logger):
"""
Copy behavioral files to the BIDS structure.
Copy behavioral files to the BIDS structure based on regex patterns.
Iterates through patterns and matches files, copying them directly to target locations.

Args:
file_base (str): Base name of the file (without extension).
subject_id (str): Subject ID.
session_id (str): Session ID.
logger: Logger instance.
"""

project_name = cli_args.project_name
logger.info("Copying the behavioral files to BIDS...")

# Get the TOML configuration
toml_path = os.path.join(project_root, cli_args.project_name, cli_args.project_name + '_config.toml')
data = read_toml_file(toml_path)
_expectedotherfiles = data["OtherFilesInfo"]["expectedOtherFiles"]

if not isinstance(_expectedotherfiles, dict):
raise ValueError("expectedOtherFiles must be a dictionary with regex patterns. List format is no longer supported since v0.2.0.")

# get the source path
behavioural_path = os.path.join(project_other_root,project_name,'data', subject_id,session_id,'beh')
# get the destination path
dest_dir = os.path.join(bids_root , project_name, subject_id , session_id , 'beh')
#check if the directory exists
os.makedirs(dest_dir, exist_ok=True)

processed_files = []
behavioural_path = os.path.join(project_other_root, project_name, 'data', subject_id, session_id, 'beh')

if not os.path.exists(behavioural_path):
raise FileNotFoundError(f"Behavioral path does not exist: {behavioural_path} - did you forget to mount?")
return

# Extract the sub-xxx_ses-yyy part
def extract_prefix(filename):
parts = filename.split("_")
sub = next((p for p in parts if p.startswith("sub-")), None)
ses = next((p for p in parts if p.startswith("ses-")), None)
if sub and ses:
return f"{sub}_{ses}_"
return f"{sub}_{ses}"
return None

prefix = extract_prefix(file_base)

for file in os.listdir(behavioural_path):
# Skip non-files (like directories)
original_path = os.path.join(behavioural_path, file)
if not os.path.isfile(original_path):
continue

if not file.startswith(prefix):
logger.info(f"Renaming {file} to include prefix {prefix}")
renamed_file = prefix + file
else:
renamed_file = file
processed_files = []

processed_files.append(renamed_file)
dest_file = os.path.join(dest_dir, renamed_file)
# Get all files in source directory once
source_files = [f for f in os.listdir(behavioural_path)
if os.path.isfile(os.path.join(behavioural_path, f))]

# Iterate through patterns (not files)
for pattern, target_template in _expectedotherfiles.items():
compiled_regex = re.compile(pattern)

# Find matching files for this pattern
matched_files = [f for f in source_files if compiled_regex.match(f)]

if not matched_files:
raise FileNotFoundError(f"No files matched pattern '{pattern}' in {behavioural_path}")

if len(matched_files) > 1:
raise ValueError(f"Multiple files matched pattern '{pattern}': {matched_files}. Only one file per pattern is supported - manual intervention required")

# Process the first matching file
file = matched_files[0]
original_path = os.path.join(behavioural_path, file)

# Format the target path with prefix
target_path = target_template.format(prefix=prefix)
dest_file = os.path.join(bids_root, project_name, subject_id, session_id, target_path)

# Ensure destination directory exists
os.makedirs(os.path.dirname(dest_file), exist_ok=True)

# Track the relative path for checking
processed_files.append(target_path)

if cli_args.redo_other_pc:
logger.info(f"Copying (overwriting if needed) {file} to {dest_file}")
logger.info(f"Copying (overwriting) {file} to {target_path}")
shutil.copy(original_path, dest_file)
else:
if os.path.exists(dest_file):
logger.info(f"Behavioural file {file} already exists in BIDS. Skipping.")
logger.info(f"Behavioural file {target_path} already exists in BIDS. Skipping.")
else:
logger.info(f"Copying new file {file} to {dest_file}")
logger.info(f"Copying {file} to {target_path}")
shutil.copy(original_path, dest_file)



unnecessary_files = self._check_required_behavioral_files(processed_files, prefix, logger)

# remove the unnecessary files
for file in unnecessary_files:
file_path = os.path.join(dest_dir, file)
if os.path.exists(file_path):
logger.info(f"Removing unnecessary file: {file_path}")
os.remove(file_path)
else:
logger.warning(f"File to remove does not exist: {file_path}")



def _check_required_behavioral_files(self, files, prefix, logger):
"""
Check for required behavioral files after copying.

Args:
files (list): List of copied file names.
prefix (str): Expected prefix (e.g., "sub-001_ses-002_").
"""
logger.info("Checking for required behavioral files...")

# Get the expected file names from the toml file
toml_path = os.path.join(project_root, cli_args.project_name, cli_args.project_name + '_config.toml')
data = read_toml_file(toml_path)

required_files = data["OtherFilesInfo"]["expectedOtherFiles"]


for required_file in required_files:
if not any(f.startswith(prefix) and f.endswith(required_file) for f in files):
raise FileNotFoundError(f"Missing required behavioral file: {required_file}")

unnecessary_files = []
# remove everything except the required files
for file in files:
if not any(file.endswith(required_file) for required_file in required_files):
unnecessary_files.append(file)
return unnecessary_files
logger.info(f"Successfully processed {len(processed_files)} behavioral files")



def _copy_experiment_files(self, subject_id, session_id, logger):
Expand Down Expand Up @@ -350,8 +339,6 @@ def convert_to_bids(self, xdf_path,subject_id,session_id, run_id, task_id,other,
f.write('sourcedata\n')
# ignore the code folder - containing log files
f.write('code\n')
# ignore the beh folder in each sub-xxx/ses-yyys
f.write('**/beh\n')
# ignore the misc folder in each sub-xxx/ses-yyy
f.write('**/misc\n')
# ignore hidden files
Expand Down Expand Up @@ -387,19 +374,19 @@ def validate_bids(self,bids_path,subject_id,session_id, logger):
file_path = os.path.join(root, file)

# Skip non-relevant files
if file_path.endswith(".xdf") or file_path.endswith(".tar.gz") or 'beh' in file_path or file.startswith('.') or '.git' in file_path or os.path.basename(root).startswith('.'):
if 'misc' in file_path or file.startswith('.') or '.git' in file_path or os.path.basename(root).startswith('.'):
continue

if root == root_directory:
# Validate BIDS for files in the root directory
res = BIDSValidator().is_bids(file)
res = BIDSValidator().is_bids('/'+file)
else:
# Modify file path to be relative to the root directory
relative_path = os.path.relpath(file_path, root_directory)
res = BIDSValidator().is_bids('/'+relative_path)

if not res:
print(f"Validation failed for {file_path}")
logger.info(f"Validation failed for {file_path}")


file_paths.append(res)
14 changes: 13 additions & 1 deletion lslautobids/gen_project_config.py
Expand Up @@ -21,7 +21,19 @@

[OtherFilesInfo]
otherFilesUsed = true # Set to true if you want to include other (non-eeg-files) files (experiment files, other modalities like eye tracking) in the dataset, else false
expectedOtherFiles = [".edf", ".csv", "_labnotebook.tsv", "_participantform.tsv"] # List of expected other file extensions. Only the expected files will be copied to the beh folder in BIDS dataset. Give an empty list [] if you don't want any other files to be in the dataset. In this case only experiment files will be zipped and copied to the misc folder in BIDS dataset.

# expectedOtherFiles: Dictionary format with regex patterns
# - The key is a regular expression to match source filenames in the project_other/.../beh/ folder
# - The value is a template path that includes {prefix} (e.g. sub-003_ses-002) and the target folder (beh/ or misc/)
# - Only files matching these patterns will be copied to the BIDS dataset
# the following is a sample configuration, you could also write it in short-hand notation: expectedOtherFiles={ ".*.edf"= "beh/{prefix}_physio.edf", ...}

[OtherFilesInfo.expectedOtherFiles]
".*.edf" = "beh/{prefix}_physio.edf"
".*.csv" = "beh/{prefix}_beh.tsv"
".*_labnotebook.tsv" = "misc/{prefix}_labnotebook.tsv"
".*_participantform.tsv" = "misc/{prefix}_participantform.tsv"


[FileSelection]
ignoreSubjects = ['sub-777'] # List of subjects to ignore during the conversion - Leave empty to include all subjects. Changing this value will not delete already existing subjects.
10 changes: 5 additions & 5 deletions requirements.txt
@@ -1,10 +1,10 @@
pyxdf
mne
mne-bids
bids_validator==1.13.1
datalad-dataverse==1.0.1
datalad-installer==1.0.3
pyDataverse==0.3.1
bids_validator>=1.13.1
datalad-dataverse>=1.0.1
datalad-installer>=1.0.3
pyDataverse>=0.3.1
requests>=2.12.0
jsonschema>=3.2.0
AnnexRemote@git+https://github.com/Lykos153/AnnexRemote.git@master#egg=AnnexRemote
Expand All @@ -13,4 +13,4 @@ pyyaml
mnelab
pybv
pytest
eeglabio
eeglabio