IBL Neuromodulators

Fiber photometry analysis pipeline for the IBL neuromodulators project. Ingests session metadata and raw signals from Alyx/ONE, applies QC, preprocesses photometry signals, and extracts peri-event neural responses.

Setup

This project requires the IBL unified environment with specific development branches of ibllib and ibl-photometry.

Fresh install

# 1. Create and activate a virtual environment
uv venv .venv --prompt ibl --python 3.13
source .venv/bin/activate

# 2. Install IBL packages from the required branches
uv pip install git+https://github.com/int-brain-lab/ibllib@photometry-integration
uv pip install git+https://github.com/int-brain-lab/ibl-photometry@develop

# 3. Install iblnm (editable) with dev tools
uv pip install -e ".[dev]"

Existing IBL environment

If you already have the IBL environment with the correct ibllib and ibl-photometry branches:

uv pip install -e ".[dev]"

This installs iblnm plus any missing core dependencies (xarray, statsmodels, cca-zoo, etc.) and dev tools (pytest, ruff).

Verify

pytest                     # run tests
ruff check .               # lint

Docs: IBL · ONE API

Pipeline

Scripts run in order. Each produces a parquet error log alongside its outputs.

query_database.py → photometry.py → task.py → wheel.py → dataset_overview.py
       ↓                  ↓             ↓          ↓              ↓
  sessions.pqt      {eid}.h5 +    performance  {eid}.h5      figures +
                  qc_photometry      .pqt     (wheel group)  errors.pqt

Run the full pipeline or resume from a specific stage:

python scripts/run_pipeline.py                    # all stages
python scripts/run_pipeline.py --from photometry  # resume from photometry
python scripts/run_pipeline.py --only task        # single stage
python scripts/run_pipeline.py --skip-errors      # continue past failures
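
The flags above can be read as a simple selection over an ordered stage list. The following is a minimal sketch of that idea, not the actual contents of run_pipeline.py: the stage names are taken from the diagram, while `select_stages` and `parse_args` are hypothetical helpers written for illustration.

```python
import argparse

# Stage names in execution order, taken from the pipeline diagram.
STAGES = ["query_database", "photometry", "task", "wheel", "dataset_overview"]


def select_stages(start=None, only=None):
    """Return the stages to run (hypothetical helper, not the project's API)."""
    if only is not None:
        if only not in STAGES:
            raise ValueError(f"unknown stage: {only}")
        return [only]
    if start is not None:
        if start not in STAGES:
            raise ValueError(f"unknown stage: {start}")
        # Resume: keep the named stage and everything after it.
        return STAGES[STAGES.index(start):]
    return list(STAGES)


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # dest= avoids an attribute named "from", which is a Python keyword.
    parser.add_argument("--from", dest="start", choices=STAGES)
    parser.add_argument("--only", choices=STAGES)
    parser.add_argument("--skip-errors", action="store_true")
    return parser.parse_args(argv)


args = parse_args(["--from", "photometry"])
to_run = select_stages(start=args.start, only=args.only)
print(to_run)
```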

Stage 1: `query_database.py` — Session metadata

Queries the ibl_fibrephotometry project on Alyx, enriches each session with subject info (strain, line, neuromodulator), brain regions, hemisphere, and dataset availability. Validates all metadata fields.

python scripts/query_database.py                 # incremental update
python scripts/query_database.py --redownload    # re-download everything
python scripts/query_database.py --extended-qc   # also fetch Alyx extended QC

Output: metadata/sessions.pqt, metadata/query_database_log.pqt

Stage 2: `photometry.py` — QC, preprocessing, response extraction

Processes each session through a tiered pipeline:

1. Load trials and photometry from ONE
2. Validate that trials fall within the photometry recording window (fatal)
3. Raw QC: check for band inversions and early samples (fatal)
4. Sliding QC: compute signal quality metrics (fatal)
5. Preprocess: bleach correction → isosbestic regression → z-score → resample to 30 Hz
6. Extract peri-event responses for stimOn_times, firstMovement_times, feedback_times
7. Save signal and responses to HDF5

Output: data/sessions/{eid}.h5, data/qc_photometry.pqt, metadata/photometry_log.pqt
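
To make the preprocessing step concrete, here is a minimal NumPy sketch of isosbestic regression, z-scoring, and resampling to 30 Hz on synthetic data. It is an illustration under assumptions, not the project's implementation: bleach correction is omitted, the regression is assumed to be an ordinary least-squares linear fit, and the resampling is assumed to be linear interpolation. All function names are hypothetical.

```python
import numpy as np


def regress_out_isosbestic(signal, iso):
    """Subtract the least-squares linear fit of the isosbestic band (assumed method)."""
    slope, intercept = np.polyfit(iso, signal, deg=1)
    return signal - (slope * iso + intercept)


def zscore(x):
    """Center to zero mean and scale to unit standard deviation."""
    return (x - x.mean()) / x.std()


def resample(times, x, fs=30.0):
    """Linearly interpolate onto a regular grid at fs Hz (assumed method)."""
    new_times = np.arange(times[0], times[-1], 1.0 / fs)
    return new_times, np.interp(new_times, times, x)


# Synthetic 10 s recording sampled at irregular times.
rng = np.random.default_rng(0)
times = np.sort(rng.uniform(0.0, 10.0, size=600))
iso = 1.0 + 0.1 * np.sin(times)  # artifact shared by both bands
gcamp = 2.0 * iso + 0.05 * rng.standard_normal(times.size)

corrected = regress_out_isosbestic(gcamp, iso)
t30, z30 = resample(times, zscore(corrected), fs=30.0)
```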

Stage 3: `task.py` — Task performance

Computes per-session metrics: fraction correct, no-go fraction, and psychometric function parameters (bias, threshold, lapses) for the 50/50 block. For biased and ephys sessions, it also computes per-block psychometrics and the bias shift.

Output: data/performance.pqt, metadata/task_log.pqt
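
The simpler metrics can be sketched directly from the trial columns documented under Data Files (choice is 0 for no-go, feedbackType is 1 for reward). This is an illustrative sketch: `basic_metrics` is a hypothetical function, contrast is assumed to be stored as a fraction (1.0 for 100%), and no-go trials are assumed to count toward the denominator of fraction correct.

```python
def basic_metrics(choice, feedback_type, contrast):
    """Per-session summary metrics from per-trial lists (hypothetical helper)."""
    n = len(choice)
    correct = [f == 1 for f in feedback_type]
    # "Easy" trials: 100% contrast, assumed to be encoded as 1.0.
    easy = [c for c, k in zip(correct, contrast) if k == 1.0]
    return {
        "n_trials": n,
        "fraction_correct": sum(correct) / n,
        "fraction_correct_easy": sum(easy) / len(easy) if easy else float("nan"),
        "nogo_fraction": sum(1 for c in choice if c == 0) / n,
    }


metrics = basic_metrics(
    choice=[-1, 1, 0, 1],
    feedback_type=[1, -1, -1, 1],
    contrast=[1.0, 0.25, 1.0, 1.0],
)
print(metrics)
```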

Stage 4: `wheel.py` — Per-trial wheel velocity

Extracts wheel velocity for each trial (stimOn → feedback), NaN-padded to the longest trial. Appends a wheel/ group to existing HDF5 files.

Output: appended data/sessions/{eid}.h5, metadata/wheel_log.pqt
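
NaN-padding to the longest trial turns ragged per-trial traces into one rectangular (trial, sample) array. A minimal sketch, with `pad_trials` as a hypothetical helper and float32 chosen to match the dtype listed for wheel/responses/velocity in the HDF5 layout:

```python
import numpy as np


def pad_trials(trial_arrays):
    """Stack variable-length traces into a (T, W) array, right-padded with NaN."""
    n_max = max(len(a) for a in trial_arrays)
    out = np.full((len(trial_arrays), n_max), np.nan, dtype=np.float32)
    for i, a in enumerate(trial_arrays):
        out[i, :len(a)] = a
    return out


velocity = pad_trials([[0.1, 0.2, 0.3], [0.5], [0.0, -0.4]])
print(velocity.shape)
```

Reductions over the padded array should use NaN-aware functions such as np.nanmean so the padding does not bias the result.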

Stage 5: `dataset_overview.py` — Session coverage figures

Joins sessions.pqt, qc_photometry.pqt, performance.pqt, and all error logs. Produces session-by-session overview matrices at each processing stage, plus barplots of complete recordings per brain target and per mouse.

Output: figures/dataset_overview/

Analysis scripts

Script	Purpose
`responses.py`	Trial-level response magnitudes, LMM fits, response feature vectors, similarity, and decoding
`movement_encoding.py`	LOSO cross-validated model comparison of contrast vs. timing predictors, per-contrast slopes
`task_encoding.py`	Per-session GLM encoding decomposed via PCA/ICA, per-cohort CCA
`task_performance.py`	Learning curves, psychometric trajectories per target
`qc_overview.py`	QC metric distributions (histograms, violins, PCA, temporal trends)
`video.py`	Video QC pipeline (timestamps, dropped frames, pin state)
`session_viewer.py`	Interactive single-session viewer (raw + preprocessed + PSTHs)
`example_session.py`	Annotated example of loading and plotting a session

PhotometrySession

PhotometrySession wraps a row from sessions.pqt and provides methods for loading, validating, preprocessing, and extracting responses. It extends PhotometrySessionLoader from brainbox.io.one.

Data attributes are lazy-loaded: trials, photometry, responses, qc, and wheel_velocity start empty and are populated by explicit method calls.

Loading from ONE

import pandas as pd
from one.api import ONE
from iblnm.config import SESSIONS_FPATH
from iblnm.data import PhotometrySession

one = ONE()
df_sessions = pd.read_parquet(SESSIONS_FPATH)
session_row = df_sessions.iloc[0]

ps = PhotometrySession(session_row, one=one)
ps.load_trials()      # → ps.trials (DataFrame with signed_contrast, contrast added)
ps.load_photometry()  # → ps.photometry dict: {'GCaMP': ..., 'Isosbestic': ...}

Loading from HDF5

If the pipeline has already run, load preprocessed data from disk:

from iblnm.config import SESSIONS_H5_DIR

ps = PhotometrySession(session_row, one=one)
ps.load_h5(SESSIONS_H5_DIR / f'{ps.eid}.h5')
# → ps.photometry['GCaMP_preprocessed'], ps.trials, ps.responses, ps.wheel_velocity

Validation

Each method raises a typed exception on failure. In scripts, pass an exlog list to log errors instead of raising (see Error Handling below).

ps.validate_trials_in_photometry_time()  # raises TrialsNotInPhotometryTime
ps.validate_n_trials()                   # raises InsufficientTrials
ps.validate_event_completeness()         # raises IncompleteEventTimes
ps.validate_block_structure()            # raises BlockStructureBug

QC

ps.run_raw_qc()                    # n_band_inversions, n_early_samples → ps.qc
ps.validate_qc()                   # raises BandInversion or EarlySamples
ps.run_sliding_qc()                # sliding-window signal quality metrics → ps.qc
ps.validate_few_unique_samples()   # raises FewUniqueSamples (non-fatal)

After QC, ps.qc is a DataFrame with one row per (brain_region, band).
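
Two of the raw metrics are simple enough to sketch from their descriptions in the qc_photometry.pqt table (samples where GCaMP < Isosbestic, and the fraction of unique values with < 0.05 flagged). The function names below are hypothetical and the code is an illustration, not the project's implementation.

```python
def count_band_inversions(gcamp, isosbestic):
    """Number of samples where the GCaMP band is below the isosbestic band."""
    return sum(1 for g, i in zip(gcamp, isosbestic) if g < i)


def unique_fraction(samples):
    """Fraction of distinct values among all samples."""
    return len(set(samples)) / len(samples)


gcamp = [1.2, 1.1, 0.9, 1.3]
isosbestic = [1.0, 1.0, 1.0, 1.0]

n_inversions = count_band_inversions(gcamp, isosbestic)
frac_unique = unique_fraction([0.5, 0.5, 0.5, 0.7])
flagged = frac_unique < 0.05  # threshold from the QC table
```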

Preprocessing and response extraction

from iblnm.config import RESPONSE_EVENTS

ps.preprocess()  # bleach → isosbestic → zscore → resample to 30 Hz
                 # → ps.photometry['GCaMP_preprocessed']

ps.extract_responses(events=RESPONSE_EVENTS)
# → ps.responses: dict[str, xr.DataArray] keyed by brain region,
#   each DataArray has dims (event, trial, time)

ps.save_h5()  # saves all available data groups
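
Conceptually, response extraction cuts a fixed window around each event time and stacks the windows into a (trial, time) array per event. The sketch below shows one way to do this with linear interpolation. It is an assumption-laden illustration: `extract_windows` is hypothetical, the [-1, 1] s window and 30 Hz rate come from the HDF5 attrs documented under Data Files, and trials with missing event times are assumed to be left as NaN.

```python
import numpy as np


def extract_windows(times, signal, event_times, window=(-1.0, 1.0), fs=30.0):
    """Return (rel_times, responses) with responses of shape (trial, time)."""
    n = int(round((window[1] - window[0]) * fs))  # 60 samples for 2 s at 30 Hz
    rel_times = window[0] + np.arange(n) / fs
    out = np.full((len(event_times), n), np.nan)
    for i, t0 in enumerate(event_times):
        if np.isnan(t0):
            continue  # missing event: leave the row as NaN
        # Samples outside the recording become NaN rather than being clamped.
        out[i] = np.interp(t0 + rel_times, times, signal, left=np.nan, right=np.nan)
    return rel_times, out


times = np.arange(0.0, 10.0, 1.0 / 30.0)
signal = times.copy()  # ramp: the value at each sample equals its time
rel, resp = extract_windows(times, signal, np.array([2.0, 5.0, np.nan]))
print(resp.shape)
```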

Working with responses

Response transforms operate on a single region's DataArray at a time:

region_responses = ps.responses['VTA']  # dims: (event, trial, time)

# Baseline subtraction (mean of [-0.1, 0] window)
responses = ps.subtract_baseline(region_responses)

# Mask time points after the next event in a trial sequence
responses = ps.mask_subsequent_events(
    region_responses,
    event_order=['stimOn_times', 'firstMovement_times', 'feedback_times']
)
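
For intuition, baseline subtraction on a plain (trial, time) array looks like the sketch below. This standalone `subtract_baseline` is a hypothetical NumPy function, not the PhotometrySession method, and the half-open window [-0.1, 0) is an assumption about how the endpoints are treated.

```python
import numpy as np


def subtract_baseline(responses, rel_times, window=(-0.1, 0.0)):
    """Subtract each trial's mean over the pre-event baseline window."""
    in_window = (rel_times >= window[0]) & (rel_times < window[1])
    baseline = np.nanmean(responses[:, in_window], axis=1, keepdims=True)
    return responses - baseline


rel_times = -1.0 + np.arange(60) / 30.0
responses = np.vstack([
    np.full(60, 2.0),            # constant offset, should become all zeros
    np.linspace(0.0, 1.0, 60),   # ramp, baseline window should average to zero
])
corrected = subtract_baseline(responses, rel_times)
```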

Task performance

perf = ps.basic_performance()
# {'fraction_correct': 0.81, 'fraction_correct_easy': 0.94, 'nogo_fraction': 0.02,
#  'psych_50_bias': -1.2, 'psych_50_threshold': 8.4, ...}

block_perf = ps.block_performance()   # per-block psychometrics (biased/ephys only)
fit = ps.fit_psychometric()           # {bias, threshold, lapse_left, lapse_right, r_squared, n_trials}
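
The fit parameters can be interpreted through a psychometric function. This README does not state which functional form is fit, so the sketch below assumes a commonly used error-function curve with two lapse rates, with `x` as signed contrast and the output as the probability of a rightward choice. Treat it as a reading aid for the parameter names only.

```python
import math


def erf_psychometric(x, bias, threshold, lapse_left, lapse_right):
    """Assumed form: lapse-bounded cumulative Gaussian shaped by erf."""
    core = 0.5 * (1.0 + math.erf((x - bias) / threshold))
    return lapse_left + (1.0 - lapse_left - lapse_right) * core


# At x == bias the curve sits midway between its two asymptotes.
p_mid = erf_psychometric(-1.2, bias=-1.2, threshold=8.4,
                         lapse_left=0.05, lapse_right=0.10)
p_high = erf_psychometric(1000.0, bias=-1.2, threshold=8.4,
                          lapse_left=0.05, lapse_right=0.10)
```

Under this form, bias shifts the curve horizontally, threshold sets its slope, and the lapse rates set the lower and upper asymptotes.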

PhotometrySessionGroup

PhotometrySessionGroup is the central class for all multi-session analyses. It manages session-level filtering and recording-level explosion internally.

Design principles

- Constructor takes session-level DataFrames. List columns (brain_region, hemisphere, target_NM) are kept intact. Explosion to one-row-per-recording happens via explode_recordings().
- filter_sessions filters at the session level by session type, excluded subjects, QC error types, and target-NM values. Sessions where none of their target_NM entries match are dropped.
- explode_recordings produces recording-level rows from the filtered sessions, trimming to only valid target_NM entries and adding fiber_idx.
- from_catalog handles the full pipeline: load parquet, validate parallel lists, filter sessions, explode recordings.
- Lazy analysis attributes. events, response_features, similarity_matrix, and decoder start as None and are populated by explicit method calls.
- Iterable. `for rec, ps in group` yields (recording_row, PhotometrySession) pairs. Sessions are cached by eid so loading an H5 once serves all regions.
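
The explosion step described above can be illustrated without pandas. The sketch below validates the parallel lists, keeps only recordings whose target_NM is in an allowed set, and records each fiber's position. It is a hypothetical stand-in for explode_recordings, and it assumes fiber_idx is the index of the recording within the session's original lists.

```python
def explode_recordings(sessions, valid_targets):
    """Turn session dicts with parallel lists into one dict per recording."""
    recordings = []
    for s in sessions:
        lists = (s["brain_region"], s["hemisphere"], s["target_NM"])
        if len({len(x) for x in lists}) != 1:
            raise ValueError(f"parallel lists differ in length for {s['eid']}")
        for fiber_idx, (region, hemi, target) in enumerate(zip(*lists)):
            if target in valid_targets:
                recordings.append({
                    "eid": s["eid"],
                    "fiber_idx": fiber_idx,
                    "brain_region": region,
                    "hemisphere": hemi,
                    "target_NM": target,
                })
    return recordings


sessions = [
    {"eid": "a", "brain_region": ["VTA", "DR"], "hemisphere": ["l", "r"],
     "target_NM": ["VTA-DA", "DR-5HT"]},
    {"eid": "b", "brain_region": ["LC"], "hemisphere": ["l"],
     "target_NM": ["LC-NE"]},
]
recs = explode_recordings(sessions, valid_targets={"VTA-DA", "DR-5HT"})
```

Session "b" contributes no rows here because none of its target_NM entries match, which mirrors the session-level drop described for filter_sessions.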

Usage

from iblnm.config import SESSIONS_FPATH
from iblnm.data import PhotometrySessionGroup
from iblnm.io import _get_default_connection

one = _get_default_connection()

# Load, filter, and explode in one step
group = PhotometrySessionGroup.from_catalog(
    SESSIONS_FPATH, one=one,
    session_types=('biased', 'ephys'),
)

Analysis methods

# Trial-level response magnitudes (one row per recording × event × trial)
group.get_response_magnitudes()
# → group.response_magnitudes (DataFrame)

# Response feature vectors (one row per recording, columns = condition labels)
group.get_response_features(nan_handling='drop_features')
# → group.response_features (DataFrame indexed by (eid, target_NM))

# Pairwise cosine similarity between recordings
group.response_similarity_matrix()
# → group.similarity_matrix (DataFrame)

# Decode target-NM from response vectors (logistic regression with LOSO CV)
group.decode_target()
# → group.decoder (TargetNMDecoder with .accuracy, .confusion, .coefficients, .contributions)
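
As a reference for the similarity step, pairwise cosine similarity between feature vectors can be sketched in a few lines of NumPy. The function below is hypothetical. It assumes nan_handling='drop_features' means dropping every feature column that contains a NaN in any recording, which this README does not spell out.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between rows after dropping NaN columns."""
    keep = ~np.isnan(features).any(axis=0)  # assumed 'drop_features' behaviour
    x = features[:, keep]
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T


features = np.array([
    [1.0, 0.0, np.nan],
    [2.0, 0.0, 1.0],
    [0.0, 3.0, 1.0],
])
sim = cosine_similarity_matrix(features)
print(sim)
```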

Filtering and subsetting

# Standard filters (all parameters optional, default to config values)
group.filter_sessions(
    session_types=('biased', 'ephys'),
    exclude_subjects=['excluded_mouse'],
    qc_blockers={'MissingRawData', 'QCValidationError'},
    targetnms=['VTA-DA', 'DR-5HT'],
)

# Boolean mask
group.filter(group.recordings['NM'] == 'DA')

# Indexing
rec, ps = group[0]  # first recording

Iteration

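from iblnm.config import SESSIONS_H5_DIR

for rec, ps in group:
    # rec: pd.Series (recording metadata)
    # ps: PhotometrySession (cached by eid, loads H5 on first access)
    ps.load_h5(SESSIONS_H5_DIR / f'{ps.eid}.h5')  # same path pattern as in "Loading from HDF5"
    ...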

Error Handling

The @exception_logger decorator is the central pattern for batch processing. Functions decorated with it accept an optional exlog parameter:

- Without exlog: exceptions propagate normally (used in tests)
- With exlog=[]: exceptions are caught, logged as dicts, and the original row is returned so the pipeline continues

import pytest

from iblnm.validation import exception_logger, InvalidBrainRegion

@exception_logger
def validate_brain_region(session):
    ...
    raise InvalidBrainRegion(...)

# In scripts: errors are logged and the pipeline continues
# (df is the sessions DataFrame being validated)
error_log = []
df = df.apply(validate_brain_region, axis='columns', exlog=error_log)

# In tests: errors are raised
with pytest.raises(InvalidBrainRegion):
    validate_brain_region(bad_session)

Error log entries follow the schema: ['eid', 'error_type', 'error_message', 'traceback'].
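
To show how such a decorator can behave, here is a minimal self-contained sketch that follows the two modes and the log schema described above. It is not the code in iblnm/validation.py: the wrapper internals, the stand-in InvalidBrainRegion class, and the example validator are all written for illustration.

```python
import functools
import traceback


def exception_logger(func):
    """Sketch: raise normally, or log to `exlog` and return the row unchanged."""
    @functools.wraps(func)
    def wrapper(row, *args, exlog=None, **kwargs):
        if exlog is None:
            return func(row, *args, **kwargs)  # no log given: let errors propagate
        try:
            return func(row, *args, **kwargs)
        except Exception as exc:
            exlog.append({
                "eid": row.get("eid"),
                "error_type": type(exc).__name__,
                "error_message": str(exc),
                "traceback": traceback.format_exc(),
            })
            return row  # hand back the original row so batch processing continues
    return wrapper


class InvalidBrainRegion(Exception):
    """Stand-in for the project's exception of the same name."""


@exception_logger
def validate_brain_region(session):
    if session["brain_region"] not in {"VTA", "DR"}:
        raise InvalidBrainRegion(session["brain_region"])
    return session


log = []
bad = {"eid": "abc", "brain_region": "XYZ"}
result = validate_brain_region(bad, exlog=log)
```

Checking `exlog is None` rather than truthiness matters here, because an empty list is the normal way to start a log and must still switch on logging mode.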

Downstream scripts read upstream error logs via collect_session_errors() and filter sessions based on which error types are present, rather than re-validating.
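
The filtering idea can be sketched as a set lookup over log entries that follow the schema above. `sessions_without_blockers` is a hypothetical helper and does not reproduce collect_session_errors(). The blocker names reuse the qc_blockers example from the filtering section.

```python
def sessions_without_blockers(eids, error_log, blockers):
    """Keep only sessions that have no logged error of a blocking type."""
    blocked = {e["eid"] for e in error_log if e["error_type"] in blockers}
    return [eid for eid in eids if eid not in blocked]


error_log = [
    {"eid": "a", "error_type": "MissingRawData", "error_message": "", "traceback": ""},
    {"eid": "b", "error_type": "FewUniqueSamples", "error_message": "", "traceback": ""},
]
kept = sessions_without_blockers(
    ["a", "b", "c"], error_log, blockers={"MissingRawData", "QCValidationError"}
)
print(kept)
```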

Data Files

`metadata/sessions.pqt` — one row per session

Column	Type	Description
`eid`	str	Alyx session UUID
`subject`	str	Mouse name
`start_time`	str	ISO 8601 session start
`session_type`	str	training / biased / ephys / habituation / histology
`NM`	str	Neuromodulator: DA, 5HT, NE, ACh
`brain_region`	list[str]	Recording targets, e.g. `['VTA', 'SNc']`
`hemisphere`	list[str]	Hemisphere per region, e.g. `['l', 'r']`
`target_NM`	list[str]	Combined labels, e.g. `['VTA-DA', 'SNc-DA']`
`lab`	str	Recording lab
`day_n`	int	Days since subject's first session
`session_n`	float	Session index (dense rank within subject)
`session_length`	float	Duration in seconds
`strain`, `line`, `genotype`	str	Mouse genetics
`datasets`	list[str]	ALF dataset paths available on ONE

brain_region, hemisphere, and target_NM are parallel lists that must always have matching lengths. To get one row per recording, explode all three together: df.explode(['brain_region', 'hemisphere', 'target_NM']).

`results/responses/responses.pqt` — one row per (recording × event × trial)

Column	Type	Description
`eid`	str	Session UUID
`subject`	str	Mouse name
`session_type`	str	biased / ephys
`NM`	str	Neuromodulator
`target_NM`	str	Target-NM label
`brain_region`	str	Recording target
`hemisphere`	str	l / r
`event`	str	stimOn_times / firstMovement_times / feedback_times
`trial`	int	Trial index
`signed_contrast`	float	Signed stimulus contrast
`contrast`	float	Unsigned stimulus contrast
`choice`	float	-1 left / 0 no-go / 1 right
`feedbackType`	float	1 reward / -1 punishment
`probabilityLeft`	float	Block probability
`stim_side`	str	left / right
`reaction_time`	float	firstMovement - stimOn (seconds)
`response_early`	float	Mean response in early window (0.1-0.35s)

`results/responses/response_matrix.pqt` — one row per recording

Response feature vectors indexed by (eid, target_NM). Each column is a condition label encoding event × contrast × laterality × feedback (e.g. stimOn_c1_contra_correct). Values are mean response magnitudes in the early window.

`data/qc_photometry.pqt` — one row per (session, brain region, band)

Column	Type	Description
`eid`	str	Session UUID
`brain_region`	str	Single recording target
`band`	str	GCaMP or Isosbestic
`n_unique_samples`	float	Fraction of unique values (< 0.05 flagged)
`n_band_inversions`	int	Samples where GCaMP < Isosbestic (> 0 fatal)
`n_early_samples`	int	Samples before recording start (> 0 fatal)
`ar_score`	float	AR(1) autocorrelation coefficient
`median_absolute_deviance`	float	MAD of signal
`percentile_asymmetry`	float	(p75-p50) / (p50-p25) skewness proxy
`percentile_distance`	float	(p75-p25) / median spread proxy
`bleaching_tau`	float	Photobleaching time constant in seconds (GCaMP only)
`iso_correlation`	float	R² between GCaMP and Isosbestic (GCaMP only)
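
The two percentile-based metrics follow directly from the formulas in the table. A minimal sketch, assuming NumPy's default linear-interpolation percentiles (the project may compute them differently) and a hypothetical function name:

```python
import numpy as np


def percentile_metrics(x):
    """Skewness and spread proxies built from the quartiles of x."""
    p25, p50, p75 = np.percentile(x, [25, 50, 75])
    return {
        "percentile_asymmetry": (p75 - p50) / (p50 - p25),
        "percentile_distance": (p75 - p25) / p50,
    }


# Evenly spaced values 1..101 have quartiles 26, 51, 76.
m = percentile_metrics(np.arange(1.0, 102.0))
print(m)
```

A symmetric distribution gives an asymmetry of 1, and values above or below 1 indicate a longer upper or lower tail, respectively.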

`data/performance.pqt` — one row per session

Column	Type	Description
`eid`	str	Session UUID
`n_trials`	int	Total trial count
`fraction_correct`	float	Overall fraction correct
`fraction_correct_easy`	float	Fraction correct on 100% contrast
`nogo_fraction`	float	Fraction of no-go trials
`psych_50_{param}`	float	Psychometric fit on 50/50 block (bias, threshold, lapse_left, lapse_right, r_squared)
`psych_80_{param}`	float	80% left block fit (biased/ephys only)
`psych_20_{param}`	float	20% left block fit (biased/ephys only)
`bias_shift`	float	psych_80_bias - psych_20_bias

HDF5: `data/sessions/{eid}.h5`

The file is organized into top-level groups (metadata, errors, photometry, trials, wheel) that mirror the PhotometrySession attributes. Each group is read and written by a dedicated handler pair registered in _SAVE_HANDLERS and _LOAD_HANDLERS in iblnm/data.py. Photometry data is organized per brain region so that single-region loads do not require reading the full file.

{eid}.h5
├── metadata/
│   ├── attrs: eid, subject, start_time, number, task_protocol,
│   │          session_type, lab, NM, strain, line, end_time,
│   │          session_length, day_n, session_n, url
│   └── datasets (list-valued fields):
│       genotype, projects, users, brain_region, hemisphere,
│       target_NM, datasets
│
├── errors/
│   ├── eid               str[M]
│   ├── error_type        str[M]
│   ├── error_message     str[M]
│   └── traceback         str[M]
│
├── photometry/
│   └── {brain_region}/
│       ├── preprocessed/
│       │   ├── times     float64 (N,)    sample times at 30 Hz
│       │   ├── signal    float64 (N,)    z-scored, isosbestic-corrected GCaMP
│       │   └── attrs: fs=30
│       │
│       ├── responses/
│       │   ├── times                float64 (W,)     time relative to event
│       │   ├── trials               int64   (T,)     trial indices
│       │   ├── stimOn_times         float64 (T, W)
│       │   ├── firstMovement_times  float64 (T, W)
│       │   ├── feedback_times       float64 (T, W)
│       │   └── attrs: fs=30, response_window=[-1.0, 1.0]
│       │
│       └── qc/
│           └── one dataset per QC metric column (band, brain_region,
│               n_unique_samples, ar_score, bleaching_tau, ...)
│
├── trials/
│   ├── stimOn_times          float64 (T,)
│   ├── firstMovement_times   float64 (T,)
│   ├── feedback_times        float64 (T,)
│   ├── response_times        float64 (T,)
│   ├── choice                float64 (T,)   -1 left, 0 no-go, 1 right
│   ├── feedbackType          float64 (T,)   1 reward, -1 punishment
│   ├── probabilityLeft       float64 (T,)   0.2, 0.5, or 0.8
│   ├── signed_contrast       float64 (T,)   negative = left stimulus
│   ├── contrast              float64 (T,)   unsigned
│   └── stim_side             str     (T,)   'left' or 'right'
│
└── wheel/
    └── responses/
        ├── velocity   float32 (T, W)  per-trial wheel velocity; NaN-padded
        └── attrs: fs=100, t0_event='stimOn_times', t1_event='feedback_times'

N = samples at 30 Hz, T = trial count, W = response window samples (60 for [-1, 1] s at 30 Hz), M = logged error count.

save_h5(groups=...) and load_h5(groups=...) accept the top-level group names ('metadata', 'errors', 'photometry', 'trials', 'wheel') to restrict which handlers run. Omit groups to process everything present.
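
A handler registry with an optional groups filter can be sketched as follows. Only the name _SAVE_HANDLERS comes from this README. The `register_save` decorator, the handler signatures, and the plain dict standing in for an HDF5 file are assumptions made for illustration.

```python
_SAVE_HANDLERS = {}


def register_save(group):
    """Register a function as the save handler for one top-level group."""
    def decorator(func):
        _SAVE_HANDLERS[group] = func
        return func
    return decorator


@register_save("metadata")
def _save_metadata(session, store):
    store["metadata"] = {"eid": session["eid"]}


@register_save("trials")
def _save_trials(session, store):
    store["trials"] = session["trials"]


def save(session, store, groups=None):
    """Run all handlers, or only those named in `groups`."""
    names = list(_SAVE_HANDLERS) if groups is None else list(groups)
    for name in names:
        _SAVE_HANDLERS[name](session, store)


session = {"eid": "abc", "trials": [1, 2, 3]}
partial, full = {}, {}
save(session, partial, groups=["trials"])
save(session, full)
```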

Project Structure

iblnm/                      # Core package (generic, reusable)
  config.py                 # Paths, constants, QC thresholds, color mappings
  data.py                   # PhotometrySession, PhotometrySessionGroup
  io.py                     # Alyx/ONE queries
  task.py                   # Task performance (psychometrics, block validation)
  analysis.py               # Signal processing, LMMs, similarity, decoding
  validation.py             # Custom exceptions, @exception_logger, validate_* functions
  util.py                   # Pandas utilities, parquet I/O, schema enforcement
  vis.py                    # Plotting functions
  gui.py                    # Interactive session viewer widget

scripts/                    # Pipeline stages and analysis scripts
tests/                      # pytest (synthetic fixtures, no Alyx calls)

# Generated outputs (gitignored)
metadata/                   # sessions.pqt, error logs (fibers.csv is tracked)
data/                       # qc_photometry.pqt, performance.pqt, sessions/*.h5
results/                    # Analysis outputs (responses/, movement_encoding/, task_encoding/)
figures/                    # Output plots
specs/                      # Design specs (gitignored, local working docs)
