This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
toile is a Python package for working with astrocyte dynamics data. It processes TIFF image stacks (particularly OME-TIFF format) from microscopy recordings and exports them to WebDataset format for machine learning workflows.
This project uses uv for dependency management:
# Install dependencies (including dev dependencies)
uv sync --all-extras --dev
# Install package in editable mode
uv pip install -e .
# Run all tests
uv run pytest
# Run tests with coverage
uv run pytest --cov
# Build distribution
uv build
# Publish to PyPI (requires UV_PUBLISH_TOKEN)
uv publish
# As a module
python -m toile
# Using the installed command
toile
# Export TIFF frames to WebDataset format
toile export frames <input> <output> [--stem <name>] [--uint8] [--verbose]
- TIFF Import (tiff_import.py): Loads TIFF stacks from microscopy recordings
  - Supports OME-TIFF metadata extraction using tifffile and xmltodict
  - Parses frame-level metadata (position, timing, UUIDs)
  - Handles multi-channel recordings
  - Optional uint8 normalization for ML pipelines
- Schema Definitions (schema.py): Data structures built on atdata (a PackableSample framework)
  - Movie: Container for full TIFF stacks with metadata
  - Frame: Individual image frames with metadata
  - SliceRecordingFrame: Experimental session frames with mouse/slice identifiers
  - ImageSample: Simplified image data for ML
  - Uses atdata.lens for data transformations between representations
- Export Pipeline (export.py): Converts TIFF data to WebDataset format
  - Processes individual TIFF files or batch configs (YAML)
  - Writes to sharded tar archives using the webdataset library
  - Configurable shard sizes (default 850MB, or 38MB for PDS/Bluesky compatibility)
  - Supports glob patterns for batch processing
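To make the sharding behaviour concrete, here is a minimal sketch of size-based shard rollover. It uses the standard-library tarfile module as a stand-in for the webdataset library and is not toile's actual export code. The function name write_shards, its parameters, and the "__key__" sample layout are assumptions made for illustration. The byte budget counts payload bytes only and ignores tar header overhead.

```python
import io
import tarfile
import tempfile
from pathlib import Path
from typing import Iterable, List


def write_shards(samples: Iterable[dict], out_dir: Path, stem: str,
                 max_shard_bytes: int) -> List[Path]:
    """Hypothetical helper: write samples to <stem>-000000.tar, <stem>-000001.tar, ...

    Each sample is a dict with a "__key__" string plus extension -> bytes entries.
    Files that share a key prefix inside a tar form one WebDataset-style sample.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    shard_paths: List[Path] = []
    tar = None
    written = 0  # payload bytes written to the current shard
    for sample in samples:
        payload = {k: v for k, v in sample.items() if k != "__key__"}
        sample_bytes = sum(len(v) for v in payload.values())
        # Roll over to a new shard when this sample would exceed the budget.
        if tar is None or (written > 0 and written + sample_bytes > max_shard_bytes):
            if tar is not None:
                tar.close()
            path = out_dir / f"{stem}-{len(shard_paths):06d}.tar"
            tar = tarfile.open(path, "w")
            shard_paths.append(path)
            written = 0
        for ext, data in payload.items():
            info = tarfile.TarInfo(name=f"{sample['__key__']}.{ext}")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
        written += sample_bytes
    if tar is not None:
        tar.close()
    return shard_paths


# Usage: five 40-byte frames with a 100-byte budget, so two frames fit per shard.
with tempfile.TemporaryDirectory() as tmp:
    frames = ({"__key__": f"frame_{i:04d}", "npy": bytes(40)} for i in range(5))
    shards = write_shards(frames, Path(tmp), "demo", max_shard_bytes=100)
    shard_names = [p.name for p in shards]
    with tarfile.open(shards[0]) as first:
        first_members = first.getnames()
```

In a real export, the 850MB or 38MB limits mentioned above would be passed as the byte budget; the tiny numbers here only keep the example fast.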
The CLI uses Typer for command routing:
- Main app in __init__.py with subcommands
- export subcommand group in export.py
  - export frames: Convert TIFFs to per-frame WebDataset archives
  - export test-frames: Generate synthetic test datasets
- atdata: Data structure framework for packable samples
- webdataset: Efficient dataset format for ML pipelines
- scikit-image & tifffile: TIFF file handling
- typer: CLI framework
- xmltodict: OME metadata parsing
The tiff_import.py module includes a flexible filename parsing system:
- _make_filename_parser(): Creates custom parsers from template + transforms
- Built-in transforms: identity, float, int, split_age_sex, date_compact
- Used to extract experimental metadata from file naming conventions
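The template-plus-transforms idea can be sketched as follows. This is an illustrative reconstruction, not the code in tiff_import.py. The transform names come from the list above, but their behaviour is assumed: split_age_sex is taken to read values like "P30M" and date_compact to read YYYYMMDD. The make_filename_parser signature, the "{field}" template syntax, and the underscore-delimited fields are also assumptions.

```python
import re
from datetime import date, datetime
from typing import Any, Callable, Dict


def split_age_sex(value: str) -> Dict[str, Any]:
    """Assumed format: optional 'P', digits for age, then 'M' or 'F' (e.g. 'P30M')."""
    match = re.fullmatch(r"P?(\d+)([MF])", value)
    if match is None:
        raise ValueError(f"cannot split age/sex from {value!r}")
    return {"age": int(match.group(1)), "sex": match.group(2)}


def date_compact(value: str) -> date:
    """Assumed format: YYYYMMDD."""
    return datetime.strptime(value, "%Y%m%d").date()


# Hypothetical transform table keyed by the built-in names listed above.
TRANSFORMS: Dict[str, Callable[[str], Any]] = {
    "identity": lambda s: s,
    "float": float,
    "int": int,
    "split_age_sex": split_age_sex,
    "date_compact": date_compact,
}


def make_filename_parser(template: str,
                         transforms: Dict[str, str]) -> Callable[[str], Dict[str, Any]]:
    """Build a parser from a template such as '{mouse}_{date}_slice{slice}'."""
    # re.split with a capture group alternates literal text and field names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        f"(?P<{part}>[^_]+)" if i % 2 else re.escape(part)
        for i, part in enumerate(parts)
    )
    regex = re.compile(pattern)

    def parse(stem: str) -> Dict[str, Any]:
        match = regex.fullmatch(stem)
        if match is None:
            raise ValueError(f"{stem!r} does not match template {template!r}")
        # Fields without an explicit transform fall back to 'identity'.
        return {
            field: TRANSFORMS[transforms.get(field, "identity")](raw)
            for field, raw in match.groupdict().items()
        }

    return parse


# Usage with a made-up naming convention.
parser = make_filename_parser(
    "{mouse}_{agesex}_{date}_slice{slice}",
    {"agesex": "split_age_sex", "date": "date_compact", "slice": "int"},
)
record = parser("m12_P30M_20240115_slice3")
```

Compiling the template to a regular expression once keeps per-file parsing cheap and makes a mismatched filename fail loudly instead of yielding partial metadata.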
OME-TIFF metadata is extracted and normalized:
- _collate_frame_metadata(): Per-frame position, timing, UUID data
- _collate_metadata(): Image-level acquisition date, physical scales, channel info
- Metadata flows through Movie → Frame → WebDataset samples
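As a rough illustration of the split between image-level and per-frame metadata, the sketch below reads a small hand-written OME-XML string with the standard-library xml.etree.ElementTree. It deliberately does not use tifffile or xmltodict, and it does not reproduce toile's functions. The names collate_metadata and collate_frame_metadata, the returned dict keys, and the sample document are assumptions. UUID handling is omitted.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict, List

# Namespace of the 2016-06 OME schema; other schema versions use a different URI.
OME_NS = {"ome": "http://www.openmicroscopy.org/Schemas/OME/2016-06"}

# Hand-written example document, for illustration only.
SAMPLE_OME_XML = """
<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0">
    <AcquisitionDate>2024-01-15T10:30:00</AcquisitionDate>
    <Pixels ID="Pixels:0" DimensionOrder="XYCZT" Type="uint16"
            SizeX="512" SizeY="512" SizeZ="1" SizeC="1" SizeT="2"
            PhysicalSizeX="0.5" PhysicalSizeY="0.5">
      <Channel ID="Channel:0:0" Name="GCaMP"/>
      <Plane TheC="0" TheZ="0" TheT="0" DeltaT="0.0" PositionX="10.0" PositionY="20.0"/>
      <Plane TheC="0" TheZ="0" TheT="1" DeltaT="0.5" PositionX="10.0" PositionY="20.0"/>
    </Pixels>
  </Image>
</OME>
""".strip()


def collate_metadata(root: ET.Element) -> Dict[str, Any]:
    """Image-level fields: acquisition date, physical pixel sizes, channel names."""
    image = root.find("ome:Image", OME_NS)
    pixels = image.find("ome:Pixels", OME_NS)
    return {
        "acquisition_date": image.findtext("ome:AcquisitionDate", namespaces=OME_NS),
        "physical_size_x": float(pixels.get("PhysicalSizeX")),
        "physical_size_y": float(pixels.get("PhysicalSizeY")),
        "channels": [c.get("Name") for c in pixels.findall("ome:Channel", OME_NS)],
    }


def collate_frame_metadata(root: ET.Element) -> List[Dict[str, Any]]:
    """Per-frame fields taken from each Plane element: time index, timing, position."""
    return [
        {
            "t": int(plane.get("TheT")),
            "delta_t": float(plane.get("DeltaT")),
            "position_x": float(plane.get("PositionX")),
            "position_y": float(plane.get("PositionY")),
        }
        for plane in root.findall(".//ome:Plane", OME_NS)
    ]


root = ET.fromstring(SAMPLE_OME_XML)
image_meta = collate_metadata(root)
frame_meta = collate_frame_metadata(root)
```

Keeping the two collations separate mirrors the Movie and Frame levels described above: image_meta applies to the whole stack, while each entry of frame_meta travels with a single frame.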
Tests are located in the tests/ directory (currently empty). When adding tests:
- Use pytest as the test framework
- Test coverage is enabled with pytest-cov
- CI runs tests on push to main and release branches via GitHub Actions