A processor for converting NWB (Neurodata Without Borders) files into chunked timeseries data for the Pennsieve platform.
This processor reads electrical series data from NWB files and:
- Extracts channel data with proper scaling (conversion factors, offsets)
- Writes chunked binary files (gzip-compressed, big-endian float64)
- Generates channel metadata files (JSON)
- Optionally uploads the processed data to Pennsieve via the import API
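The scaling step in the first bullet can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes the NWB convention that a physical value is `raw * conversion * channel_conversion + offset`, and the function name `scale_samples` is hypothetical.

```python
def scale_samples(
    raw: list[float],
    conversion: float = 1.0,
    channel_conversion: float = 1.0,
    offset: float = 0.0,
) -> list[float]:
    """Convert raw stored samples for one channel to physical units.

    Assumed formula (NWB convention):
        raw * conversion * channel_conversion + offset
    """
    factor = conversion * channel_conversion
    return [sample * factor + offset for sample in raw]


# Hypothetical example: integer counts stored with a 1 microvolt step.
volts = scale_samples([0, 1000, -2000], conversion=1e-6)
```

Folding the two multiplicative factors into one value before the loop keeps the per-sample work to a single multiply and add.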
- main.py - Entry point that orchestrates the processing pipeline.
- reader.py - NWBElectricalSeriesReader reads NWB ElectricalSeries data, handles timestamps and sampling rates, applies conversion factors and offsets, and detects contiguous data chunks.
- writer.py - TimeSeriesChunkWriter writes chunked big-endian binary data (.bin.gz) and channel metadata (.metadata.json).
- importer.py - Creates import manifests via the Pennsieve API and uploads files to S3 via presigned URLs.
- clients/ - API clients for Pennsieve:
  - AuthenticationClient - AWS Cognito authentication
  - ImportClient - Import manifest creation and file upload
  - TimeSeriesClient - Time series channel management
  - WorkflowClient - Analytic workflow instance management
  - BaseClient - Session management with auto-refresh
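One way the reader's contiguous-chunk detection could work is to split the timestamp vector wherever two consecutive samples are further apart than the sampling period allows. The sketch below is an assumption about the approach, not the actual NWBElectricalSeriesReader logic; `find_contiguous_ranges` and its `tolerance` parameter are hypothetical.

```python
def find_contiguous_ranges(
    timestamps: list[float],
    sampling_rate: float,
    tolerance: float = 0.5,
) -> list[tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for gap-free runs.

    A gap is declared when consecutive timestamps (in seconds) differ by
    more than (1 + tolerance) sample periods. The tolerance is an assumed
    value chosen for illustration.
    """
    if not timestamps:
        return []
    max_step = (1.0 + tolerance) / sampling_rate
    ranges = []
    start = 0
    for i in range(1, len(timestamps)):
        if timestamps[i] - timestamps[i - 1] > max_step:
            ranges.append((start, i))  # close the run before the gap
            start = i
    ranges.append((start, len(timestamps)))  # close the final run
    return ranges


# Hypothetical 1 kHz recording with a pause between the third and fourth sample.
segments = find_contiguous_ranges([0.0, 0.001, 0.002, 0.5, 0.501], sampling_rate=1000.0)
```

Each returned range can then be processed as an independent segment, so that no chunk spans a recording gap.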
- Python 3.10+
- Docker (for local runs)
make venv
source venv/bin/activate
make install

This installs git hooks that automatically lint and format code on commit.
make pre-commit

make test
make test-cov

Runs ruff with auto-fix and formatting.
make lint

Configure the environment file
Edit dev.env with your settings:
ENVIRONMENT=local
INPUT_DIR=/data/input
OUTPUT_DIR=/data/output
CHUNK_SIZE_MB=1
IMPORTER_ENABLED=false
...

Place your .nwb file in the data/input/ directory:
cp /path/to/your/file.nwb data/input/

make run
This builds and runs the processor via Docker.
Output files will be written to data/output/.
Remove input/output files:
make clean

The processor generates two types of files per channel:
- Chunked binary data (.bin.gz):
  - Pattern: channel-{index}_{start_us}_{end_us}.bin.gz
  - Format: Gzip-compressed big-endian float64 values
  - Example: channel-00001_1000000_2000000.bin.gz
- Channel metadata (.metadata.json):
  - Pattern: channel-{index}.metadata.json
  - Contains: name, rate, start, end, unit, type, group, properties
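To make the binary format concrete, here is a small round-trip sketch using only the Python standard library. It is an illustration of the documented layout (gzip-compressed, big-endian float64), not the TimeSeriesChunkWriter implementation. The helper names are hypothetical, and the five-digit zero padding of the channel index is inferred from the example file name.

```python
import gzip
import struct
import tempfile
from pathlib import Path


def chunk_file_name(index: int, start_us: int, end_us: int) -> str:
    """Build a chunk file name; 5-digit padding is inferred from the example."""
    return f"channel-{index:05d}_{start_us}_{end_us}.bin.gz"


def write_chunk(path: Path, samples: list[float]) -> None:
    """Write samples as gzip-compressed big-endian float64 values."""
    payload = struct.pack(f">{len(samples)}d", *samples)  # ">" = big-endian, "d" = float64
    with gzip.open(path, "wb") as fh:
        fh.write(payload)


def read_chunk(path: Path) -> list[float]:
    """Read a chunk file back into a list of Python floats."""
    with gzip.open(path, "rb") as fh:
        payload = fh.read()
    count = len(payload) // 8  # 8 bytes per float64 sample
    return list(struct.unpack(f">{count}d", payload))


# Round trip in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / chunk_file_name(1, 1_000_000, 2_000_000)
    write_chunk(path, [0.0, 1.5, -2.25])
    restored = read_chunk(path)
```

With NumPy available, the same payload could presumably be decoded with `numpy.frombuffer(payload, dtype=">f8")`; the standard-library version is used here to keep the sketch dependency-free.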
| Variable | Description | Default |
|---|---|---|
| ENVIRONMENT | Runtime environment (local or production) | local |
| INPUT_DIR | Directory containing NWB files | - |
| OUTPUT_DIR | Directory for output files | - |
| CHUNK_SIZE_MB | Size of each data chunk in MB | 1 |
| IMPORTER_ENABLED | Enable Pennsieve upload | false |
| PENNSIEVE_API_KEY | Pennsieve API key | - |
| PENNSIEVE_API_SECRET | Pennsieve API secret | - |
| PENNSIEVE_API_HOST | Pennsieve API endpoint | https://api.pennsieve.net |
| PENNSIEVE_API_HOST2 | Pennsieve API2 endpoint | https://api2.pennsieve.net |
| INTEGRATION_ID | Workflow instance ID | - |
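The variables above could be read into a typed configuration object along these lines. This is a sketch under stated assumptions, not the processor's actual configuration code: `ProcessorConfig` and `load_config` are hypothetical names, only a subset of the variables is covered, and the `samples_per_chunk` arithmetic assumes 1 MB means 1024 * 1024 bytes with 8 bytes per float64 sample.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProcessorConfig:
    environment: str
    input_dir: Optional[str]
    output_dir: Optional[str]
    chunk_size_mb: int
    importer_enabled: bool
    api_host: str
    api_host2: str

    @property
    def samples_per_chunk(self) -> int:
        # Assumption: 1 MB = 1024 * 1024 bytes, 8 bytes per float64 sample.
        return self.chunk_size_mb * 1024 * 1024 // 8


def load_config(env: Mapping[str, str] = os.environ) -> ProcessorConfig:
    """Read settings from a mapping, applying the defaults from the table."""
    return ProcessorConfig(
        environment=env.get("ENVIRONMENT", "local"),
        input_dir=env.get("INPUT_DIR"),
        output_dir=env.get("OUTPUT_DIR"),
        chunk_size_mb=int(env.get("CHUNK_SIZE_MB", "1")),
        importer_enabled=env.get("IMPORTER_ENABLED", "false").strip().lower() == "true",
        api_host=env.get("PENNSIEVE_API_HOST", "https://api.pennsieve.net"),
        api_host2=env.get("PENNSIEVE_API_HOST2", "https://api2.pennsieve.net"),
    )


# Passing an explicit mapping keeps the example independent of the real environment.
config = load_config({"INPUT_DIR": "/data/input", "OUTPUT_DIR": "/data/output"})
```

Under these assumptions, the default chunk size of 1 MB corresponds to 131072 samples per chunk.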
See LICENSE file.