Xenium spatial transcriptomics analysis pipeline built with Nextflow and Singularity/Apptainer. The pipeline uses OpenCV (cv2) to automatically identify tissue contours and bounding boxes, and is designed for analysing tissue microarrays and multi-sample slides.
- Build a container `sopa.sif` from the definition file from this repository.
- Export `CONTAINERDIR` and `PROJECTDIR` variables. The former is used to find the container, and the latter to mount the data to singularity during run.
- Adjust resources in resources.config for pipeline steps.
- Create `run01.config` in config directory from run.template to configure step parameters.
- Create environment with nextflow installation using `make env`.
- Run the pipeline using `make run`.
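Before launching the run, it can help to confirm that the two exported variables and the container are in place. The following is a minimal sketch, not part of the pipeline: the function name `check_setup` is hypothetical, and it assumes that `sopa.sif` sits directly inside `CONTAINERDIR`, which this README does not state explicitly.

```python
import os
from pathlib import Path


def check_setup(env=None, container_name="sopa.sif"):
    """Return a list of setup problems; an empty list means nothing was found."""
    env = os.environ if env is None else env
    problems = []

    # Both variables must be exported and point at existing directories.
    for var in ("CONTAINERDIR", "PROJECTDIR"):
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{var}={value} is not a directory")

    # ASSUMPTION: the container file lives directly inside CONTAINERDIR.
    container_dir = env.get("CONTAINERDIR")
    if container_dir and Path(container_dir).is_dir():
        if not (Path(container_dir) / container_name).is_file():
            problems.append(f"{container_name} not found in {container_dir}")

    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("setup problem:", problem)
```

Running the script in the shell where the variables were exported prints one line per problem and nothing when the checks pass.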
The pipeline has four steps, going from raw Xenium output to the identification of gene programs with cNMF.
- CONVERT_XENIUM converts raw data to a spatialdata-formatted zarr archive for downstream processing.
- DETECT_TISSUE automatically identifies tissue contours and bounding boxes so that multi-sample slides can be split into separate sample objects for independent analysis. This is especially helpful when working with tissue microarrays.
- SPLIT_SAMPLES creates one AnnData h5ad archive per sample based on the tissue contours.
- IDENTIFY_PROGRAMS runs cNMF for each sample. Note that the pipeline does not automatically select the best number of programs; this choice is left to the user.