This repository provides tidesurf, a Tool for IDentification and Enumeration of Spliced and Unspliced Read Fragments, implemented in Python.
Set up a virtual environment using Conda with Python version >=3.10 and activate it (here: using Python 3.12):
conda create -n <envName> python=3.12
conda activate <envName>
Install the package from PyPI:
pip install tidesurf
Alternatively, to install from source, clone the repository:
git clone git@github.com:janschleicher/tidesurf.git
Change into the directory and install with pip:
cd tidesurf
pip install -e .
usage: tidesurf [-h] [-v] [--orientation {sense,antisense}] [-o OUTPUT]
                [--no_filter_cells]
                [--whitelist WHITELIST | --num_umis NUM_UMIS]
                [--min_intron_overlap MIN_INTRON_OVERLAP]
                [--multi_mapped_reads]
                SAMPLE_DIR GTF_FILE
Program: tidesurf (Tool for IDentification and Enumeration of Spliced and Unspliced Read Fragments)
Version: 0.2.1
positional arguments:
  SAMPLE_DIR            Sample directory containing Cell Ranger output.
  GTF_FILE              GTF file with transcript information.
options:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --orientation {sense,antisense}
                        Orientation of reads with respect to transcripts. For
                        10x Genomics, use 'sense' for three prime and
                        'antisense' for five prime.
  -o OUTPUT, --output OUTPUT
                        Output directory.
  --no_filter_cells     Do not filter cells.
  --whitelist WHITELIST
                        Whitelist for cell filtering. Set to 'cellranger' to
                        use barcodes in the sample directory. Alternatively,
                        provide a path to a whitelist.
  --num_umis NUM_UMIS   Minimum number of UMIs for filtering a cell.
  --min_intron_overlap MIN_INTRON_OVERLAP
                        Minimum number of bases that a read must overlap with
                        an intron to be considered intronic.
  --multi_mapped_reads  Take reads mapping to multiple genes into account
                        (default: reads mapping to more than one gene are
                        discarded).
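When calling tidesurf from a Python pipeline, it can help to assemble the command line programmatically. The following is a minimal sketch, not part of tidesurf itself: the helper name `build_tidesurf_command` and its parameter names are hypothetical, and it only uses the flags listed in the help text above. It rejects combining `--whitelist` with `--num_umis`, since the usage line marks them as mutually exclusive.

```python
from __future__ import annotations


def build_tidesurf_command(
    sample_dir: str,
    gtf_file: str,
    *,
    orientation: str = "sense",
    output: str | None = None,
    no_filter_cells: bool = False,
    whitelist: str | None = None,
    num_umis: int | None = None,
    min_intron_overlap: int | None = None,
    multi_mapped_reads: bool = False,
) -> list[str]:
    """Hypothetical helper: build an argument list for the tidesurf CLI."""
    if orientation not in ("sense", "antisense"):
        raise ValueError("orientation must be 'sense' or 'antisense'")
    # The usage line shows [--whitelist WHITELIST | --num_umis NUM_UMIS],
    # i.e. at most one of the two may be given.
    if whitelist is not None and num_umis is not None:
        raise ValueError("--whitelist and --num_umis are mutually exclusive")

    cmd = ["tidesurf", "--orientation", orientation]
    if output is not None:
        cmd += ["-o", output]
    if no_filter_cells:
        cmd.append("--no_filter_cells")
    if whitelist is not None:
        cmd += ["--whitelist", whitelist]
    if num_umis is not None:
        cmd += ["--num_umis", str(num_umis)]
    if min_intron_overlap is not None:
        cmd += ["--min_intron_overlap", str(min_intron_overlap)]
    if multi_mapped_reads:
        cmd.append("--multi_mapped_reads")
    # Positional arguments come last: SAMPLE_DIR GTF_FILE
    cmd += [sample_dir, gtf_file]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; replace with a real Cell Ranger output directory and GTF file.
    print(build_tidesurf_command(
        "path/to/sample", "path/to/genes.gtf",
        orientation="antisense", output="tidesurf_out", whitelist="cellranger",
    ))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once tidesurf is installed in the active environment. The paths shown are placeholders.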
To contribute, install tidesurf in development mode:
pip install -e ".[dev]"
This installs the additional dependencies ruff, used for formatting and code style checks, and pytest, used for testing.
Please run both before committing new code.
If you use tidesurf in your research, please cite the following publication:
Schleicher, J.T., and Claassen, M. (2025). Accurate quantification of spliced and unspliced transcripts for single-cell RNA sequencing with tidesurf. bioRxiv 2025.01.28.635274; DOI: 10.1101/2025.01.28.635274.