Skip to content

janschleicher/tidesurf

Repository files navigation

Python 3.10 Python 3.11 Python 3.12 Python 3.13 License: GPL v3 Code style: ruff codecov Python package PyPI - Version Documentation Status

tidesurf

This repository provides a Tool for IDentification and Enumeration of Spliced and Unspliced Read Fragments using Python.

Installation

From PyPI

Set up a virtual environment using Conda with Python version >=3.10 and activate it (here: using Python 3.12):

conda create -n <envName> python=3.12
conda activate <envName>

Install the package from PyPI:

pip install tidesurf

Latest version from GitHub

Clone the repository:

git clone git@github.com:janschleicher/tidesurf.git

Change into the directory and install with pip:

cd tidesurf
pip install -e .

Usage

usage: tidesurf [-h] [-v] [--orientation {sense,antisense}] [-o OUTPUT]
                [--no_filter_cells]
                [--whitelist WHITELIST | --num_umis NUM_UMIS]
                [--min_intron_overlap MIN_INTRON_OVERLAP]
                [--multi_mapped_reads]
                SAMPLE_DIR GTF_FILE

Program: tidesurf (Tool for IDentification and Enumeration of Spliced and Unspliced Read Fragments)
Version: 0.2.1

positional arguments:
  SAMPLE_DIR            Sample directory containing Cell Ranger output.
  GTF_FILE              GTF file with transcript information.

options:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --orientation {sense,antisense}
                        Orientation of reads with respect to transcripts. For
                        10x Genomics, use 'sense' for three prime and
                        'antisense' for five prime.
  -o OUTPUT, --output OUTPUT
                        Output directory.
  --no_filter_cells     Do not filter cells.
  --whitelist WHITELIST
                        Whitelist for cell filtering. Set to 'cellranger' to
                        use barcodes in the sample directory. Alternatively,
                        provide a path to a whitelist.
  --num_umis NUM_UMIS   Minimum number of UMIs for filtering a cell.
  --min_intron_overlap MIN_INTRON_OVERLAP
                        Minimum number of bases that a read must overlap with
                        an intron to be considered intronic.
  --multi_mapped_reads  Take reads mapping to multiple genes into account
                        (default: reads mapping to more than one gene are
                        discarded).

Contributing

For contributing, you should install tidesurf in development mode:

pip install -e ".[dev]"

This will install the additional dependencies ruff and pytest, which are used for formatting and code style, and testing, respectively. Please run these before commiting new code.

Citation

If you use tidesurf in your research, please cite the following publication:

Schleicher, J.T., and Claassen, M. (2025). Accurate quantification of spliced and unspliced transcripts for single-cell RNA sequencing with tidesurf. bioRxiv 2025.01.28.635274; DOI: 10.1101/2025.01.28.635274.

About

Tool for IDentification and Enumeration of Spliced and Unspliced Read Fragments

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors