17 changes: 14 additions & 3 deletions README.md
@@ -52,15 +52,17 @@ See below for information on usage and local installation.
```
See full options below:
```
usage: pangolin [-h] [-c COLUMN_IDS] [-m {False,True}] [-s SCORE_CUTOFF] [-d DISTANCE] variant_file reference_file annotation_file output_file
usage: pangolin [-h] [-c COLUMN_IDS] [-m {False,True}] [-s SCORE_CUTOFF] [-d DISTANCE] [--score_exons {False,True}] [--loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--tmpdir TMPDIR] [--variant_batchsize VARIANT_BATCHSIZE]
[--tensor_batchsize TENSOR_BATCHSIZE]
variant_file reference_file annotation_file output_file

positional arguments:
variant_file VCF or CSV file with a header (see COLUMN_IDS option).
reference_file FASTA file containing a reference genome sequence.
annotation_file gffutils database file. Can be generated using create_db.py.
output_file Prefix for output file. Will be a VCF/CSV if variant_file is VCF/CSV.

optional arguments:
options:
-h, --help show this help message and exit
-c COLUMN_IDS, --column_ids COLUMN_IDS
(If variant_file is a CSV) Column IDs for: chromosome, variant position, reference bases, and alternative bases. Separate IDs by commas. (Default: CHROM,POS,REF,ALT)
@@ -70,6 +72,15 @@ See below for information on usage and local installation.
Output all sites with absolute predicted change in score >= cutoff, instead of only the maximum loss/gain sites.
-d DISTANCE, --distance DISTANCE
Number of bases on either side of the variant for which splice scores should be calculated. (Default: 50)
--score_exons {False,True}
Output changes in score for both splice sites of annotated exons, as long as one splice site is within the considered range (specified by -d). Output will be: gene|site1_pos:score|site2_pos:score|...
--loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Set the logging level. (Default: INFO)
--tmpdir TMPDIR Location to create temporary directory for storing intermediate files.
--variant_batchsize VARIANT_BATCHSIZE
Number of variants to score in a single CPU batch. (Default: 1280)
--tensor_batchsize TENSOR_BATCHSIZE
Number of variants to process in a single GPU batch. (Default: 128)
```
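
The `--score_exons` help text above describes the output as `gene|site1_pos:score|site2_pos:score|...`. As a rough illustration of how a downstream script might read such a field, here is a minimal Python sketch. It is not part of pangolin: the names `SiteScore` and `parse_score_exons` and the sample string are hypothetical, and it assumes that positions are integers, scores are decimal numbers, and a trailing `|` may be present.

```python
from typing import NamedTuple


class SiteScore(NamedTuple):
    position: int  # assumption: site positions are written as integers
    score: float   # predicted change in splice score at that site


def parse_score_exons(field: str) -> tuple[str, list[SiteScore]]:
    """Split 'gene|pos:score|pos:score|...' into a gene name and its site scores."""
    gene, *entries = field.split("|")
    sites = []
    for entry in entries:
        if not entry:  # tolerate an empty entry, e.g. after a trailing '|'
            continue
        pos, score = entry.split(":")
        sites.append(SiteScore(int(pos), float(score)))
    return gene, sites


# Hypothetical example value, made up for illustration only.
gene, sites = parse_score_exons("GENE1|-12:0.35|48:-0.8|")
print(gene)
for site in sites:
    print(site.position, site.score)
```

If the real output uses a different position or score notation, only the two conversions inside the loop would need to change.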

### Usage (custom)
57 changes: 57 additions & 0 deletions docker/Dockerfile
@@ -0,0 +1,57 @@
######################################
## CONTAINER FOR GPU based pangolin ##
######################################

# start from the cuda docker base
FROM nvidia/cuda:12.0.0-runtime-ubuntu22.04

## needed apt packages
ARG BUILD_PACKAGES="wget git bzip2"
# needed conda packages
ARG CONDA_PACKAGES="python==3.10.8 pip==25.0 pandas==2.2.3 pyfastx==0.8.4 gffutils==0.13 pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda==12.4 pysam==0.20.0"
ARG CONDA_CHANNEL="-c nvidia -c pytorch -c conda-forge -c anaconda -c bioconda"
## ENV SETTINGS during runtime
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH=/opt/conda/bin:$PATH
ENV DEBIAN_FRONTEND=noninteractive

## AUTHOR
ENV AUTHOR="Geert Vandeweyer"
ENV EMAIL="geert.vandeweyer@uza.be"

# For micromamba:
SHELL ["/bin/bash", "-l", "-c"]
ENV MAMBA_ROOT_PREFIX=/opt/conda/
ENV PATH=/opt/micromamba/bin:/opt/conda/bin:$PATH


## INSTALL
RUN apt-get -y update && \
apt-get -y install $BUILD_PACKAGES && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*


# conda packages
RUN mkdir /opt/conda && \
mkdir /opt/micromamba && \
wget -qO - https://micromamba.snakepit.net/api/micromamba/linux-64/0.23.0 | tar -xvj -C /opt/micromamba bin/micromamba && \
# initialize bash
micromamba shell init --shell=bash --prefix=/opt/conda && \
# remove a statement from bashrc that prevents initialization
grep -v '[ -z "\$PS1" ] && return' /root/.bashrc > /opt/micromamba/bashrc && \
mv /opt/micromamba/bashrc /root/.bashrc && \
source ~/.bashrc && \
# activate & install base conda packages
micromamba activate && \
micromamba install -y $CONDA_CHANNEL $CONDA_PACKAGES && \
micromamba clean --all --yes

# my fork of pangolin : has gpu optimizations
RUN cd /opt/ && \
git clone https://github.com/geertvandeweyer/pangolin.git && \
cd pangolin && \
pip install .

# ADD annotation data
CMD ["pangolin", "--help"]
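
For reviewers who want to try the image, the following Python sketch assembles the build and run commands for this Dockerfile. It only constructs and prints the commands; it does not execute them. The image tag `pangolin-gpu` is a hypothetical name chosen for illustration, the Dockerfile path is taken from this diff, and `--gpus all` assumes a host where Docker has been set up with NVIDIA GPU support.

```python
import shlex


def docker_commands(image: str = "pangolin-gpu",
                    dockerfile: str = "docker/Dockerfile"):
    """Return (build, run) argument lists for the GPU image.

    'image' is a hypothetical tag; 'dockerfile' is the path added in this diff.
    """
    build = ["docker", "build", "-f", dockerfile, "-t", image, "."]
    # --gpus all exposes the host GPUs to the container (needs NVIDIA GPU support in Docker).
    run = ["docker", "run", "--rm", "--gpus", "all", image, "pangolin", "--help"]
    return build, run


build_cmd, run_cmd = docker_commands()
print(shlex.join(build_cmd))
print(shlex.join(run_cmd))
```

The argument lists can be passed to `subprocess.run(..., check=True)` once the printed commands look right for the local setup.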
257 changes: 0 additions & 257 deletions pangolin/.fuse_hidden0000252700000002

This file was deleted.
