Sylvain Schmitt April 20, 2021
Singularity & Snakemake workflow to detect mutations with several
alignment and mutation detection tools.
- Python ≥3.5
- Snakemake ≥5.24.1
- Golang ≥1.15.2
- Singularity ≥3.7.3
- This workflow
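The minimum versions listed above can be checked before running anything. The following is a minimal sketch, not part of the workflow itself: the function names `version_ge`, `check_tool` and `check_requirements` are hypothetical, and it assumes GNU `sort -V` is available and that each tool prints its version as the first dotted number in its version output.

```shell
# Succeed when version $1 is at least version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Check one tool: display name, minimum version, then the command printing its version.
# The version is taken as the first dotted number found in that command's output.
check_tool() {
    local name="$1" min="$2" found
    shift 2
    found="$("$@" 2>&1 | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)"
    if [ -z "$found" ]; then
        echo "$name: no version found (is it installed?)"
        return 1
    elif version_ge "$found" "$min"; then
        echo "$name: $found (>= $min) ok"
    else
        echo "$name: $found is older than $min"
        return 1
    fi
}

# Check the four requirements of this README; return 1 if any of them fails.
check_requirements() {
    local status=0
    check_tool python3     3.5    python3 --version     || status=1
    check_tool snakemake   5.24.1 snakemake --version   || status=1
    check_tool go          1.15.2 go version            || status=1
    check_tool singularity 3.7.3  singularity --version || status=1
    return "$status"
}
```

Calling `check_requirements` prints one line per tool and returns a non-zero status when a tool is missing or too old.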
# Python
sudo apt-get install python3.5
# Snakemake
sudo apt install snakemake
# Golang
export VERSION=1.15.8 OS=linux ARCH=amd64 # change this as you need
wget -O /tmp/go${VERSION}.${OS}-${ARCH}.tar.gz https://dl.google.com/go/go${VERSION}.${OS}-${ARCH}.tar.gz && \
sudo tar -C /usr/local -xzf /tmp/go${VERSION}.${OS}-${ARCH}.tar.gz
echo 'export GOPATH=${HOME}/go' >> ~/.bashrc && \
echo 'export PATH=/usr/local/go/bin:${PATH}:${GOPATH}/bin' >> ~/.bashrc && \
source ~/.bashrc
# Singularity
mkdir -p ${GOPATH}/src/github.com/sylabs && \
cd ${GOPATH}/src/github.com/sylabs && \
git clone https://github.com/sylabs/singularity.git && \
cd singularity
git checkout v3.7.3
cd ${GOPATH}/src/github.com/sylabs/singularity && \
./mconfig && \
cd ./builddir && \
make && \
sudo make install
# detectMutations
git clone git@github.com:sylvainschmitt/detectMutations.git
cd detectMutations
Generate data using the generateMutations workflow.
git clone git@github.com:sylvainschmitt/generateMutations.git
cd ../generateMutations
snakemake --use-singularity --cores 4
cd ../detectMutations
bash scripts/get_data.sh
snakemake -np # dry run
snakemake --dag | dot -Tsvg > dag/dag.svg # dag
snakemake --use-singularity --cores 4 # run
snakemake --use-singularity --cores 1 --verbose # debug
snakemake --report report.html # report
module purge ; module load bioinfo/snakemake-5.25.0 # for test on node
snakemake -np # dry run
sbatch job.sh ; watch 'squeue -u sschmitt' # run
less detMut.*.err # snakemake outputs, use Shift+F to follow
less detMut.*.out # snakemake outputs, use Shift+F to follow
snakemake --dag | dot -Tsvg > dag/dag.svg # dag
module purge ; module load bioinfo/snakemake-5.8.1 ; module load system/Python-3.6.3 # for report
snakemake --report report.html # report
module purge ; module load system/R-3.6.2 ; R # to build results
Index the reference and SNPs for the tools to work with.
- Tools: BWA index
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/bwa/bwa:latest
- Tools: samtools faidx
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/samtools/samtools:latest
- Tools: gatk CreateSequenceDictionary
  - Singularity: docker://broadinstitute/gatk:4.2.6.1
- Tools: gatk IndexFeatureFile
  - Singularity: docker://broadinstitute/gatk:4.2.6.1
Report read quality and trim reads.
- Tools: FastQC
  - Singularity: docker://biocontainers/fastqc:v0.11.9_cv8
- Tools: Trimmomatic
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/trimmomatic/trimmomatic:latest
Align reads against the reference, mark duplicates, and report alignment quality.
- Tools: BWA mem
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/bwa/bwa:latest
- Tools: Samtools sort
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/samtools/samtools:latest
- Tools: Samtools index
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/samtools/samtools:latest
- Tools: gatk MarkDuplicates
  - Singularity: docker://broadinstitute/gatk:4.2.6.1
- Tools: Samtools index
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/samtools/samtools:latest
- Tools: Samtools mpileup
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/samtools/samtools:latest
- Tools: Samtools stats
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/samtools/samtools:latest
- Tools: QualiMap
  - Singularity: docker://pegi3s/qualimap:2.2.1
Detect mutations.
- Tools: gatk Mutect2
  - Singularity: docker://broadinstitute/gatk:4.2.6.1
- Tools: freebayes
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/freebayes/freebayes:latest
- Tools: gatk HaplotypeCaller
  - Singularity: docker://broadinstitute/gatk:4.2.6.1
- Tools: gatk GenotypeGVCFs
  - Singularity: docker://broadinstitute/gatk:4.2.6.1
- Tools: Strelka2
  - Singularity: docker://quay.io/wtsicgp/strelka2-manta
- Tools: VarScan
  - Singularity: docker://alexcoppe/varscan
- Script: varscan2vcf.R
  - Singularity: https://github.com/sylvainschmitt/singularity-template/releases/download/0.0.1/sylvainschmitt-singularity-tidyverse-Biostrings.latest.sif
- Tools: Somatic Sniper
  - Singularity: docker://lethalfang/somaticsniper:1.0.5.0
- Tools: MuSe
  - Singularity: docker://opengenomics/muse:v0.1.1
- Tools: cp
- Tools: bedtools subtract
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/bedtools/bedtools:latest
Combine quality information from QualiMap, Picard, Samtools,
Trimmomatic, and FastQC (see previous steps) and assess call
performance.
- Tools: MultiQC
  - Singularity: oras://registry.forgemia.inra.fr/gafl/singularity/multiqc/multiqc:latest
- Script: evaluate_call.R
  - Singularity: https://github.com/sylvainschmitt/singularity-template/releases/download/0.0.1/sylvainschmitt-singularity-tidyverse-Biostrings.latest.sif
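Snakemake fetches these images itself when run with `--use-singularity`, but it can be handy to pull a few of them by hand, for instance to check that each registry is reachable before a long run. The following is a minimal sketch under that assumption: the function name `pull_images` and the `images` directory are hypothetical, and `singularity pull` is used in its simplest form, which names the output file automatically and fails if that file already exists.

```shell
# Pull every image URI given after the target directory into that directory.
# Failed pulls are reported on stderr and make the function return 1.
pull_images() {
    local dir="$1" status=0 uri
    shift
    mkdir -p "$dir"
    for uri in "$@"; do
        echo "pulling $uri"
        # Run in a subshell so the caller's working directory is unchanged.
        (cd "$dir" && singularity pull "$uri") || {
            echo "failed: $uri" >&2
            status=1
        }
    done
    return "$status"
}
```

For example, `pull_images images docker://broadinstitute/gatk:4.2.6.1 oras://registry.forgemia.inra.fr/gafl/singularity/bwa/bwa:latest` would try both images and report any that cannot be pulled.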
module load system/singularity-3.7.3
singularity pull https://github.com/sylvainschmitt/singularity-r-bioinfo/releases/download/0.0.3/sylvainschmitt-singularity-r-bioinfo.latest.sif
singularity shell sylvainschmitt-singularity-r-bioinfo.latest.sif
library(tidyverse)
lapply(list.files("results/stats", full.names = TRUE), read_tsv) %>%
bind_rows() %>%
write_tsv("stats.tsv")
quit()
exit
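When R is not at hand, a similar merge can be sketched in plain shell. This is only an approximation of the R commands above: it assumes every file in `results/stats` starts with the same single header line and has the same column order (unlike `bind_rows()`, it does not match columns by name), and the function name `merge_tsv` is hypothetical.

```shell
# Concatenate TSV files, keeping the header line of the first file only.
# FNR is the line number within the current file, NR the overall line number,
# so "FNR == 1 && NR != 1" matches the first line of every file but the first.
merge_tsv() {
    awk 'FNR == 1 && NR != 1 { next } { print }' "$@"
}
```

Usage would then be `merge_tsv results/stats/* > stats.tsv`.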