Introduction

genomic-medicine-sweden/nallo is a bioinformatics analysis pipeline for long-read data from both PacBio and (targeted) ONT sequencing, focused on rare disease. It is heavily influenced by best-practice pipelines such as nf-core/sarek, nf-core/raredisease, nf-core/nanoseq, PacBio Human WGS Workflow, epi2me-labs/wf-human-variation and brentp/rare-disease-wf.

genomic-medicine-sweden/nallo workflow
QC
Alignment & assembly
  • Assemble genomes with hifiasm
  • Align reads and assemblies to reference with minimap2
Variant calling
Phasing and methylation
Annotation
Ranking
  • Rank SNVs, INDELs, SVs and CNVs with GENMOD
Filtering

Usage

Note

If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

Prepare a samplesheet with input data:

samplesheet.csv

project,sample,file,family_id,paternal_id,maternal_id,sex,phenotype
my_project,HG002,/path/to/HG002.fastq.gz,NIST,HG003,HG004,1,2
my_project,HG003,/path/to/HG003.bam,NIST,0,0,1,1
my_project,HG004,/path/to/HG004.bam,NIST,0,0,2,1
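
Because a stray or missing comma in the samplesheet is easy to overlook, a quick structural check before launching can help. The following is a minimal sketch, not part of the pipeline: `check_samplesheet` is a hypothetical helper that only confirms each row has as many comma-separated fields as the header (eight in the example above). It assumes no field contains a quoted comma and does not check the values themselves.

```shell
# Hypothetical helper (not shipped with nallo): verify that every row of a
# CSV samplesheet has the same number of comma-separated fields as its header.
# Assumes plain CSV without quoted commas. Returns non-zero on any mismatch.
check_samplesheet() {
    awk -F',' '
        NR == 1 { n = NF; next }   # remember the header width
        NF != n {                  # report rows whose width differs
            printf "line %d: expected %d fields, found %d\n", NR, n, NF
            bad = 1
        }
        END { exit bad + 0 }       # 0 if all rows matched, 1 otherwise
    ' "$1"
}
```

Calling `check_samplesheet samplesheet.csv` prints one message per malformed row and returns a non-zero status if any row is malformed.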

Supply a reference genome with --fasta and choose the --preset that matches your data (revio, pacbio or ONT_R10). You can then run the pipeline using:

nextflow run genomic-medicine-sweden/nallo \
    -profile <docker/singularity/.../institute> \
    --input samplesheet.csv \
    --preset <revio/pacbio/ONT_R10> \
    --fasta <reference.fasta> \
    --outdir <OUTDIR>

However, to run most parts of the pipeline you will need to supply additional reference files. For more details and further functionality, please refer to the documentation.

Credits

genomic-medicine-sweden/nallo was originally written by Felix Lenner.

We thank the following people for their extensive assistance in the development of this pipeline: Anders Jemt, Annick Renevey, Daniel Schmitz, Lucía Peña-Pérez, Peter Pruisscher, Ramprasad Neethiraj & Alexander Koc.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

Citations

If you use genomic-medicine-sweden/nallo for your analysis, please cite it using the following DOI: 10.5281/zenodo.13748210.

This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

An extensive list of references for the tools used by the pipeline can be found in the docs/CITATIONS.md file.