Developed by the Bork Group in collaboration with nf-core.

The development of this workflow was supported by NFDI4Microbiota.

The ENA2eggNOG workflow is a Nextflow workflow for fast functional annotation of novel sequences. It uses precomputed orthologous groups and phylogenies from the eggNOG database to transfer functional information from fine-grained orthologs only.
Common uses of eggNOG-mapper include the annotation of novel genomes, transcriptomes, or even metagenomic gene catalogs.
The use of orthology predictions for functional annotation permits a higher precision than traditional homology searches (i.e. BLAST searches), as it avoids transferring annotations from close paralogs (duplicate genes with a higher chance of being involved in functional divergence).
Benchmarks comparing different eggNOG-mapper options against BLAST and InterProScan can be found here.
Also cite:
Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic Scale. Mol Biol Evol. 2021;38(12):5825-5829. doi:10.1093/molbev/msab293
Ewels PA, Peltzer A, Fillinger S, et al. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020;38(3):276-278. doi:10.1038/s41587-020-0439-x
Li D, Liu CM, Luo R, Sadakane K, Lam TW. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31(10):1674-1676. doi:10.1093/bioinformatics/btv033
Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119. doi:10.1186/1471-2105-11-119
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
- Download data from ENA/SRA (fetchngs)
- Run assembly (MEGAHIT)
- Predict genes (Prodigal)
- Annotate genes (eggnog-mapper)
This workflow will be available on the CloWM platform (coming soon).
If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.
You can run the pipeline using:
    nextflow run eggnogmapper \
        -profile <docker/singularity/.../institute> \
        --input ids.csv \
        --outdir <OUTDIR>

The input is a CSV file with a list of ENA accession IDs that looks as follows:

ids.csv:

    PRJEB6102
    SRR9984183
    SRR13191702

Each can be a project ID or a run ID.
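
Since each line must be a project ID or a run ID, it can help to check the file before launching a run. The following is a minimal sketch and not part of the pipeline: the function names `classify_accession` and `check_ids_file` are hypothetical, and the patterns assume the usual ENA/SRA/DDBJ accession layout (`PRJ` + `E`/`D`/`N` + one letter + digits for projects; `E`/`D`/`S` + `RR` + at least six digits for runs).

```shell
#!/bin/sh
# Hypothetical helper: classify one accession as "project", "run", or "invalid".
# The patterns are an assumption based on the common ENA accession layout.
classify_accession() {
    if printf '%s\n' "$1" | grep -Eq '^PRJ[EDN][A-Z][0-9]+$'; then
        echo project
    elif printf '%s\n' "$1" | grep -Eq '^[EDS]RR[0-9]{6,}$'; then
        echo run
    else
        echo invalid
    fi
}

# Hypothetical helper: print "<id><TAB><type>" for every non-empty line of the
# given file and return non-zero if at least one line is not recognised.
check_ids_file() {
    status=0
    while IFS= read -r line || [ -n "$line" ]; do
        # Drop stray whitespace, including carriage returns from Windows editors.
        id=$(printf '%s' "$line" | tr -d '[:space:]')
        if [ -z "$id" ]; then
            continue
        fi
        kind=$(classify_accession "$id")
        printf '%s\t%s\n' "$id" "$kind"
        if [ "$kind" = "invalid" ]; then
            status=1
        fi
    done < "$1"
    return "$status"
}
```

After sourcing these functions, `check_ids_file ids.csv` lists each ID with its detected type and returns a non-zero exit status if any line matches neither pattern.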

