Containment estimation of genomes, proteomes, plasmids and other sequences

metashot/containment

metashot/containment is a workflow for the containment estimation of genomes, proteomes, plasmids and other sequences in sequencing read sets using mash screen.

Main features

  • Input: single-end or paired-end reads in FASTA/FASTQ format (gzip-compressed files are also supported);
  • genomes or sequences to test for containment, in FASTA or mash sketch format;
  • returns the multiplicity matrix (genomes x read sets).

Quick start (examples)

Install Docker (or Singularity) and Nextflow (see Dependencies).

Example 1 - genome containment in single-end or interleaved read sets

nextflow run metashot/containment \
  --reads 'reads/*.fastq.gz' \
  --db 'genomes/*.fa' \
  --outdir results

Example 2 - genome containment in paired-end read sets

nextflow run metashot/containment \
  --reads 'reads/*_R{1,2}*.fastq.gz' \
  --db 'genomes/*.fa' \
  --outdir results
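
The paired-end pattern above only works if every forward file has a matching reverse file. The sketch below is a hypothetical pre-flight check and not part of the workflow: `check_pairs` is an illustrative name, and it assumes files follow the `_R1`/`_R2` naming and `.fastq.gz` extension used in the example. It does not reproduce Nextflow's own pairing logic; it only reports forward files with no mate.

```shell
# Hypothetical helper (not part of metashot/containment).
# Assumes mates are named <sample>_R1<suffix>.fastq.gz and <sample>_R2<suffix>.fastq.gz.
# The body runs in a subshell, written ( ... ), so its variables do not leak to the caller.
check_pairs() (
  dir="$1"
  missing=0
  for r1 in "$dir"/*_R1*.fastq.gz; do
    # If the glob matched nothing, the literal pattern is returned: skip it.
    [ -e "$r1" ] || continue
    base="${r1##*/}"
    # Derive the expected mate name by swapping the first _R1 for _R2.
    mate="$dir/$(printf '%s\n' "$base" | sed 's/_R1/_R2/')"
    if [ ! -e "$mate" ]; then
      echo "missing mate for $base" >&2
      missing=1
    fi
  done
  # Exit status 0 when every forward file has a mate, 1 otherwise.
  exit "$missing"
)
```

Running `check_pairs reads` before launching the workflow returns a non-zero status and names each unpaired file on standard error, so an incomplete read set can be caught early.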

Example 3 - sequence containment in paired-end read sets

nextflow run metashot/containment \
  --reads 'reads/*_R{1,2}*.fastq.gz' \
  --db sequences.fa \
  --individual \
  --outdir results

Parameters

Options and default values are declared in nextflow.config.

System requirements

Please refer to System requirements for the complete list of system requirements options.
