Inputs:
- 0-x unpaired fastq/sff files
- 0-y pairs of illumina paired fastq files
- [0-z pacbio files]
Transformation:
- sff -> fastq
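
One way the sff -> fastq step could be sketched, assuming Biopython is available for the conversion (the issue does not name a converter). The helper names `fastq_target` and `convert_sff` are hypothetical:

```python
from pathlib import Path


def fastq_target(path: str) -> str:
    """Return the FASTQ path an input file should end up at.

    .sff inputs map to a sibling .fastq file; any other input is
    returned unchanged because it needs no conversion.
    """
    p = Path(path)
    if p.suffix.lower() == ".sff":
        return str(p.with_suffix(".fastq"))
    return str(p)


def convert_sff(path: str) -> str:
    """Convert one SFF file to FASTQ and return the output path."""
    # Imported lazily so fastq_target() works without Biopython installed.
    # Assumption: Biopython is the converter; the issue does not say.
    from Bio import SeqIO

    out = fastq_target(path)
    if out != path:
        SeqIO.convert(path, "sff", out, "fastq")
    return out
```

Biopython also offers an `sff-trim` input format that applies the trimming recorded in the SFF file; whether that is wanted here is an open question.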
Filter:
- drop reads with Ns
- drop illumina reads whose index read has a minimum quality score below some threshold
- De novo: run host_filter
- Note: individual files may be lost or empty after this process
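
A minimal sketch of the two read-level filters above, in plain Python. It assumes strict 4-line FASTQ, Phred+33 qualities, and index reads supplied as a parallel FASTQ stream in the same order as the reads; none of that is stated in the issue. The threshold of 20 and all names are placeholders:

```python
from typing import Iterable, Iterator, NamedTuple, Optional, Tuple


class FastqRecord(NamedTuple):
    name: str
    seq: str
    qual: str  # quality string, assumed Phred+33


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Parse strict 4-line FASTQ records (no wrapped sequences)."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = next(it, "").strip()
        next(it, "")  # '+' separator line
        qual = next(it, "").strip()
        yield FastqRecord(header.strip()[1:], seq, qual)


def min_quality(qual: str, offset: int = 33) -> int:
    """Smallest Phred score in a quality string."""
    return min(ord(c) - offset for c in qual)


def keep_read(read: FastqRecord, index: Optional[FastqRecord],
              min_index_quality: int = 20) -> bool:
    """Apply both filters: no Ns in the read, index quality above the minimum."""
    if "N" in read.seq.upper():
        return False
    if index is not None and index.qual:
        if min_quality(index.qual) < min_index_quality:
            return False
    return True


def filter_reads(pairs: Iterable[Tuple[FastqRecord, Optional[FastqRecord]]],
                 min_index_quality: int = 20) -> Iterator[FastqRecord]:
    """Yield only the reads that pass; pass index=None for non-illumina input."""
    for read, index in pairs:
        if keep_read(read, index, min_index_quality):
            yield read
```

Because every read in a file can be dropped, callers would still need to check for empty outputs afterwards, as the note above warns.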
Transformation (trimming):
- cut 3' and 5' adapters `-a,-A,-g,-G`
- cut 3' and 5' primers
- trim all N's from end of reads `--trim-n`
- trim low quality bases from 5' and 3' `-q <five>,<three>` or `-q <five>`
- remove X bases from beginning/end of reads `-u <X>`
- [run something on pacbio reads separately]
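
The flags listed above match cutadapt's command line, so the trimming step could be sketched as a function that only builds the argument list. This assumes cutadapt is the intended tool (the issue never names it); `build_trim_command`, its parameters, and the use of `-o`/`-p` for outputs are illustrative choices, and the adapter sequence in the usage line is a placeholder:

```python
from typing import List, Optional, Sequence


def build_trim_command(
    inputs: Sequence[str],
    outputs: Sequence[str],
    adapters_3p: Sequence[str] = (),
    adapters_5p: Sequence[str] = (),
    trim_n: bool = True,
    quality: Optional[str] = None,   # e.g. "20,20", passed to -q verbatim
    cut: Sequence[int] = (),         # positive: from the start, negative: from the end
    program: str = "cutadapt",       # assumption: the trimmer is cutadapt
) -> List[str]:
    """Build (but do not run) a trimming command for one file or one pair."""
    if len(inputs) not in (1, 2) or len(inputs) != len(outputs):
        raise ValueError("expected one file or one pair, with matching outputs")
    paired = len(inputs) == 2
    cmd = [program]
    # Adapters and primers are both passed as sequences to remove.
    # Simplification: the same sequence is used for both mates.
    for seq in adapters_3p:
        cmd += ["-a", seq]
        if paired:
            cmd += ["-A", seq]
    for seq in adapters_5p:
        cmd += ["-g", seq]
        if paired:
            cmd += ["-G", seq]
    if trim_n:
        cmd.append("--trim-n")
    if quality is not None:
        cmd += ["-q", quality]
    # -u affects the first read only; the second mate has its own
    # option, which this sketch leaves out.
    for n in cut:
        cmd += ["-u", str(n)]
    cmd += ["-o", outputs[0]]
    if paired:
        cmd += ["-p", outputs[1]]
    cmd += list(inputs)
    return cmd


# Usage: pass the result to subprocess.run(cmd, check=True).
example = build_trim_command(["r1.fq", "r2.fq"], ["t1.fq", "t2.fq"],
                             adapters_3p=["AGATCGGAAGAGC"], quality="20,20")
```

One detail worth checking against the cutadapt documentation before relying on `-q <five>`: when cutadapt is given a single cutoff, it applies it to the 3' end.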
Reduction (mapping/assembly): this is where the de novo and mapping paths branch
- Mapping:
  - compile paired, unpaired reads
  - run bwa on the paired and unpaired sets separately
  - [run bwa/other mapper on pacbio reads, then merge bam files]
  - merge paired and unpaired bam files, if they exist
  - sort, index bam file
  - tag the bam file via tagreads
  - run `freebayes` on the bam file
  - create a consensus fasta
- De novo:
  - compile paired and unpaired reads (?)
  - run a de novo assembler (sga and spades are both in bioconda, but not ray)
  - try to figure out the number of reads for each contig
  - maybe simulate iterative_blast
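
The Mapping branch above could be planned as a list of commands before anything is run. This is a sketch under several assumptions not in the issue: `bwa mem` as the bwa mode, samtools for sort/merge/index, an already indexed reference, and hypothetical file names. Since `samtools merge` expects sorted inputs, the sketch sorts each BAM before merging, which differs slightly from the order listed. The tagreads and consensus steps are left as comments because their command lines are not given:

```python
from typing import List, Optional, Sequence


def plan_mapping(
    reference: str,
    paired: Optional[Sequence[str]] = None,  # (R1, R2) compiled paired reads
    unpaired: Optional[str] = None,          # compiled unpaired reads
    prefix: str = "sample",                  # hypothetical output prefix
) -> List[List[str]]:
    """Return the commands for the mapping branch, in execution order."""
    runs = []
    if paired:
        runs.append(("paired", list(paired)))
    if unpaired:
        runs.append(("unpaired", [unpaired]))
    if not runs:
        # Filtering may have left nothing to map.
        raise ValueError("no reads left to map")

    steps: List[List[str]] = []
    bams: List[str] = []
    for label, reads in runs:
        sam = f"{prefix}.{label}.sam"
        bam = f"{prefix}.{label}.bam"
        # Map each read set separately, then sort it into a BAM file.
        steps.append(["bwa", "mem", "-o", sam, reference, *reads])
        steps.append(["samtools", "sort", "-o", bam, sam])
        bams.append(bam)

    if len(bams) > 1:
        # Merge only when both paired and unpaired BAM files exist.
        final = f"{prefix}.bam"
        steps.append(["samtools", "merge", final, *bams])
    else:
        final = bams[0]
    steps.append(["samtools", "index", final])

    # tagreads would run on `final` here (command line not specified).
    steps.append(["freebayes", "-f", reference, "--vcf", f"{prefix}.vcf", final])
    # Creating the consensus fasta from the VCF would follow.
    return steps
```

Each entry can be handed to `subprocess.run(step, check=True)`. Keeping the plan as data makes the "if they exist" condition easy to test without the tools installed.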
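
For the reads-per-contig item in the de novo branch, one possible approach (an assumption, not something the issue settles) is to map the reads back to the assembled contigs and count primary alignments per contig from the SAM output. The function name is hypothetical:

```python
from collections import Counter
from typing import Iterable

# SAM FLAG bits, as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def reads_per_contig(sam_lines: Iterable[str]) -> Counter:
    """Count primary alignment records per contig in SAM text.

    Each mate of a pair is counted as its own record.
    """
    counts: Counter = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        contig = fields[2]
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY) or contig == "*":
            continue  # not a primary alignment to a contig
        counts[contig] += 1
    return counts
```

With a sorted, indexed BAM file, `samtools idxstats` reports similar per-reference counts without custom parsing.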