Proposed Pipeline

**Inputs:**
- [ ] 0-x unpaired fastq/sff files
- [ ] 0-y pairs of illumina paired fastq files
- [ ] [0-z pacbio files]

**Transformation:**
- [x] sff -> fastq

**Filter:**
- [ ] drop reads with Ns
- [x] drop illumina reads whose index read has a minimum quality score below a configurable threshold
- [ ] **De novo**: run host_filter
- Note: individual files may be missing or empty after this step
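
A minimal sketch of the two read-level filters above (drop reads with Ns, drop reads whose index quality is too low). It assumes plain 4-line FASTQ, Phred+33 quality encoding, and that reads and index reads arrive in the same order; the names `parse_fastq`, `filter_reads`, and `min_index_qual` are made up for illustration and are not part of any existing tool.

```python
from typing import List, NamedTuple


class FastqRecord(NamedTuple):
    name: str
    seq: str
    qual: str  # assumed Phred+33 encoded


def parse_fastq(text: str) -> List[FastqRecord]:
    """Parse simple 4-line FASTQ text (no wrapped sequences)."""
    lines = text.strip().splitlines()
    records = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = lines[i:i + 4]
        records.append(FastqRecord(header[1:], seq, qual))
    return records


def min_quality(qual: str, offset: int = 33) -> int:
    """Smallest Phred score in a quality string."""
    return min(ord(c) - offset for c in qual)


def filter_reads(reads, index_reads, min_index_qual: int):
    """Keep reads without Ns whose index read never drops below the threshold."""
    kept = []
    for read, index in zip(reads, index_reads):
        if "N" in read.seq.upper():
            continue  # drop reads with Ns
        if min_quality(index.qual) < min_index_qual:
            continue  # drop reads with a low-quality index
        kept.append(read)
    return kept


reads = parse_fastq("@r1\nACGT\n+\nIIII\n@r2\nACNT\n+\nIIII\n@r3\nGGCC\n+\nIIII\n")
index = parse_fastq("@r1\nAAAA\n+\nIIII\n@r2\nAAAA\n+\nIIII\n@r3\nAAAA\n+\nI#II\n")
kept = filter_reads(reads, index, min_index_qual=20)
print([r.name for r in kept])
```

Since a whole file can end up empty after filtering, downstream steps would need to check for that before mapping or assembly.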

**Transformation (trimming)**:
- [ ] cut 3' and 5' adapters `-a`, `-A`, `-g`, `-G`
- [ ] cut 3' and 5' primers
- [x] trim all N's from end of reads `--trim-n`
- [x] trim low quality bases from 5' and 3' `-q <five>,<three>` or `-q <five>`
- [x] remove X bases from beginning/end of reads `-u <X>`
- [ ] [run something on pacbio reads separately]
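
To make the intent of the trimming steps concrete, here is a rough pure-Python sketch of three of them. The flags listed above look like cutadapt's, but that is an assumption, and this code is only a simplified illustration, not the trimmer's actual implementation. The quality trimming uses a running-sum approach (subtract the cutoff, cut where the partial sum from the 3' end is lowest); all function names are hypothetical.

```python
from typing import List, Tuple


def trim_n_ends(seq: str, qual: str) -> Tuple[str, str]:
    """Strip N bases from both ends of a read, keeping qualities aligned."""
    start, end = 0, len(seq)
    while start < end and seq[start] in "Nn":
        start += 1
    while end > start and seq[end - 1] in "Nn":
        end -= 1
    return seq[start:end], qual[start:end]


def cut_fixed(seq: str, qual: str, n: int) -> Tuple[str, str]:
    """Remove n bases from the start (n > 0) or |n| bases from the end (n < 0)."""
    if n >= 0:
        return seq[n:], qual[n:]
    return seq[:n], qual[:n]


def quality_trim_3prime_index(quals: List[int], cutoff: int) -> int:
    """Return the position at which to cut the 3' end (keep quals[:index])."""
    running = 0
    best_sum = 0
    cut = len(quals)
    for i in range(len(quals) - 1, -1, -1):
        running += quals[i] - cutoff
        if running > 0:
            break  # reached a stretch of good bases; stop scanning
        if running < best_sum:
            best_sum = running
            cut = i
    return cut


print(trim_n_ends("NNACGTN", "!!IIII!"))
print(cut_fixed("ACGTAC", "IIIIII", 2))
print(quality_trim_3prime_index([42, 40, 26, 27, 8, 7, 11, 4, 2, 3], 10))
```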

**Reduction (mapping/assembly)**: -- This is where the de novo and mapping paths branch
- Mapping:
  - [x] compile paired, unpaired reads
  - [ ] run bwa on the compiled paired and unpaired reads, separately
  - [ ] [run bwa/other mapper on pacbio reads, then merge bam files]
  - [ ] merge paired and unpaired bam files, if they exist
  - [ ] sort, index bam file
  - [ ] tag the bam file via tagreads
  - [ ] run `freebayes` on the bam file
  - [ ] create a consensus fasta 
- De novo:
  - [ ] compile paired and unpaired (?)
  - [ ] run a De Novo assembler (sga and spades are both in bioconda, but not ray)
  - [ ] try to figure out the number of reads for each contig
  - [ ] maybe simulate iterative_blast 
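
One possible way to get the number of reads per contig, sketched under assumptions: the reads have been mapped back to the assembled contigs, and the alignments are available as SAM text (for example, streamed from a BAM file). Nothing in this issue fixes that approach; `reads_per_contig` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable

# SAM flag bits for records that should not count as a placed read.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def reads_per_contig(sam_lines: Iterable[str]) -> Counter:
    """Count primary mapped alignments per reference name in SAM text."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        contig = fields[2]
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY) or contig == "*":
            continue
        counts[contig] += 1
    return counts


sam = [
    "@SQ\tSN:contig1\tLN:100",
    "r1\t0\tcontig1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tcontig1\t5\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t256\tcontig2\t1\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r5\t0\tcontig2\t9\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
counts = reads_per_contig(sam)
print(dict(counts))
```

Paired reads are counted once per mate here; counting fragments instead would need an extra check on the pairing flags.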

