V1.0.0 #1

Open
guanqiaofeng wants to merge 6 commits into main from v1.0.0

@guanqiaofeng
Contributor

RNA Seq Alignment Workflow Version 1.0.0
Please refer to the README for test details.

@guanqiaofeng guanqiaofeng self-assigned this Sep 10, 2024
Comment thread CITATIONS.md Outdated

## Pipeline tools

- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)

Unneeded; remove this entry or replace it with the relevant tools.

Comment thread CITATIONS.md
Comment on lines +11 to +18
## Pipeline tools

- [GffRead](https://pubmed.ncbi.nlm.nih.gov/32489650/)

> Pertea G, Pertea M. GFF Utilities: GffRead and GffCompare. F1000Res. 2020 Apr 28;9:ISCB Comm J-304. doi: 10.12688/f1000research.23297.2. eCollection 2020. PubMed PMID: 32489650; PubMed Central PMCID: PMC7222033.

- [HISAT2](https://pubmed.ncbi.nlm.nih.gov/31375807/)

Contributor Author

Updated the pipeline tools.

samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
}

Missing stub block for preview runs.
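For reference, a minimal sketch of what a stub could look like, assuming the usual nf-core module layout. The process name, inputs, output file names and the `samtools sort` command are hypothetical placeholders, not taken from this PR; only the version line is copied from the module above.

```nextflow
// Hypothetical process, for illustration only
process EXAMPLE_SAMTOOLS_STEP {
    input:
    tuple val(meta), path(bam)

    output:
    tuple val(meta), path("*.sorted.bam"), emit: bam
    path "versions.yml"                  , emit: versions

    script:
    def prefix = task.ext.prefix ?: "${meta.id}"
    """
    samtools sort -o ${prefix}.sorted.bam ${bam}

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    END_VERSIONS
    """

    // Runs instead of `script:` when the pipeline is launched with -stub-run:
    // creates empty placeholders matching every declared output.
    stub:
    def prefix = task.ext.prefix ?: "${meta.id}"
    """
    touch ${prefix}.sorted.bam

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    END_VERSIONS
    """
}
```

The stub only needs to create files that satisfy the `output:` patterns, so the workflow wiring can be checked without running the real tool.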

output:
tuple val(meta), path("*.hisat2_Aligned.bam") , emit: bam
tuple val(meta), path("*_summary.txt") , emit: summary
tuple val(meta), path("*fastq.gz"), optional:true, emit: fastq

Is this for unmapped reads?

I'm not sure we're producing unmapped reads:
https://daehwankimlab.github.io/hisat2/manual/#:~:text=in%20the%20input.-,%2D%2Dun%2Dconc,-%3Cpath%3E%2C

Comment thread workflows/rnaaln.nf
genome_annotation: "${params.genome_annotation}",
read_groups_count: "${meta.numLanes}",
study_id : "${meta.study_id}",
date :"${new Date().format("yyyyMMdd")}",

@edsu7 edsu7 Sep 24, 2024

Is the date defined twice? Ideally the date should be set just before payload generation. If it is used earlier, we run the risk that a workflow terminates and the rerun duplicates work because it picks up a new date value.
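A rough sketch of one way to do this, assuming the concern is that a date stored in `meta` changes between runs. The channel contents and the names `ch_aligned`, `payload_date` and `ch_payload_input` are hypothetical and only illustrate the idea.

```nextflow
workflow {
    // Hypothetical upstream channel: meta carries no date, so the inputs of
    // the alignment steps stay the same if the run is resumed on another day.
    ch_aligned = Channel.of( [ [id: 'sample1', study_id: 'TEST'], 'sample1.cram' ] )

    // Resolve the date once, immediately before payload generation.
    def payload_date = new Date().format('yyyyMMdd')

    // Attach the date only to the channel that feeds the payload step.
    ch_payload_input = ch_aligned.map { meta, cram ->
        [ meta + [date: payload_date], cram ]
    }

    ch_payload_input.view()
}
```

With this layout only the payload step sees the date, so a rerun on a later day should not invalidate the cached alignment tasks.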

Comment thread workflows/rnaaln.nf
.set{ch_h_aln_payload}

// Make ALN payload
PAYLOAD_ALIGNMENT_H( // [val (meta), [path(cram),path(crai)],path(analysis_json)]

Minor nitpick about the comment: it should be inline with the variable it describes, e.g.

        PAYLOAD_ALIGNMENT_H( 
            ch_h_aln_payload.upload,  // [val (meta), [path(cram),path(crai)],path(analysis_json)]
            Channel.empty()
            .mix(STAGE_INPUT.out.versions)
            .mix(HISAT2_ALIGN.out.versions)
            .mix(MERG_DUP_H.out.versions)
            .collectFile(name: 'collated_versions.yml')
        )

Comment thread workflows/rnaaln.nf
experiment:"${meta.experiment}",
date:"${meta.date}",
read_group:"${info.read_group.collect()}",
data_type:"${info.data_type.collect()}", // later check whether data type is correct **

Leftover note?

Comment thread workflows/rnaaln.nf
if (params.tools.split(',').contains('hisat2_aln')){

// HISAT2 - ALIGN //
index = Channel.fromPath(params.hisat2_index).collect()

I don't recall: was it decided to add an indexing step to the workflow?

Comment thread workflows/rnaaln.nf
ch_multiqc = Channel.empty()
ch_multiqc = ch_multiqc.mix(ch_reports.collect{meta, report -> report}).ifEmpty([])

ch_multiqc_config = Channel.fromPath("$projectDir/assets/multiqc_config.yml", checkIfExists: true)

When defining variables, it is better to declare all files at the start of the workflow. That makes them easier to manage and read.

Comment thread main.nf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/

params.study_id = WorkflowMain.getGenomeAttribute(params, 'study_id')

@edsu7 edsu7 left a comment

See the comments above.
