Difference of running on Local and on Terra #437

@shengqh

I wrote a WDL for STAR-Fusion: https://github.com/shengqh/warp/blob/develop/pipelines/vumc_biostatistics/rnaseq/VUMCStarFusion.wdl. I tested the workflow on both a local cluster and Terra, using the same raw FASTQ files, the same reference (GRCh38_gencode_v37_CTAT_lib_Mar012021.plug-n-play.tar.gz) and the same parameters. The weird thing is that, in the STAR step, the local cluster run generates a correct Chimeric.out.junction, while the Terra run generates a file that contains only the header and comment lines.

Local cluster result:

>tail -n 3 Chimeric.out.junction
chr2    84906088        +       chr2    235882419       -       0       0       1       A00499:1296:H7HFJDSXF:4:2671:11641:20306:TAGAGAAGG      84906002        8S86M24S    235882395       24M94S  1       118     96      103     103     1       GRPundef
# 2.7.11a   /usr/local/bin/STAR --genomeDir /cromwell-executions/VUMCStarFusion/4267711a-d4b4-42b9-88de-29980b41062d/call-STARFusion/execution/genome_dir/ctat_genome_lib_build_dir/ref_genome.fa.star.idx --outReadsUnmapped None --chimSegmentMin 12 --chimJunctionOverhangMin 8 --chimOutJunctionFormat 1 --alignSJDBoverhangMin 10 --alignMatesGapMax 100000 --alignIntronMax 100000 --alignSJstitchMismatchNmax 5 -1 5 5 --runThreadN 12 --outSAMstrandField intronMotif --outSAMunmapped Within --alignInsertionFlush Right --alignSplicedMateMapLminOverLmate 0 --alignSplicedMateMapLmin 30 --outSAMtype BAM Unsorted --readFilesIn /cromwell-executions/VUMCStarFusion/4267711a-d4b4-42b9-88de-29980b41062d/call-STARFusion/execution/TL-25-ZZ9D50NI4G_T_RSQ1_1.fastq.gz /cromwell-executions/VUMCStarFusion/4267711a-d4b4-42b9-88de-29980b41062d/call-STARFusion/execution/TL-25-ZZ9D50NI4G_T_RSQ1_3.fastq.gz --outSAMattrRGline ID:GRPundef --chimMultimapScoreRange 3 --chimScoreJunctionNonGTAG -4 --chimMultimapNmax 20 --chimOutType Junctions WithinBAM --chimNonchimScoreDropMin 10 --peOverlapNbasesMin 12 --peOverlapMMp 0.1 --genomeLoad NoSharedMemory --twopassMode None --readFilesCommand "gunzip -c" --quantMode GeneCounts
# Nreads 37358191       NreadsUnique 34233825   NreadsMulti 2176332

Terra result:

>zcat TL-25-ZZ9D50NI4G_Chimeric.out.junction.gz
chr_donorA	brkpt_donorA	strand_donorA	chr_acceptorB	brkpt_acceptorB	strand_acceptorB	junction_type	repeat_left_lenA	repeat_right_lenB	read_name	start_alnA	cigar_alnA	start_alnB	cigar_alnB	num_chim_aln	max_poss_aln_score	non_chim_aln_score	this_chim_aln_score	bestall_chim_aln_score	PEmerged_bool	readgrp
# 2.7.11a   /usr/local/bin/STAR --genomeDir /mnt/disks/cromwell_root/genome_dir/ctat_genome_lib_build_dir/ref_genome.fa.star.idx --outReadsUnmapped None --chimSegmentMin 12 --chimJunctionOverhangMin 8 --chimOutJunctionFormat 1 --alignSJDBoverhangMin 10 --alignMatesGapMax 100000 --alignIntronMax 100000 --alignSJstitchMismatchNmax 5 -1 5 5 --runThreadN 12 --outSAMstrandField intronMotif --outSAMunmapped Within --alignInsertionFlush Right --alignSplicedMateMapLminOverLmate 0 --alignSplicedMateMapLmin 30 --outSAMtype BAM Unsorted --readFilesIn /mnt/disks/cromwell_root/TL-25-ZZ9D50NI4G_T_RSQ1_1.fastq.gz /mnt/disks/cromwell_root/TL-25-ZZ9D50NI4G_T_RSQ1_3.fastq.gz --outSAMattrRGline ID:GRPundef --chimMultimapScoreRange 3 --chimScoreJunctionNonGTAG -4 --chimMultimapNmax 20 --chimOutType Junctions WithinBAM --chimNonchimScoreDropMin 10 --peOverlapNbasesMin 12 --peOverlapMMp 0.1 --genomeLoad NoSharedMemory --twopassMode None --readFilesCommand "gunzip -c" --quantMode GeneCounts
# Nreads 37358191       NreadsUnique 34233825   NreadsMulti 2176332
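
A quick way to confirm that a junction file holds nothing but the header and `#` comment lines is to count its line types. This is only a minimal sketch: `count_junction_lines` is a hypothetical helper name, and it assumes the header line starts with `chr_donorA` (as in the Terra output above) and that comment lines start with `#`.

```python
import gzip


def count_junction_lines(path):
    """Count header, comment and data lines in a Chimeric.out.junction file.

    Assumptions (not taken from the STAR documentation): the header line
    starts with 'chr_donorA' and comment lines start with '#'.
    Works on plain-text or gzip-compressed files (detected by the .gz suffix).
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = {"header": 0, "comment": 0, "data": 0}
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.strip():
                continue  # ignore blank lines
            if line.startswith("#"):
                counts["comment"] += 1
            elif line.startswith("chr_donorA"):
                counts["header"] += 1
            else:
                counts["data"] += 1
    return counts
```

Calling `count_junction_lines("TL-25-ZZ9D50NI4G_Chimeric.out.junction.gz")` on the Terra file should report zero `data` lines if it really contains only the header and comments.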

After removing the path differences, the two command lines are identical. I am wondering what could cause this discrepancy. Thanks.
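
For anyone who wants to repeat the command-line comparison, it can be scripted roughly as follows. This is a sketch under assumptions: `normalize` and `differing_tokens` are hypothetical helper names, the two root prefixes are copied from the outputs above, and the example commands are shortened to a few of the arguments for readability.

```python
import shlex
from itertools import zip_longest

# Execution roots as they appear in the two STAR command lines above.
LOCAL_ROOT = (
    "/cromwell-executions/VUMCStarFusion/"
    "4267711a-d4b4-42b9-88de-29980b41062d/call-STARFusion/execution"
)
TERRA_ROOT = "/mnt/disks/cromwell_root"


def normalize(command, root):
    """Replace the environment-specific root with a placeholder, then tokenize."""
    return shlex.split(command.replace(root, "<ROOT>"))


def differing_tokens(local_cmd, terra_cmd):
    """Return (index, local_token, terra_token) for every position that differs."""
    local = normalize(local_cmd, LOCAL_ROOT)
    terra = normalize(terra_cmd, TERRA_ROOT)
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip_longest(local, terra))
        if a != b
    ]


# Shortened example commands (only a subset of the real arguments).
TAIL = (
    "/genome_dir/ctat_genome_lib_build_dir/ref_genome.fa.star.idx"
    ' --readFilesCommand "gunzip -c" --runThreadN 12'
)
local_cmd = "STAR --genomeDir " + LOCAL_ROOT + TAIL
terra_cmd = "STAR --genomeDir " + TERRA_ROOT + TAIL

print(differing_tokens(local_cmd, terra_cmd))
```

An empty list means the commands differ only in their path prefixes. `shlex.split` keeps the quoted `"gunzip -c"` as a single token, so quoting does not produce spurious differences.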

Best,

Tiger
