DEE2 Docker container fails with count file mismatch error #107

@fejung

Hello,

I wanted to run the DEE2 pipeline as a Docker container on some Arabidopsis samples that were not in the database. However, when running the Docker container, I consistently encounter the following error:

An error occurred. Count file line numbers don't match the reference.

This error appears when running Kallisto. Before the Kallisto step begins, I also receive:

/dee2/code/volunteer_pipeline.sh: line 140: 486 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=$FQ1 $FQ2

Initially, I thought this might explain why these samples are not in the data archive. However, I encounter the same error when running the Docker container on the examples provided in the README ("SRR2637695, SRR2637696, SRR2637697, SRR2637698") and on the test example "SRR057750".
The results are the same with the "latest" and "dev" images.

Do you have any ideas about what might be causing this problem?
If relevant, I'm running the Docker container on Windows workstations.
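
In case it helps with triage: in the trace below, every STAR test run segfaults, so ReadsPerGene.out.tab is never written, and the mapping-rate arithmetic quietly yields 0 (or -1 for the clipped runs) instead of stopping. The following is only a minimal sketch of how such a step could fail loudly, not the actual code in volunteer_pipeline.sh. The function names `mapped_count` and `map_rate` are hypothetical, and the assumed file layout (counts in column 2, data starting at line 3) simply mirrors the `cut -f2 | tail -n +3` step visible in the trace.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of volunteer_pipeline.sh.

# Sum column 2 of a count table from line 3 onward (mirrors the
# `cut -f2 | tail -n +3` step in the trace), but return a non-zero
# status when the file is missing instead of silently reporting 0.
mapped_count() {
  local tab="$1"
  if [ ! -r "$tab" ]; then
    echo "ERROR: $tab not found; the aligner may have crashed" >&2
    return 1
  fi
  cut -f2 "$tab" | tail -n +3 | awk '{ s += $1 } END { print s + 0 }'
}

# Percentage of reads mapped, rounded to an integer.
# Prints 0 when the total is 0 to avoid a division by zero.
map_rate() {
  local mapped="$1" total="$2"
  awk -v m="$mapped" -v t="$total" \
    'BEGIN { if (t == 0) { print 0; exit } printf "%.0f\n", m / t * 100 }'
}
```

With helpers like these, a caller could test the exit status of `mapped_count ReadsPerGene.out.tab` and abort right after the failed STAR run, rather than continuing on to Kallisto and failing later with the count file mismatch message.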

Many thanks!

Stack trace of "docker run -it mziemann/tallyup:dev -s ecoli -a SRR27386505 -v"
docker run -it mziemann/tallyup:dev -s ecoli -a SRR27386505 -v
+ VERBOSE=TRUE
+ '[' SRR27386505 '!=' NULL ']'
+ MODE=ACCESSION
+ '[' NULL '!=' NULL ']'
+ '[' FALSE == TRUE ']'
+ DEE_DIR=/dee2
+ export -f main
+ cd /dee2
++ find /dee2/ref/
++ grep '/ensembl/star$'
++ sed 's#\/code\/\.\.##'
+ for DIR in '$(find $DEE_DIR/ref/ | grep /ensembl/star$ | sed '\''s#\/code\/\.\.##'\'' )'
+ --genomeLoad Remove --genomeDir /dee2/ref/ecoli/ensembl/star
++ awk '{print $1+$2}'
+++ free
+++ awk '$1 ~ /Mem:/  {print $2-$3}'
+++ free
+++ awk '$1 ~ /Swap:/  {print $2-$3}'
++ echo 15340872 4194304
+ MEM=19535176
++ grep -c '^processor' /proc/cpuinfo
+ NUM_CPUS=16
++ lscpu
++ grep MHz
++ awk '{print $NF}'
++ head -1
+ CPU_SPEED=3393.685
+ ACC_URL=http://dee2.io/acc.html
+ ACC_REQUEST=http://dee2.io/cgi-bin/acc.sh
+ '[' '!' -z ecoli ']'
++ echo 'athaliana celegans dmelanogaster drerio ecoli hsapiens mmusculus rnorvegicus scerevisiae osativa zmays'
++ tr ' ' '\n'
++ grep -wc ecoli
+ ORG_CHECK=1
+ '[' 1 -ne 1 ']'
++ echo 'athaliana        2853904
celegans        2652204
dmelanogaster   3403644
drerio  14616592
ecoli   1576132
hsapiens        28968508
mmusculus       26069664
rnorvegicus     26913880
scerevisiae     1644684
osativa         8000000
zmays           22000000'
++ grep -w ecoli
++ awk -v f=2 '{print $2*f}'
+ MEM_REQD=3152264
+ '[' 3152264 -gt 19535176 ']'
+ '[' -z ecoli ']'
+ export -f myfunc
+ TESTFILE=test_pass
+ '[' '!' -r test_pass ']'
+ echo

+ '[' ACCESSION == FASTQ ']'
+ '[' ACCESSION == SRA_ARCHIVE ']'
+ '[' ACCESSION == ACCESSION ']'
++ echo SRR27386505
++ tr , '\n'
++ cut -c2-3
++ grep -vc RR
+ TESTACCESSIONS=0
+ '[' 0 -eq 0 ']'
++ echo SRR27386505
++ tr , ' '
+ for USER_ACCESSION in '$(echo $MY_ACCESSIONS | tr '\'','\'' '\'' '\'')'
++ pwd
+ DIR=/dee2
+ echo Starting pipeline with species ecoli and accession SRR27386505
Starting pipeline with species ecoli and accession SRR27386505
+ main ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ echo ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ tr ' ' '\n'
++ grep VERBOSE
++ cut -d = -f2
+ VERBOSE=TRUE
+ '[' '!' -z TRUE ']'
+ '[' TRUE == TRUE ']'
+ set -x
++ echo ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ tr ' ' '\n'
++ grep THREADS
++ cut -d = -f2
+ THREADS=8
+ export -f exit1
++ echo ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ tr ' ' '\n'
++ grep ORG
++ cut -d = -f2
+ ORG=ecoli
++ echo ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ tr ' ' '\n'
++ grep -c ACCESSION
+ MODE=1
+ '[' 1 -eq 1 ']'
+ MODE=ACCESSION
++ echo ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ tr ' ' '\n'
++ grep ACCESSION
++ cut -d = -f2
+ SRR=SRR27386505
+ SRR_FILE=SRR27386505.sra
+ echo SRR27386505
SRR27386505
+ wget -O SRR27386505.html https://www.ncbi.nlm.nih.gov/sra/SRR27386505
--2025-05-24 13:35:57--  https://www.ncbi.nlm.nih.gov/sra/SRR27386505
Resolving www.ncbi.nlm.nih.gov (www.ncbi.nlm.nih.gov)... 130.14.29.110, 2607:f220:41e:4290::110, 2607:f220:41e:4290::110
Connecting to www.ncbi.nlm.nih.gov (www.ncbi.nlm.nih.gov)|130.14.29.110|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: 'SRR27386505.html'

SRR27386505.html                  [ <=>                                              ]  47.17K  --.-KB/s    in 0.09s

2025-05-24 13:35:58 (512 KB/s) - 'SRR27386505.html' saved [48307]

++ echo ecoli
++ cut -c2-
+ ORG2=coli
++ sed 's/class=/\n/g' SRR27386505.html
++ grep Organism:
++ grep -c coli
+ ORG_OK=1
+ rm SRR27386505.html
+ '[' 1 -ne 1 ']'
+ echo User input species and SRA metadata match. OK.
User input species and SRA metadata match. OK.
+ cd /dee2
+ CODE_DIR=/dee2/code
+ PIPELINE=/dee2/code/volunteer_pipeline.sh
++ md5sum /dee2/code/volunteer_pipeline.sh
++ cut -d ' ' -f1
+ PIPELINE_MD5=bf64e7889548dca6262f0da0a1d6d9ed
+ SW_DIR=/dee2/sw
+ PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/dee2/sw
+ DATA_DIR=/dee2/data/ecoli
+ REF_DIR=/dee2/ref
+ QC_DIR=/dee2/qc
+ BOWTIE2_BUILD=/usr/local/bin/bowtie2-build
+ KALLISTO=/usr/local/bin/kallisto
+ STAR=/usr/local/bin/STAR
+ PREFETCH=/usr/local/bin/prefetch
+ VDB_VALIDATE=/usr/local/bin/vdb-validate
+ FASTQ_DUMP=/usr/local/bin/fastq-dump
+ FASTQC=/usr/local/bin/fastqc
+ NUMAVERAGE=/usr/bin/numaverage
+ NUMROUND=/usr/bin/numround
+ NUMSUM=/usr/bin/numsum
+ PARALLEL_FASTQ_DUMP=/usr/local/bin/parallel-fastq-dump
+ PBZIP2=/usr/bin/pbzip2
+ SKEWER=/usr/local/bin/skewer
+ MINION=/usr/local/bin/minion
+ UNSORT=/usr/bin/unsort
+ FASTX_TRIMMER=/usr/bin/fastx_trimmer
+ FASTQPAIRER=/dee2/code/FastqPairer.pl
+ DISKLIM=32000000
+ DLLIM=1
+ ALNLIM=2
+ MEMALNLIM=4
++ df .
++ awk 'END{print$4}'
+ DISK=990270644
++ awk '{print $1+$2}'
+++ free
+++ awk '$1 ~ /Mem:/  {print $2-$3}'
+++ free
+++ awk '$1 ~ /Swap:/  {print $2-$3}'
++ echo 15335736 4194304
+ MEM=19530040
+ '[' 990270644 -lt 32000000 ']'
+ '[' '!' -d /dee2/qc ']'
+ MYREF_DIR=/dee2/ref/ecoli/ensembl/
+ '[' '!' -d /dee2/ref/ecoli/ensembl/ ']'
+ echo ecoli
ecoli
+ '[' ecoli == athaliana ']'
+ '[' ecoli == celegans ']'
+ '[' ecoli == dmelanogaster ']'
+ '[' ecoli == drerio ']'
+ '[' ecoli == ecoli ']'
+ GTFURL=ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/gtf/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf.gz
+ GDNAURL=ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/fasta/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/dna/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.dna_sm.chromosome.Chromosome.fa.gz
+ CDNAURL=ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/fasta/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/cdna/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.gz
+ BT2_MD5=9fd53f70df3ba54b851713a514ef3412
+ KAL_MD5=dbc74ab4fa8d55d5f3e88476dc5cc32e
+ STAR_MD5=49dfb0bef4e1c0e34503dc995b9456e5
++ basename ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/gtf/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf.gz .gz
+ GTF=/dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf
+ '[' -z /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf ']'
++ basename ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/fasta/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/dna/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.dna_sm.chromosome.Chromosome.fa.gz .gz
+ GDNA=/dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.dna_sm.chromosome.Chromosome.fa
+ '[' -z /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.dna_sm.chromosome.Chromosome.fa ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.dna_sm.chromosome.Chromosome.fa ']'
++ basename ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/fasta/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/cdna/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.gz .gz
+ CDNA=/dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa
+ '[' -z /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa ']'
+ BT2_DIR=/dee2/ref/ecoli/ensembl//bowtie2
+ '[' '!' -d /dee2/ref/ecoli/ensembl//bowtie2 ']'
++ basename /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa
+ BT2_REF=/dee2/ref/ecoli/ensembl//bowtie2/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa
+ '[' -z /dee2/ref/ecoli/ensembl//bowtie2/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//bowtie2/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa ']'
+ KAL_DIR=/dee2/ref/ecoli/ensembl//kallisto
+ '[' '!' -d /dee2/ref/ecoli/ensembl//kallisto ']'
++ basename /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa
+ KAL_REF=/dee2/ref/ecoli/ensembl//kallisto/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.idx
+ '[' -z /dee2/ref/ecoli/ensembl//kallisto/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.idx ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//kallisto/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.idx ']'
+ STAR_DIR=/dee2/ref/ecoli/ensembl//star
+ '[' '!' -d /dee2/ref/ecoli/ensembl//star ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//star/SA ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//star/SAindex ']'
+ '[' '!' -d /dee2/data/ecoli ']'
+ cd /dee2/data/ecoli
+ '[' ACCESSION '!=' FASTQ ']'
+ mkdir SRR27386505
+ cp /dee2/code/volunteer_pipeline.sh SRR27386505
+ cd SRR27386505
+ echo 'Starting /dee2/code/volunteer_pipeline.sh SRR27386505
    current disk space = 990270644
    free memory = 19530040 '
+ tee -a SRR27386505.log
Starting /dee2/code/volunteer_pipeline.sh SRR27386505
    current disk space = 990270644
    free memory = 19530040
+ ATTEMPTS=SRR27386505.attempts.txt
+ '[' -r SRR27386505.attempts.txt ']'
++ date +%Y-%m-%d:%H:%M:%S
+ DATE=2025-05-24:13:35:58
+ echo /dee2/code/volunteer_pipeline.sh bf64e7889548dca6262f0da0a1d6d9ed 2025-05-24:13:35:58
+ echo SRR27386505 check if SRA file exists and download if neccessary
SRR27386505 check if SRA file exists and download if neccessary
+ '[' ACCESSION == SRA_ARCHIVE ']'
+ '[' '!' -f SRR27386505.sra ']'
+ /usr/local/bin/prefetch -X 9999999999999 SRR27386505

2025-05-24T13:35:59 prefetch.3.0.1: Current preference is set to retrieve SRA Normalized Format files with full base quality scores.
2025-05-24T13:36:00 prefetch.3.0.1: 1) Downloading 'SRR27386505'...
2025-05-24T13:36:00 prefetch.3.0.1: SRA Normalized Format file is being retrieved, if this is different from your preference, it may be due to current file availability.
2025-05-24T13:36:00 prefetch.3.0.1:  Downloading via HTTPS...
2025-05-24T13:36:31 prefetch.3.0.1:  HTTPS download succeed
2025-05-24T13:36:32 prefetch.3.0.1:  'SRR27386505' is valid
2025-05-24T13:36:32 prefetch.3.0.1: 1) 'SRR27386505' was downloaded successfully
2025-05-24T13:36:32 prefetch.3.0.1: 'SRR27386505' has 0 unresolved dependencies
+ mv /dee2/ncbi/public/sra/SRR27386505.sra .
+ echo SRR27386505 Validate the SRA file
SRR27386505 Validate the SRA file
+ echo SRR27386505 SRAfilesize
+ tee -a SRR27386505.log
SRR27386505 SRAfilesize
+ md5sum SRR27386505.sra
+ tee -a SRR27386505.log
57aa8ef6910f25c78d9d5c503ba89fe1  SRR27386505.sra
++ /usr/local/bin/vdb-validate SRR27386505.sra
++ head -4
++ awk '{print $NF}'
++ grep -c ok
+ VALIDATE_SRA=4
+ '[' 4 -eq 4 ']'
+ echo SRR27386505.sra file validated
+ tee -a SRR27386505.log
SRR27386505.sra file validated
+ echo SRR27386505 diagnose basespace colorspace, single/paired-end and read length
SRR27386505 diagnose basespace colorspace, single/paired-end and read length
+ /usr/local/bin/fastq-dump -X 4000 --split-files SRR27386505.sra
Read 4000 spots for SRR27386505.sra
Written 4000 spots for SRR27386505.sra
++ ls
++ grep SRR27386505
++ grep -v trimmed.fastq
++ grep -c 'fastq$'
+ NUM_FQ=2
+ '[' 2 -eq 1 ']'
+ '[' 2 -eq 2 ']'
+ ORIG_RDS=PE
+ RDS=PE
+ echo SRR27386505 is paired end
+ tee -a SRR27386505.log
SRR27386505 is paired end
++ ls
++ grep SRR27386505
++ grep -m1 'fastq$'
+ FQ1=SRR27386505_1.fastq
+ echo

+ echo Starting FastQC analysis of SRR27386505_1.fastq
Starting FastQC analysis of SRR27386505_1.fastq
+ /usr/local/bin/fastqc -t 8 SRR27386505_1.fastq
Started analysis of SRR27386505_1.fastq
Approx 25% complete for SRR27386505_1.fastq
Approx 50% complete for SRR27386505_1.fastq
Approx 75% complete for SRR27386505_1.fastq
Approx 100% complete for SRR27386505_1.fastq
Analysis complete for SRR27386505_1.fastq
++ basename SRR27386505_1.fastq .fastq
+ FQ1BASE=SRR27386505_1
++ unzip -p SRR27386505_1_fastqc SRR27386505_1_fastqc/fastqc_data.txt
++ grep 'File type'
++ cut -f2
++ awk '{print $1}'
+ BASECALL_ENCODING=Conventional
+ '[' Conventional == Colorspace ']'
+ '[' Conventional == Conventional ']'
+ CSPACE=FALSE
+ echo SRR27386505 is conventional basespace
+ tee -a SRR27386505.log
SRR27386505 is conventional basespace
++ unzip -p SRR27386505_1_fastqc SRR27386505_1_fastqc/fastqc_data.txt
++ grep -wm1 '^Encoding'
++ cut -f2
++ tr -d ' '
+ QUALITY_ENCODING=Sanger/Illumina1.9
++ unzip -p SRR27386505_1_fastqc.zip SRR27386505_1_fastqc/fastqc_data.txt
++ grep 'Sequence length'
++ cut -f2
+ FQ1_LEN=150
+ echo SRR27386505 read1 length is 150 nt
+ tee -a SRR27386505.log
SRR27386505 read1 length is 150 nt
+ unzip -p SRR27386505_1_fastqc.zip SRR27386505_1_fastqc/fastqc_data.txt
+ rm SRR27386505_1_fastqc.zip SRR27386505_1_fastqc.html
++ sed -n 2~4p SRR27386505_1.fastq
++ awk '{print length($1)}'
++ sort -g
++ head -1
+ FQ1_MIN_LEN=150
++ sed -n 2~4p SRR27386505_1.fastq
++ awk '{print length($1)}'
++ /usr/bin/numaverage -M
+ FQ1_MEDIAN_LEN=150
++ sed -n 2~4p SRR27386505_1.fastq
++ awk '{print length($1)}'
++ sort -gr
++ head -1
+ FQ1_MAX_LEN=150
+ FQ2_MIN_LEN=NULL
+ FQ2_MEDIAN_LEN=NULL
+ FQ2_MAX_LEN=NULL
+ '[' PE == PE ']'
++ ls
++ grep SRR27386505
++ grep 'fastq$'
++ sed -n 2p
+ FQ2=SRR27386505_2.fastq
+ echo

+ echo Starting FastQC analysis of SRR27386505_2.fastq
Starting FastQC analysis of SRR27386505_2.fastq
+ /usr/local/bin/fastqc -t 8 SRR27386505_2.fastq
Started analysis of SRR27386505_2.fastq
Approx 25% complete for SRR27386505_2.fastq
Approx 50% complete for SRR27386505_2.fastq
Approx 75% complete for SRR27386505_2.fastq
Approx 100% complete for SRR27386505_2.fastq
Analysis complete for SRR27386505_2.fastq
++ basename SRR27386505_2.fastq .fastq
+ FQ2BASE=SRR27386505_2
++ unzip -p SRR27386505_2_fastqc.zip SRR27386505_2_fastqc/fastqc_data.txt
++ grep 'Sequence length'
++ cut -f2
+ FQ2_LEN=150
+ echo SRR27386505 read2 length is 150 nt
+ tee -a SRR27386505.log
SRR27386505 read2 length is 150 nt
+ unzip -p SRR27386505_2_fastqc.zip SRR27386505_2_fastqc/fastqc_data.txt
+ rm SRR27386505_2_fastqc.zip SRR27386505_2_fastqc.html
++ sed -n 2~4p SRR27386505_2.fastq
++ awk '{print length($1)}'
++ sort -g
++ head -1
+ FQ2_MIN_LEN=150
++ sed -n 2~4p SRR27386505_2.fastq
++ awk '{print length($1)}'
++ /usr/bin/numaverage -M
+ FQ2_MEDIAN_LEN=150
++ sed -n 2~4p SRR27386505_2.fastq
++ awk '{print length($1)}'
++ sort -gr
++ head -1
+ FQ2_MAX_LEN=150
+ [[ 150 -lt 20 ]]
+ '[' FALSE == TRUE ']'
+ echo SRR27386505 Dump the fastq file
SRR27386505 Dump the fastq file
+ rm SRR27386505_1.fastq SRR27386505_2.fastq
+ '[' FALSE == FALSE ']'
+ /usr/local/bin/parallel-fastq-dump --threads 8 --outdir . --split-files --defline-qual + -s SRR27386505.sra
+ '[' PE == PE ']'
+ [[ 150 -ge 20 ]]
+ [[ 150 -lt 20 ]]
+ [[ 150 -lt 20 ]]
++ du -s SRR27386505_1.fastq
++ cut -f1
+ FILESIZE=3907012
+ FILESIZE=3907012
+ echo SRR27386505 file size 3907012
+ tee -a SRR27386505.log
SRR27386505 file size 3907012
+ rm SRR27386505.sra
+ '[' 3907012 -eq 0 ']'
+ echo SRR27386505 completed basic pipeline successfully
+ tee -a SRR27386505.log
SRR27386505 completed basic pipeline successfully
+ echo SRR27386505 Quality trimming
SRR27386505 Quality trimming
+ '[' PE == SE ']'
+ '[' PE == PE ']'
+ /usr/local/bin/skewer -f sanger -l 18 -q 10 -k inf -t 8 -o SRR27386505 SRR27386505_1.fastq SRR27386505_2.fastq
.--. .-.
: .--': :.-.
`. `. : `'.' .--. .-..-..-. .--. .--.
_`, :: . `.' '_.': `; `; :' '_.': ..'
`.__.':_;:_;`.__.'`.__.__.'`.__.':_;
skewer v0.2.2 [April 4, 2016]
Parameters used:
-- 3' end adapter sequence (-x):        AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
-- paired 3' end adapter sequence (-y): AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
-- maximum error ratio allowed (-r):    0.100
-- maximum indel error ratio allowed (-d):      0.030
-- end quality threshold (-q):          10
-- minimum read length allowed after trimming (-l):     18
-- file format (-f):            Sanger/Illumina 1.8+ FASTQ
-- number of concurrent threads (-t):   8
Sat May 24 13:38:32 2025 >> started
|=================================================>| (100.00%)
Sat May 24 13:39:15 2025 >> done (43.802s)
10628436 read pairs processed; of these:
      68 ( 0.00%) short read pairs filtered out after trimming by size control
      23 ( 0.00%) empty read pairs filtered out after trimming by size control
10628345 (100.00%) read pairs available; of these:
 1762375 (16.58%) trimmed read pairs available after processing
 8865970 (83.42%) untrimmed read pairs available after processing
log has been saved to "SRR27386505-trimmed.log".
+ rm SRR27386505_1.fastq SRR27386505_2.fastq
+ FQ1=SRR27386505-trimmed-pair1.fastq
+ FQ2=SRR27386505-trimmed-pair2.fastq
+ '[' '!' -f SRR27386505-trimmed.log ']'
++ grep 'processed; of these:' SRR27386505-trimmed.log
++ awk '{print $1}'
+ READ_CNT_TOTAL=10628436
++ grep 'available; of these:' SRR27386505-trimmed.log
++ awk '{print $1}'
+ READ_CNT_AVAIL=10628345
+ '[' -z 10628345 ']'
+ cat SRR27386505-trimmed.log
+ rm SRR27386505-trimmed.log
+ '[' 10628345 -eq 0 ']'
+ echo 10628345 reads passed initial QC
+ tee -a SRR27386505.log
10628345 reads passed initial QC
+ echo SRR27386505 adapter diagnosis
SRR27386505 adapter diagnosis
+ ADAPTER_THRESHOLD=2
+ '[' PE == SE ']'
+ '[' PE == PE ']'
+ MINION_LOG=SRR27386505-trimmed-pair1.fastq.minion.log
+ /usr/local/bin/minion search-adapter -i SRR27386505-trimmed-pair1.fastq
[minion] reading reads
.................................................  1
.................................................  2
[minion] connected component analysis
[minion] building consensus sequences
++ head SRR27386505-trimmed-pair1.fastq.minion.log
++ grep -m1 sequence=
++ cut -d = -f2
+ ADAPTER1=AACACGTCTTTAGAGCCATCGTCAGGAGTGATGAAGCCGAAGCCTTTGTCAGCGTTGAACCATTTTACGATACCAGTCATTTTACCGGACATAGTGTATTACCTTTAATAATTAAGTGTGCCTTTCGGCGATATGGCGTGCTTTACAGATTTTGAAGCGTTAAAGGAATGTGCACTACGAGGGGTATCAACGATAACTCTTGAAGGGACTTGCCTTACTACACTGCTTTAATGGTCTGTACG
++ head SRR27386505-trimmed-pair1.fastq.minion.log
++ grep -m1 sequence-density=
++ cut -d = -f2
++ /usr/bin/numround -c
+ DENSITY1=1
+ cat SRR27386505-trimmed-pair1.fastq.minion.log
+ tee -a SRR27386505.log


criterion=sequence-density
sequence-density=0.54
sequence-density-rank=1
fanout-score=1.94
fanout-score-rank=44
prefix-density=1.04
prefix-fanout=1.0
sequence=AACACGTCTTTAGAGCCATCGTCAGGAGTGATGAAGCCGAAGCCTTTGTCAGCGTTGAACCATTTTACGATACCAGTCATTTTACCGGACATAGTGTATTACCTTTAATAATTAAGTGTGCCTTTCGGCGATATGGCGTGCTTTACAGATTTTGAAGCGTTAAAGGAATGTGCACTACGAGGGGTATCAACGATAACTCTTGAAGGGACTTGCCTTACTACACTGCTTTAATGGTCTGTACG


criterion=fanout-score
sequence-density=0.08
sequence-density-rank=31
fanout-score=145.06
fanout-score-rank=1
prefix-density=0.90
prefix-fanout=12.2
sequence=TTCAGCCAGAATGGTTGCCGCACGACGAATCGCCTCTTCAGGATCGATTGTGCCGTTGGTTTCCATTTCGATGACCAGCTTGTCCAGGTCGGTACGCTGTTCTACACGCGCTGCTTCAACATTGTAGGCAATACGCTCCACAGGGCTGTAGCATGCGTCGACCAGCAGACGGCCGATTGGGCGCTCATCTTCTTCCGAATGAATTCGGGTAGAAGCCGGCACATAACCACGACCGCGCTGAACTTTGATACGCATGCTAATAGACGCGTTCTCATCGGTCAGGTGGCAGATCACGTGCTGCGGCTTGACGATTTCGACATCACCGTCGTGGGTGATATCGGCTGCAGTCACAGGGCCAATGCCAGATTTATTCAAGGTAAGAATAACTTCATCTTTGCCCTGAACTCTCACCGCCAGCCCTTTCAGGTTGAGCAGGATTTCCAGGATATCTT
+ rm SRR27386505-trimmed-pair1.fastq.minion.log
+ MINION_LOG=SRR27386505-trimmed-pair2.fastq.minion.log
+ /usr/local/bin/minion search-adapter -i SRR27386505-trimmed-pair2.fastq
[minion] reading reads
.................................................  1
.................................................  2
[minion] connected component analysis
[minion] building consensus sequences
++ head SRR27386505-trimmed-pair2.fastq.minion.log
++ grep -m1 sequence=
++ cut -d = -f2
+ ADAPTER2=AGGTTCGAATCCT
++ head SRR27386505-trimmed-pair2.fastq.minion.log
++ grep -m1 sequence-density=
++ cut -d = -f2
++ /usr/bin/numround -c
+ DENSITY2=1
+ cat SRR27386505-trimmed-pair2.fastq.minion.log
+ tee -a SRR27386505.log


criterion=sequence-density
sequence-density=0.38
sequence-density-rank=1
fanout-score=3.64
fanout-score-rank=29
prefix-density=0.40
prefix-fanout=3.5
sequence=AGGTTCGAATCCT


criterion=fanout-score
sequence-density=0.01
sequence-density-rank=34
fanout-score=83.69
fanout-score-rank=1
prefix-density=0.13
prefix-fanout=6.1
sequence=TGGTGGTTGAATACCCGGCGTAATGTTAACCGTCTTGCGATAACAGGTCGCTACGAGTAGAATACTGCCGCTTAACGTCGCGTAAATTGTTTAACACTTTGCGTAACGTACACTGGGATCGCTGAATTAGAGATCGGCGTCCTTTCATTCTATATACTTTGGAGTTTTAAAATGTCTCTAAGTACTGAAGCAACAGCTAAAATCGTTTCTGAGTTTGGTCGTGACGCAAACGACACCGGTTCTACCGAAGTTCAGGTAGCACTGCTGACTGCACAGATC
+ rm SRR27386505-trimmed-pair2.fastq.minion.log
++ echo 1 1
++ awk '{print ($1+$2)/2}'
++ /usr/bin/numround
+ DENSITY=1
+ [[ ! -z 1 ]]
+ '[' 1 -gt 2 ']'
++ echo 10628345 10628436
++ awk '{print $1/$2*100"%"}'
+ QC_PASS_RATE=99.9991%
++ du -s SRR27386505-trimmed-pair1.fastq
++ cut -f1
+ FQSIZE=3836504
+ '[' 3836504 -eq 0 ']'
+ echo SRR27386505 Starting mapping phase
SRR27386505 Starting mapping phase
+ '[' PE == PE ']'
+ echo SRR27386505 testing PE reads STAR mapping to Ensembl genome
+ tee -a SRR27386505.log
SRR27386505 testing PE reads STAR mapping to Ensembl genome
+ head SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
==> SRR27386505-trimmed-pair1.fastq <==
@SRR27386505.1 A00202:1350:HH7CFDSX7:1:1101:26756:1016 length=150
GAAAGAGTCACATCTTAACGGTGAAGCTGAAGTAGAAAAACGTGTTACAGCATCAGTTGGCTCGTGGATCAAGCGACTCAATAGTTGGCTGCGAAAAGAGTTTTAATTTTTATTAGGCCGACGATGATTACGGCCTCAGGCGACAGGC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.2 A00202:1350:HH7CFDSX7:1:1101:27606:1016 length=150
GNGTTGAACCATTTTACGATACCAGTCATTTTACCGGACATAGTGTATTACCTTTAATAATTAAGTGTGCCTTTCGGCGATATGGCGTGCTTTACAGATTTTGAAGCGTTAAAGGAATGTGCACTACGAGGGGTATCAACGATAACTCTT
+
F#FFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.3 A00202:1350:HH7CFDSX7:1:1101:30481:1016 length=150
GNTTGTTCACACCAGATTAATGAACAGTTCTGGGAGAATGCGGTGACCGGAATAATACGATAGTTCATACTGCCCCTGTTTCGTTAAGCAATTACTACCAGTGCCGTGCTGGCCCGGTATCAATATGCACAAAGTTACTACGTGGATAAT

==> SRR27386505-trimmed-pair2.fastq <==
@SRR27386505.1 A00202:1350:HH7CFDSX7:1:1101:26756:1016 length=150
GCCTGTCGCCTGAGGCCGTAATCATCGTCGGCCTCATAAAAATTAAAACTCTTTTCGCAGCCAACTATTGAGTCGCTTGATCCACGAGCCAACTGATGCTGTAACAAGTTTTTCTACATCAGCTTCACCGTTAAGATGTGACTCTTTA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF,FFF:F:FF:FF,:FF:FF:FFFFF,,FFFFF,FF,,F:F,:,F:FFF:::FF,F,FF,F,FFF:F,
@SRR27386505.2 A00202:1350:HH7CFDSX7:1:1101:27606:1016 length=150
CGTACAGACCATTAAAGCAGTGTAGTAAGGCAAGTCCCTTCAAGAGTTATCGTTGATACCCCTCGTAGTGCACATTCCTTTAACGCTTCAAAATCTGTAAAGCACGCCATATCGCCGAAAGGCACACTTAATTATTAAAGGTAATACACT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFF,FFFFFFFFFFFFFFFFF:FFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFF,F,FF:FF,FFFFFFFFFFFF,
@SRR27386505.3 A00202:1350:HH7CFDSX7:1:1101:30481:1016 length=150
GATTTCCATATTGAAGGTATCGCGTTAAGCAATATTCGCAAAGCCGCGTTATCTATGCGCGCAGGTGGTGTAGGATATTATCCACGTAGTAACTTTGTGCATATTGATACCGGGCCAGCACGGCACTGGTAGTAATTGCTTAACGAAACA
+ tail SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
==> SRR27386505-trimmed-pair1.fastq <==
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.10628435 A00202:1350:HH7CFDSX7:1:2678:1579:37043 length=150
GTCGCTTAAAATACGCAGGCCCGTGATTGCCCATTTGGTGCAGCATGATCAGCATATCTTTGCCGTTATTGGCAGCGACAAAGTCATCTAAGCCAACGAGCATACCGACATCGCGGCATTCGTTATAAGGATTGGTGTTGCAGATGGCGT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF:FFFFFFFFFFFFFFFFFF:FFFFF,FFFFFFF::FFFFFFFFFFFFFFFF,
@SRR27386505.10628436 A00202:1350:HH7CFDSX7:1:2678:5014:37043 length=150
GGGAGGAATAAAAAAAACCTTACAATCACTGTAGAAATTCTTTTATACAGCTAATTGATGTGGTCTTTTACTCCTTTCTATAACCTTTTGTCAACTTTAACAAAAGTTTCTTCACATTAGTTTACATAATATCAACACCATTAGCATTTA
+
FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

==> SRR27386505-trimmed-pair2.fastq <==
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFF:F:F,F:FF:FFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.10628435 A00202:1350:HH7CFDSX7:1:2678:1579:37043 length=150
GCGCAATTTGCCGATTATAAATCCGCGACCAACAACGCCATCTGCAACACCAATCCTTATAACGAATGCCGCGATGTCGGTATGCTCGTTGGCTTAGATGACTTTGTCGCTGCCAATAACGGCAAAGATATGCTGATCATGCTGCACCAA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFF
@SRR27386505.10628436 A00202:1350:HH7CFDSX7:1:2678:5014:37043 length=150
GCATTAAATGCTAATGGTGTTGATATTATGTAAACTAATGTGAAGAAACTTTTGTTAAAGTTGACAAAAGGTTATAGAAAGGAGTAAAAGACCACATCAATTAGCTGTATAAAAGAATTTATACAGTGATTGTAAGGTTTTTTTTATTCC
+
FFFFFFFFFFFFFFFF:FFFFF:FFFFFFFFFFFFFFFFFFF:FF,FFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFF,FF:F:FFFFFFFFFFF:FFFFFFFFFF,F
+ head -10000 SRR27386505-trimmed-pair1.fastq
+ head -1000000 SRR27386505-trimmed-pair1.fastq
+ tail -90000
+ head -10000 SRR27386505-trimmed-pair2.fastq
+ head -1000000 SRR27386505-trimmed-pair2.fastq
+ tail -90000
+ /usr/bin/fastx_trimmer -f 5 -m 18 -Q 33 -i test_R1.fq
+ /usr/bin/fastx_trimmer -f 5 -m 18 -Q 33 -i test_R2.fq
+ /usr/bin/fastx_trimmer -f 9 -m 18 -Q 33 -i test_R1.fq
+ /usr/bin/fastx_trimmer -f 9 -m 18 -Q 33 -i test_R2.fq
+ /usr/bin/fastx_trimmer -f 13 -m 18 -Q 33 -i test_R1.fq
+ /usr/bin/fastx_trimmer -f 13 -m 18 -Q 33 -i test_R2.fq
+ /usr/bin/fastx_trimmer -f 21 -m 18 -Q 33 -i test_R1.fq
+ wait
+ /usr/bin/fastx_trimmer -f 21 -m 18 -Q 33 -i test_R2.fq
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R1.fq
/dee2/code/volunteer_pipeline.sh: line 140:   357 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R1.fq > /dev/null 2>&1
++ sed -n 2~4p
++ wc -l
+ R1_RD_CNT=25000
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ cut -f2 ReadsPerGene.out.tab
++ head -1
cut: ReadsPerGene.out.tab: No such file or directory
+ UNMAPPED_CNT=
++ echo 0 25000
++ awk '{print $1/$2*100}'
++ /usr/bin/numround
+ R1_MAP_RATE=0
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R2.fq
/dee2/code/volunteer_pipeline.sh: line 140:   372 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R2.fq > /dev/null 2>&1
++ sed -n 2~4p
++ wc -l
+ R2_RD_CNT=25000
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ cut -f2 ReadsPerGene.out.tab
++ head -1
cut: ReadsPerGene.out.tab: No such file or directory
+ UNMAPPED_CNT=
++ echo 0 25000
++ awk '{print $1/$2*100}'
++ /usr/bin/numround
+ R2_MAP_RATE=0
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R1_clip4.fq
/dee2/code/volunteer_pipeline.sh: line 140:   387 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R1_clip4.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R1_MAP_RATE_CLIP4=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R2_clip4.fq
/dee2/code/volunteer_pipeline.sh: line 140:   396 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R2_clip4.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R2_MAP_RATE_CLIP4=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R1_clip8.fq
/dee2/code/volunteer_pipeline.sh: line 140:   405 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R1_clip8.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R1_MAP_RATE_CLIP8=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R2_clip8.fq
/dee2/code/volunteer_pipeline.sh: line 140:   414 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R2_clip8.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R2_MAP_RATE_CLIP8=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R1_clip12.fq
/dee2/code/volunteer_pipeline.sh: line 140:   423 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R1_clip12.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R1_MAP_RATE_CLIP12=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R2_clip12.fq
/dee2/code/volunteer_pipeline.sh: line 140:   432 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R2_clip12.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R2_MAP_RATE_CLIP12=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R1_clip20.fq
/dee2/code/volunteer_pipeline.sh: line 140:   441 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R1_clip20.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R1_MAP_RATE_CLIP20=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R2_clip20.fq
/dee2/code/volunteer_pipeline.sh: line 140:   450 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R2_clip20.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R2_MAP_RATE_CLIP20=-1
+ rm test_R1.fq test_R2.fq test_R1_clip4.fq test_R2_clip4.fq test_R1_clip8.fq test_R2_clip8.fq test_R1_clip12.fq test_R2_clip12.fq test_R1_clip20.fq test_R2_clip20.fq ReadsPerGene.out.tab
rm: cannot remove 'ReadsPerGene.out.tab': No such file or directory
++ echo 0:0 -1:4 -1:8 -1:12 -1:20
++ tr ' ' '\n'
++ sort -gr
++ head -1
++ cut -d : -f2
+ R1_CLIP_NUM=0
++ echo 0:0 -1:4 -1:8 -1:12 -1:20
++ tr ' ' '\n'
++ sort -gr
++ head -1
++ cut -d : -f1
+ R1_MAP_RATE=0
++ echo 0:0 -1:4 -1:8 -1:12 -1:20
++ tr ' ' '\n'
++ sort -gr
++ head -1
++ cut -d : -f2
+ R2_CLIP_NUM=0
++ echo 0:0 -1:4 -1:8 -1:12 -1:20
++ tr ' ' '\n'
++ sort -gr
++ head -1
++ cut -d : -f1
+ R2_MAP_RATE=0
+ [[ 0 -gt 0 ]]
+ [[ 0 -gt 0 ]]
+ R1R2_DIFF=0
+ '[' 0 -lt 40 -a 0 -ge 20 ']'
+ R2R1_DIFF=0
+ '[' 0 -lt 40 -a 0 -ge 20 ']'
+ [[ 0 -gt 15 ]]
+ [[ 0 -gt 15 ]]
+ '[' PE == SE ']'
+ '[' PE == PE ']'
+ head SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
==> SRR27386505-trimmed-pair1.fastq <==
@SRR27386505.1 A00202:1350:HH7CFDSX7:1:1101:26756:1016 length=150
GAAAGAGTCACATCTTAACGGTGAAGCTGAAGTAGAAAAACGTGTTACAGCATCAGTTGGCTCGTGGATCAAGCGACTCAATAGTTGGCTGCGAAAAGAGTTTTAATTTTTATTAGGCCGACGATGATTACGGCCTCAGGCGACAGGC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.2 A00202:1350:HH7CFDSX7:1:1101:27606:1016 length=150
GNGTTGAACCATTTTACGATACCAGTCATTTTACCGGACATAGTGTATTACCTTTAATAATTAAGTGTGCCTTTCGGCGATATGGCGTGCTTTACAGATTTTGAAGCGTTAAAGGAATGTGCACTACGAGGGGTATCAACGATAACTCTT
+
F#FFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.3 A00202:1350:HH7CFDSX7:1:1101:30481:1016 length=150
GNTTGTTCACACCAGATTAATGAACAGTTCTGGGAGAATGCGGTGACCGGAATAATACGATAGTTCATACTGCCCCTGTTTCGTTAAGCAATTACTACCAGTGCCGTGCTGGCCCGGTATCAATATGCACAAAGTTACTACGTGGATAAT

==> SRR27386505-trimmed-pair2.fastq <==
@SRR27386505.1 A00202:1350:HH7CFDSX7:1:1101:26756:1016 length=150
GCCTGTCGCCTGAGGCCGTAATCATCGTCGGCCTCATAAAAATTAAAACTCTTTTCGCAGCCAACTATTGAGTCGCTTGATCCACGAGCCAACTGATGCTGTAACAAGTTTTTCTACATCAGCTTCACCGTTAAGATGTGACTCTTTA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF,FFF:F:FF:FF,:FF:FF:FFFFF,,FFFFF,FF,,F:F,:,F:FFF:::FF,F,FF,F,FFF:F,
@SRR27386505.2 A00202:1350:HH7CFDSX7:1:1101:27606:1016 length=150
CGTACAGACCATTAAAGCAGTGTAGTAAGGCAAGTCCCTTCAAGAGTTATCGTTGATACCCCTCGTAGTGCACATTCCTTTAACGCTTCAAAATCTGTAAAGCACGCCATATCGCCGAAAGGCACACTTAATTATTAAAGGTAATACACT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFF,FFFFFFFFFFFFFFFFF:FFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFF,F,FF:FF,FFFFFFFFFFFF,
@SRR27386505.3 A00202:1350:HH7CFDSX7:1:1101:30481:1016 length=150
GATTTCCATATTGAAGGTATCGCGTTAAGCAATATTCGCAAAGCCGCGTTATCTATGCGCGCAGGTGGTGTAGGATATTATCCACGTAGTAACTTTGTGCATATTGATACCGGGCCAGCACGGCACTGGTAGTAATTGCTTAACGAAACA
+ tail SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
==> SRR27386505-trimmed-pair1.fastq <==
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.10628435 A00202:1350:HH7CFDSX7:1:2678:1579:37043 length=150
GTCGCTTAAAATACGCAGGCCCGTGATTGCCCATTTGGTGCAGCATGATCAGCATATCTTTGCCGTTATTGGCAGCGACAAAGTCATCTAAGCCAACGAGCATACCGACATCGCGGCATTCGTTATAAGGATTGGTGTTGCAGATGGCGT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF:FFFFFFFFFFFFFFFFFF:FFFFF,FFFFFFF::FFFFFFFFFFFFFFFF,
@SRR27386505.10628436 A00202:1350:HH7CFDSX7:1:2678:5014:37043 length=150
GGGAGGAATAAAAAAAACCTTACAATCACTGTAGAAATTCTTTTATACAGCTAATTGATGTGGTCTTTTACTCCTTTCTATAACCTTTTGTCAACTTTAACAAAAGTTTCTTCACATTAGTTTACATAATATCAACACCATTAGCATTTA
+
FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

==> SRR27386505-trimmed-pair2.fastq <==
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFF:F:F,F:FF:FFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.10628435 A00202:1350:HH7CFDSX7:1:2678:1579:37043 length=150
GCGCAATTTGCCGATTATAAATCCGCGACCAACAACGCCATCTGCAACACCAATCCTTATAACGAATGCCGCGATGTCGGTATGCTCGTTGGCTTAGATGACTTTGTCGCTGCCAATAACGGCAAAGATATGCTGATCATGCTGCACCAA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFF
@SRR27386505.10628436 A00202:1350:HH7CFDSX7:1:2678:5014:37043 length=150
GCATTAAATGCTAATGGTGTTGATATTATGTAAACTAATGTGAAGAAACTTTTGTTAAAGTTGACAAAAGGTTATAGAAAGGAGTAAAAGACCACATCAATTAGCTGTATAAAAGAATTTATACAGTGATTGTAAGGTTTTTTTTATTCC
+
FFFFFFFFFFFFFFFF:FFFFF:FFFFFFFFFFFFFFFFFFF:FF,FFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFF,FF:F:FFFFFFFFFFF:FFFFFFFFFF,F
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
/dee2/code/volunteer_pipeline.sh: line 140:   486 Segmentation fault      (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=$FQ1 $FQ2
++ grep 'Uniquely mapped reads number' Log.final.out
++ awk '{print $NF}'
grep: Log.final.out: No such file or directory
+ UNIQ_MAPPED_READS=
+ cat Log.final.out
+ tee -a SRR27386505.log
cat: Log.final.out: No such file or directory
+ rm Log.final.out Log.out Log.progress.out SJ.out.tab
rm: cannot remove 'Log.final.out': No such file or directory
rm: cannot remove 'Log.out': No such file or directory
rm: cannot remove 'Log.progress.out': No such file or directory
rm: cannot remove 'SJ.out.tab': No such file or directory
+ head -4 ReadsPerGene.out.tab
+ tee -a SRR27386505.log
head: cannot open 'ReadsPerGene.out.tab' for reading: No such file or directory
+ mv ReadsPerGene.out.tab SRR27386505.se.tsv
mv: cannot stat 'ReadsPerGene.out.tab': No such file or directory
+ echo SRR27386505 diagnose strandedness now
SRR27386505 diagnose strandedness now
++ cut -f2 SRR27386505.se.tsv
++ tail -n +5
++ /usr/bin/numsum
cut: SRR27386505.se.tsv: No such file or directory
+ UNSTRANDED_CNT=0
++ cut -f3 SRR27386505.se.tsv
++ tail -n +5
++ /usr/bin/numsum
cut: SRR27386505.se.tsv: No such file or directory
+ POS_STRAND_CNT=0
++ cut -f4 SRR27386505.se.tsv
++ tail -n +5
++ /usr/bin/numsum
cut: SRR27386505.se.tsv: No such file or directory
+ NEG_STRAND_CNT=0
+ echo 'UnstrandedReadsAssigned:0 PositiveStrandReadsAssigned:0 NegativeStrandReadsAssigned:0'
+ tee -a SRR27386505.log
UnstrandedReadsAssigned:0 PositiveStrandReadsAssigned:0 NegativeStrandReadsAssigned:0
+ '[' 0 -ge 0 ']'
+ STRAND=1
+ STRANDED=PositiveStrand
+ KALLISTO_STRAND_PARAMETER=--fr-stranded
+ echo 'Dataset is classified positive stranded'
+ tee -a SRR27386505.log
Dataset is classified positive stranded
+ echo KALLISTO_STRAND_PARAMETER=--fr-stranded
KALLISTO_STRAND_PARAMETER=--fr-stranded
+ CUTCOL=3
++ cut -f3 SRR27386505.se.tsv
++ head -1
cut: SRR27386505.se.tsv: No such file or directory
+ UNMAPPED_CNT=
++ cut -f3 SRR27386505.se.tsv
++ head -2
++ tail -1
cut: SRR27386505.se.tsv: No such file or directory
+ MULTIMAPPED_CNT=
++ cut -f3 SRR27386505.se.tsv
++ head -3
++ tail -1
cut: SRR27386505.se.tsv: No such file or directory
+ NOFEATURE_CNT=
++ cut -f3 SRR27386505.se.tsv
++ head -4
++ tail -1
cut: SRR27386505.se.tsv: No such file or directory
+ AMBIGUOUS_CNT=
++ cut -f3 SRR27386505.se.tsv
++ tail -n +5
++ /usr/bin/numsum
cut: SRR27386505.se.tsv: No such file or directory
+ ASSIGNED_CNT=0
++ echo 10628345
++ awk '{print $1/$2*100"%"}'
+ UNIQ_MAP_RATE=inf%
++ echo 0 10628345
++ awk '{print $1/$2*100"%"}'
+ ASSIGNED_RATE=0%
+ CUTCOL=3
+ cut -f1,3 SRR27386505.se.tsv
+ tail -n +5
cut: SRR27386505.se.tsv: No such file or directory
+ mv SRR27386505.se.tsv.tmp SRR27386505.se.tsv
+ echo SRR27386505 checking readlengths now for kmer selection
SRR27386505 checking readlengths now for kmer selection
++ sed -n 2~4p SRR27386505-trimmed-pair1.fastq
++ head -1000000
++ awk '{print length}'
++ sort -n
++ awk '{all[NR] = $0} END{print all[int(NR*0.50 - 0.5)]}'
+ MEDIAN_LENGTH=150
++ sed -n 2~4p SRR27386505-trimmed-pair1.fastq
++ head -1000000
++ awk '{print length}'
++ sort -n
++ awk '{all[NR] = $0} END{print all[int(NR*0.20 - 0.5)]}'
+ D20=150
+ KMER=146
++ echo 146
++ awk '{print ($1+1)%2}'
+ ADJUST=1
+ KMER=145
+ '[' 145 -lt 19 ']'
+ echo MeadianReadLen=150 20thPercentileLength=150 echo kmer=145
+ tee -a SRR27386505.log
MeadianReadLen=150 20thPercentileLength=150 echo kmer=145
+ '[' 145 -lt 31 ']'
+ KMER=31
+ echo SRR27386505 running kallisto now
SRR27386505 running kallisto now
+ '[' PE == SE ']'
+ '[' PE == PE ']'
+ echo SRR27386505 Starting Kallisto paired end mapping to ensembl reference transcriptome
+ tee -a SRR27386505.log
SRR27386505 Starting Kallisto paired end mapping to ensembl reference transcriptome
+ /usr/local/bin/kallisto quant --fr-stranded -t 8 -o . -i /dee2/ref/ecoli/ensembl//kallisto/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.idx SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
+ tee -a SRR27386505.log

[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 4,322
[index] number of k-mers: 3,915,872
[index] number of equivalence classes: 4,486
[quant] running in paired-end mode
[quant] will process pair 1: SRR27386505-trimmed-pair1.fastq
                             SRR27386505-trimmed-pair2.fastq
[quant] finding pseudoalignments for the reads ... done
[quant] processed 10,628,345 reads, 829,354 reads pseudoaligned
[quant] estimated average fragment length: 185.15
[   em] quantifying the abundances ... done
[   em] the Expectation-Maximization algorithm ran for 52 rounds

+ mv abundance.tsv SRR27386505.ke.tsv
+ rm abundance.h5
++ grep 'reads pseudoaligned' SRR27386505.log
++ awk '{print $(NF-2)}'
++ tr -d ,
+ PSEUDOMAPPED_CNT=829354
++ echo 829354 10628345
++ awk '{print $1/$2*100"%"}'
+ PSEUDOMAP_RATE=7.80323%
+ rm -rf run_info.json SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq _STARgenome
+ wc -l SRR27386505.ke.tsv SRR27386505.se.tsv
+ tee -a SRR27386505.log
  4323 SRR27386505.ke.tsv
     0 SRR27386505.se.tsv
  4323 total
+ head SRR27386505.ke.tsv SRR27386505.se.tsv
+ tee -a SRR27386505.log
==> SRR27386505.ke.tsv <==
target_id       length  eff_length      est_counts      tpm
AAC73112        66      10.7059 23      680.888
AAC73113        2463    2278.85 181     25.1729
AAC73114        933     748.85  55      23.2776
AAC73115        1287    1102.85 100     28.7378
AAC73116        297     123.557 86      220.598
AAC73117        777     592.85  50      26.7298
AAC73118        1431    1246.85 195     49.5668
AAC73119        954     769.85  790     325.23
AAC73120        588     404.218 140     109.77

==> SRR27386505.se.tsv <==
++ wc -l
+ SE_NR=0
++ wc -l
+ KE_NR=4323
++ cat /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf.cnt
+ SE_CNT=4497
++ cat /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.cnt
+ KE_CNT=4322
+ '[' 0 -eq 4497 -a 4323 -eq 4323 ']'
+ echo 'SRR27386505 An error occurred. Count file line numbers don'\''t match the reference.'
+ tee -a SRR27386505.log
SRR27386505 An error occurred. Count file line numbers don't match the reference.
+ exit1
+ rm '*fastq' '*.sra' SRR27386505.ke.tsv SRR27386505.se.tsv
rm: cannot remove '*fastq': No such file or directory
rm: cannot remove '*.sra': No such file or directory
+ return 1
+ return 1
+ cd /dee2/data/ecoli
+ zip -r /dee2/mnt/SRR27386505.ecoli.zip SRR27386505
  adding: SRR27386505/ (stored 0%)
  adding: SRR27386505/volunteer_pipeline.sh (deflated 77%)
  adding: SRR27386505/SRR27386505.log (deflated 72%)
  adding: SRR27386505/SRR27386505.attempts.txt (deflated 6%)
+ exit
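
Reading the end of the trace, the error message seems to come from a final line-count comparison rather than from Kallisto itself. The sketch below is a hypothetical reconstruction of that check, inferred only from the traced test `'[' 0 -eq 4497 -a 4323 -eq 4323 ']'`. The function name `check_counts` is made up for illustration, and the actual logic in `volunteer_pipeline.sh` may differ.

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction of the final sanity check, inferred from the trace.
# Usage: check_counts SE_NR SE_CNT KE_NR KE_CNT
#   SE_NR  = lines in the STAR gene count table (<accession>.se.tsv)
#   SE_CNT = expected gene count from the reference .gtf.cnt file
#   KE_NR  = lines in the Kallisto table (<accession>.ke.tsv)
#   KE_CNT = expected transcript count from the reference .fa.cnt file
check_counts() {
  local se_nr=$1 se_cnt=$2 ke_nr=$3 ke_cnt=$4
  # The Kallisto table has one header line, so KE_CNT + 1 lines are expected
  # (assumption based on the traced comparison "4323 -eq 4323").
  if [ "$se_nr" -eq "$se_cnt" ] && [ "$ke_nr" -eq $((ke_cnt + 1)) ]; then
    echo "counts match"
    return 0
  fi
  echo "count file line numbers don't match the reference"
  return 1
}

# Values from the trace: the STAR table has 0 lines because STAR segfaulted,
# while the Kallisto table has the expected 4323 lines.
check_counts 0 4497 4323 4322 || true
```

With these values the Kallisto half of the comparison passes and only the STAR half fails, which suggests the mismatch is a downstream symptom of the STAR segmentation fault.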
