Description
Hello,
I wanted to run the DEE2 pipeline as a Docker container on some Arabidopsis samples that were not in the database. However, when running the Docker container, I consistently encounter the following error:
An error occurred. Count file line numbers don't match the reference.
This error appears when running Kallisto. Before the Kallisto step begins, I also receive:
/dee2/code/volunteer_pipeline.sh: line 140: 486 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=$FQ1 $FQ2
Initially, I thought this might explain why these samples are not in the data archive. However, I encounter the same error when running the Docker container on the examples provided in the README ("SRR2637695, SRR2637696, SRR2637697, SRR2637698") and on the test example "SRR057750".
The results are the same with both the "latest" and "dev" images.
Do you have any ideas about what might be causing this problem?
If relevant, I'm running the Docker container on Windows workstations.
Many thanks!
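For context, the trace below shows every STAR test run crashing, so ReadsPerGene.out.tab is never written and the test mapping rates collapse to 0 or -1. The following is only a minimal sketch of that propagation, not the pipeline's own code: awk stands in for the numsum and numround helpers the pipeline calls, and the function and variable names are illustrative rather than taken from volunteer_pipeline.sh.

```shell
#!/usr/bin/env bash
# Sketch: how a crashed STAR run propagates through the mapping-rate test.
# Assumption: awk replaces the pipeline's numsum/numround helpers for portability.

workdir=$(mktemp -d)
counts="$workdir/ReadsPerGene.out.tab"   # never created, as after the segfault

mapped_cnt() {
  # Sum column 2 from line 3 onward; a missing file yields 0.
  cut -f2 "$counts" 2>/dev/null | tail -n +3 | awk '{s+=$1} END{print s+0}'
}

rd_cnt=25000   # read count of the test subset, as in the trace

# Unclipped rate, and clipped rate (the pipeline subtracts 1 from clipped rates).
rate0=$(echo "$(mapped_cnt) $rd_cnt" | awk '{printf "%d\n", $1/$2*100}')
rate_clip=$(echo "$(mapped_cnt) $rd_cnt" | awk '{printf "%d\n", ($1/$2*100)-1}')

# Same selection as in the trace: the highest rate wins, field 2 is the clip length.
best_clip=$(echo "$rate0:0 $rate_clip:4 $rate_clip:8 $rate_clip:12 $rate_clip:20" \
  | tr ' ' '\n' | sort -gr | head -1 | cut -d : -f2)

echo "rate0=$rate0 rate_clip=$rate_clip best_clip=$best_clip"
rmdir "$workdir"
```

With no count file present this prints rate0=0 rate_clip=-1 best_clip=0, which matches R1_MAP_RATE=0, R1_MAP_RATE_CLIP4=-1 and R1_CLIP_NUM=0 in the trace. The zero rates therefore look like a downstream symptom of the STAR segmentation fault rather than a property of the reads.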
Verbose shell trace (-v) of "docker run -it mziemann/tallyup:dev -s ecoli -a SRR27386505 -v":
docker run -it mziemann/tallyup:dev -s ecoli -a SRR27386505 -v
+ VERBOSE=TRUE
+ '[' SRR27386505 '!=' NULL ']'
+ MODE=ACCESSION
+ '[' NULL '!=' NULL ']'
+ '[' FALSE == TRUE ']'
+ DEE_DIR=/dee2
+ export -f main
+ cd /dee2
++ find /dee2/ref/
++ grep '/ensembl/star$'
++ sed 's#\/code\/\.\.##'
+ for DIR in '$(find $DEE_DIR/ref/ | grep /ensembl/star$ | sed '\''s#\/code\/\.\.##'\'' )'
+ --genomeLoad Remove --genomeDir /dee2/ref/ecoli/ensembl/star
++ awk '{print $1+$2}'
+++ free
+++ awk '$1 ~ /Mem:/ {print $2-$3}'
+++ free
+++ awk '$1 ~ /Swap:/ {print $2-$3}'
++ echo 15340872 4194304
+ MEM=19535176
++ grep -c '^processor' /proc/cpuinfo
+ NUM_CPUS=16
++ lscpu
++ grep MHz
++ awk '{print $NF}'
++ head -1
+ CPU_SPEED=3393.685
+ ACC_URL=http://dee2.io/acc.html
+ ACC_REQUEST=http://dee2.io/cgi-bin/acc.sh
+ '[' '!' -z ecoli ']'
++ echo 'athaliana celegans dmelanogaster drerio ecoli hsapiens mmusculus rnorvegicus scerevisiae osativa zmays'
++ tr ' ' '\n'
++ grep -wc ecoli
+ ORG_CHECK=1
+ '[' 1 -ne 1 ']'
++ echo 'athaliana 2853904
celegans 2652204
dmelanogaster 3403644
drerio 14616592
ecoli 1576132
hsapiens 28968508
mmusculus 26069664
rnorvegicus 26913880
scerevisiae 1644684
osativa 8000000
zmays 22000000'
++ grep -w ecoli
++ awk -v f=2 '{print $2*f}'
+ MEM_REQD=3152264
+ '[' 3152264 -gt 19535176 ']'
+ '[' -z ecoli ']'
+ export -f myfunc
+ TESTFILE=test_pass
+ '[' '!' -r test_pass ']'
+ echo
+ '[' ACCESSION == FASTQ ']'
+ '[' ACCESSION == SRA_ARCHIVE ']'
+ '[' ACCESSION == ACCESSION ']'
++ echo SRR27386505
++ tr , '\n'
++ cut -c2-3
++ grep -vc RR
+ TESTACCESSIONS=0
+ '[' 0 -eq 0 ']'
++ echo SRR27386505
++ tr , ' '
+ for USER_ACCESSION in '$(echo $MY_ACCESSIONS | tr '\'','\'' '\'' '\'')'
++ pwd
+ DIR=/dee2
+ echo Starting pipeline with species ecoli and accession SRR27386505
Starting pipeline with species ecoli and accession SRR27386505
+ main ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ echo ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ tr ' ' '\n'
++ grep VERBOSE
++ cut -d = -f2
+ VERBOSE=TRUE
+ '[' '!' -z TRUE ']'
+ '[' TRUE == TRUE ']'
+ set -x
++ echo ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ tr ' ' '\n'
++ grep THREADS
++ cut -d = -f2
+ THREADS=8
+ export -f exit1
++ echo ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ tr ' ' '\n'
++ grep ORG
++ cut -d = -f2
+ ORG=ecoli
++ echo ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ tr ' ' '\n'
++ grep -c ACCESSION
+ MODE=1
+ '[' 1 -eq 1 ']'
+ MODE=ACCESSION
++ echo ORG=ecoli ACCESSION=SRR27386505 VERBOSE=TRUE THREADS=8
++ tr ' ' '\n'
++ grep ACCESSION
++ cut -d = -f2
+ SRR=SRR27386505
+ SRR_FILE=SRR27386505.sra
+ echo SRR27386505
SRR27386505
+ wget -O SRR27386505.html https://www.ncbi.nlm.nih.gov/sra/SRR27386505
--2025-05-24 13:35:57-- https://www.ncbi.nlm.nih.gov/sra/SRR27386505
Resolving www.ncbi.nlm.nih.gov (www.ncbi.nlm.nih.gov)... 130.14.29.110, 2607:f220:41e:4290::110, 2607:f220:41e:4290::110
Connecting to www.ncbi.nlm.nih.gov (www.ncbi.nlm.nih.gov)|130.14.29.110|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: 'SRR27386505.html'
SRR27386505.html [ <=> ] 47.17K --.-KB/s in 0.09s
2025-05-24 13:35:58 (512 KB/s) - 'SRR27386505.html' saved [48307]
++ echo ecoli
++ cut -c2-
+ ORG2=coli
++ sed 's/class=/\n/g' SRR27386505.html
++ grep Organism:
++ grep -c coli
+ ORG_OK=1
+ rm SRR27386505.html
+ '[' 1 -ne 1 ']'
+ echo User input species and SRA metadata match. OK.
User input species and SRA metadata match. OK.
+ cd /dee2
+ CODE_DIR=/dee2/code
+ PIPELINE=/dee2/code/volunteer_pipeline.sh
++ md5sum /dee2/code/volunteer_pipeline.sh
++ cut -d ' ' -f1
+ PIPELINE_MD5=bf64e7889548dca6262f0da0a1d6d9ed
+ SW_DIR=/dee2/sw
+ PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/dee2/sw
+ DATA_DIR=/dee2/data/ecoli
+ REF_DIR=/dee2/ref
+ QC_DIR=/dee2/qc
+ BOWTIE2_BUILD=/usr/local/bin/bowtie2-build
+ KALLISTO=/usr/local/bin/kallisto
+ STAR=/usr/local/bin/STAR
+ PREFETCH=/usr/local/bin/prefetch
+ VDB_VALIDATE=/usr/local/bin/vdb-validate
+ FASTQ_DUMP=/usr/local/bin/fastq-dump
+ FASTQC=/usr/local/bin/fastqc
+ NUMAVERAGE=/usr/bin/numaverage
+ NUMROUND=/usr/bin/numround
+ NUMSUM=/usr/bin/numsum
+ PARALLEL_FASTQ_DUMP=/usr/local/bin/parallel-fastq-dump
+ PBZIP2=/usr/bin/pbzip2
+ SKEWER=/usr/local/bin/skewer
+ MINION=/usr/local/bin/minion
+ UNSORT=/usr/bin/unsort
+ FASTX_TRIMMER=/usr/bin/fastx_trimmer
+ FASTQPAIRER=/dee2/code/FastqPairer.pl
+ DISKLIM=32000000
+ DLLIM=1
+ ALNLIM=2
+ MEMALNLIM=4
++ df .
++ awk 'END{print$4}'
+ DISK=990270644
++ awk '{print $1+$2}'
+++ free
+++ awk '$1 ~ /Mem:/ {print $2-$3}'
+++ free
+++ awk '$1 ~ /Swap:/ {print $2-$3}'
++ echo 15335736 4194304
+ MEM=19530040
+ '[' 990270644 -lt 32000000 ']'
+ '[' '!' -d /dee2/qc ']'
+ MYREF_DIR=/dee2/ref/ecoli/ensembl/
+ '[' '!' -d /dee2/ref/ecoli/ensembl/ ']'
+ echo ecoli
ecoli
+ '[' ecoli == athaliana ']'
+ '[' ecoli == celegans ']'
+ '[' ecoli == dmelanogaster ']'
+ '[' ecoli == drerio ']'
+ '[' ecoli == ecoli ']'
+ GTFURL=ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/gtf/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf.gz
+ GDNAURL=ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/fasta/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/dna/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.dna_sm.chromosome.Chromosome.fa.gz
+ CDNAURL=ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/fasta/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/cdna/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.gz
+ BT2_MD5=9fd53f70df3ba54b851713a514ef3412
+ KAL_MD5=dbc74ab4fa8d55d5f3e88476dc5cc32e
+ STAR_MD5=49dfb0bef4e1c0e34503dc995b9456e5
++ basename ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/gtf/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf.gz .gz
+ GTF=/dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf
+ '[' -z /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf ']'
++ basename ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/fasta/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/dna/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.dna_sm.chromosome.Chromosome.fa.gz .gz
+ GDNA=/dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.dna_sm.chromosome.Chromosome.fa
+ '[' -z /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.dna_sm.chromosome.Chromosome.fa ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.dna_sm.chromosome.Chromosome.fa ']'
++ basename ftp://ftp.ensemblgenomes.org/pub/bacteria/release-36/fasta/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655/cdna/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.gz .gz
+ CDNA=/dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa
+ '[' -z /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa ']'
+ BT2_DIR=/dee2/ref/ecoli/ensembl//bowtie2
+ '[' '!' -d /dee2/ref/ecoli/ensembl//bowtie2 ']'
++ basename /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa
+ BT2_REF=/dee2/ref/ecoli/ensembl//bowtie2/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa
+ '[' -z /dee2/ref/ecoli/ensembl//bowtie2/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//bowtie2/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa ']'
+ KAL_DIR=/dee2/ref/ecoli/ensembl//kallisto
+ '[' '!' -d /dee2/ref/ecoli/ensembl//kallisto ']'
++ basename /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa
+ KAL_REF=/dee2/ref/ecoli/ensembl//kallisto/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.idx
+ '[' -z /dee2/ref/ecoli/ensembl//kallisto/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.idx ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//kallisto/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.idx ']'
+ STAR_DIR=/dee2/ref/ecoli/ensembl//star
+ '[' '!' -d /dee2/ref/ecoli/ensembl//star ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//star/SA ']'
+ '[' '!' -r /dee2/ref/ecoli/ensembl//star/SAindex ']'
+ '[' '!' -d /dee2/data/ecoli ']'
+ cd /dee2/data/ecoli
+ '[' ACCESSION '!=' FASTQ ']'
+ mkdir SRR27386505
+ cp /dee2/code/volunteer_pipeline.sh SRR27386505
+ cd SRR27386505
+ echo 'Starting /dee2/code/volunteer_pipeline.sh SRR27386505
current disk space = 990270644
free memory = 19530040 '
+ tee -a SRR27386505.log
Starting /dee2/code/volunteer_pipeline.sh SRR27386505
current disk space = 990270644
free memory = 19530040
+ ATTEMPTS=SRR27386505.attempts.txt
+ '[' -r SRR27386505.attempts.txt ']'
++ date +%Y-%m-%d:%H:%M:%S
+ DATE=2025-05-24:13:35:58
+ echo /dee2/code/volunteer_pipeline.sh bf64e7889548dca6262f0da0a1d6d9ed 2025-05-24:13:35:58
+ echo SRR27386505 check if SRA file exists and download if neccessary
SRR27386505 check if SRA file exists and download if neccessary
+ '[' ACCESSION == SRA_ARCHIVE ']'
+ '[' '!' -f SRR27386505.sra ']'
+ /usr/local/bin/prefetch -X 9999999999999 SRR27386505
2025-05-24T13:35:59 prefetch.3.0.1: Current preference is set to retrieve SRA Normalized Format files with full base quality scores.
2025-05-24T13:36:00 prefetch.3.0.1: 1) Downloading 'SRR27386505'...
2025-05-24T13:36:00 prefetch.3.0.1: SRA Normalized Format file is being retrieved, if this is different from your preference, it may be due to current file availability.
2025-05-24T13:36:00 prefetch.3.0.1: Downloading via HTTPS...
2025-05-24T13:36:31 prefetch.3.0.1: HTTPS download succeed
2025-05-24T13:36:32 prefetch.3.0.1: 'SRR27386505' is valid
2025-05-24T13:36:32 prefetch.3.0.1: 1) 'SRR27386505' was downloaded successfully
2025-05-24T13:36:32 prefetch.3.0.1: 'SRR27386505' has 0 unresolved dependencies
+ mv /dee2/ncbi/public/sra/SRR27386505.sra .
+ echo SRR27386505 Validate the SRA file
SRR27386505 Validate the SRA file
+ echo SRR27386505 SRAfilesize
+ tee -a SRR27386505.log
SRR27386505 SRAfilesize
+ md5sum SRR27386505.sra
+ tee -a SRR27386505.log
57aa8ef6910f25c78d9d5c503ba89fe1 SRR27386505.sra
++ /usr/local/bin/vdb-validate SRR27386505.sra
++ head -4
++ awk '{print $NF}'
++ grep -c ok
+ VALIDATE_SRA=4
+ '[' 4 -eq 4 ']'
+ echo SRR27386505.sra file validated
+ tee -a SRR27386505.log
SRR27386505.sra file validated
+ echo SRR27386505 diagnose basespace colorspace, single/paired-end and read length
SRR27386505 diagnose basespace colorspace, single/paired-end and read length
+ /usr/local/bin/fastq-dump -X 4000 --split-files SRR27386505.sra
Read 4000 spots for SRR27386505.sra
Written 4000 spots for SRR27386505.sra
++ ls
++ grep SRR27386505
++ grep -v trimmed.fastq
++ grep -c 'fastq$'
+ NUM_FQ=2
+ '[' 2 -eq 1 ']'
+ '[' 2 -eq 2 ']'
+ ORIG_RDS=PE
+ RDS=PE
+ echo SRR27386505 is paired end
+ tee -a SRR27386505.log
SRR27386505 is paired end
++ ls
++ grep SRR27386505
++ grep -m1 'fastq$'
+ FQ1=SRR27386505_1.fastq
+ echo
+ echo Starting FastQC analysis of SRR27386505_1.fastq
Starting FastQC analysis of SRR27386505_1.fastq
+ /usr/local/bin/fastqc -t 8 SRR27386505_1.fastq
Started analysis of SRR27386505_1.fastq
Approx 25% complete for SRR27386505_1.fastq
Approx 50% complete for SRR27386505_1.fastq
Approx 75% complete for SRR27386505_1.fastq
Approx 100% complete for SRR27386505_1.fastq
Analysis complete for SRR27386505_1.fastq
++ basename SRR27386505_1.fastq .fastq
+ FQ1BASE=SRR27386505_1
++ unzip -p SRR27386505_1_fastqc SRR27386505_1_fastqc/fastqc_data.txt
++ grep 'File type'
++ cut -f2
++ awk '{print $1}'
+ BASECALL_ENCODING=Conventional
+ '[' Conventional == Colorspace ']'
+ '[' Conventional == Conventional ']'
+ CSPACE=FALSE
+ echo SRR27386505 is conventional basespace
+ tee -a SRR27386505.log
SRR27386505 is conventional basespace
++ unzip -p SRR27386505_1_fastqc SRR27386505_1_fastqc/fastqc_data.txt
++ grep -wm1 '^Encoding'
++ cut -f2
++ tr -d ' '
+ QUALITY_ENCODING=Sanger/Illumina1.9
++ unzip -p SRR27386505_1_fastqc.zip SRR27386505_1_fastqc/fastqc_data.txt
++ grep 'Sequence length'
++ cut -f2
+ FQ1_LEN=150
+ echo SRR27386505 read1 length is 150 nt
+ tee -a SRR27386505.log
SRR27386505 read1 length is 150 nt
+ unzip -p SRR27386505_1_fastqc.zip SRR27386505_1_fastqc/fastqc_data.txt
+ rm SRR27386505_1_fastqc.zip SRR27386505_1_fastqc.html
++ sed -n 2~4p SRR27386505_1.fastq
++ awk '{print length($1)}'
++ sort -g
++ head -1
+ FQ1_MIN_LEN=150
++ sed -n 2~4p SRR27386505_1.fastq
++ awk '{print length($1)}'
++ /usr/bin/numaverage -M
+ FQ1_MEDIAN_LEN=150
++ sed -n 2~4p SRR27386505_1.fastq
++ awk '{print length($1)}'
++ sort -gr
++ head -1
+ FQ1_MAX_LEN=150
+ FQ2_MIN_LEN=NULL
+ FQ2_MEDIAN_LEN=NULL
+ FQ2_MAX_LEN=NULL
+ '[' PE == PE ']'
++ ls
++ grep SRR27386505
++ grep 'fastq$'
++ sed -n 2p
+ FQ2=SRR27386505_2.fastq
+ echo
+ echo Starting FastQC analysis of SRR27386505_2.fastq
Starting FastQC analysis of SRR27386505_2.fastq
+ /usr/local/bin/fastqc -t 8 SRR27386505_2.fastq
Started analysis of SRR27386505_2.fastq
Approx 25% complete for SRR27386505_2.fastq
Approx 50% complete for SRR27386505_2.fastq
Approx 75% complete for SRR27386505_2.fastq
Approx 100% complete for SRR27386505_2.fastq
Analysis complete for SRR27386505_2.fastq
++ basename SRR27386505_2.fastq .fastq
+ FQ2BASE=SRR27386505_2
++ unzip -p SRR27386505_2_fastqc.zip SRR27386505_2_fastqc/fastqc_data.txt
++ grep 'Sequence length'
++ cut -f2
+ FQ2_LEN=150
+ echo SRR27386505 read2 length is 150 nt
+ tee -a SRR27386505.log
SRR27386505 read2 length is 150 nt
+ unzip -p SRR27386505_2_fastqc.zip SRR27386505_2_fastqc/fastqc_data.txt
+ rm SRR27386505_2_fastqc.zip SRR27386505_2_fastqc.html
++ sed -n 2~4p SRR27386505_2.fastq
++ awk '{print length($1)}'
++ sort -g
++ head -1
+ FQ2_MIN_LEN=150
++ sed -n 2~4p SRR27386505_2.fastq
++ awk '{print length($1)}'
++ /usr/bin/numaverage -M
+ FQ2_MEDIAN_LEN=150
++ sed -n 2~4p SRR27386505_2.fastq
++ awk '{print length($1)}'
++ sort -gr
++ head -1
+ FQ2_MAX_LEN=150
+ [[ 150 -lt 20 ]]
+ '[' FALSE == TRUE ']'
+ echo SRR27386505 Dump the fastq file
SRR27386505 Dump the fastq file
+ rm SRR27386505_1.fastq SRR27386505_2.fastq
+ '[' FALSE == FALSE ']'
+ /usr/local/bin/parallel-fastq-dump --threads 8 --outdir . --split-files --defline-qual + -s SRR27386505.sra
+ '[' PE == PE ']'
+ [[ 150 -ge 20 ]]
+ [[ 150 -lt 20 ]]
+ [[ 150 -lt 20 ]]
++ du -s SRR27386505_1.fastq
++ cut -f1
+ FILESIZE=3907012
+ FILESIZE=3907012
+ echo SRR27386505 file size 3907012
+ tee -a SRR27386505.log
SRR27386505 file size 3907012
+ rm SRR27386505.sra
+ '[' 3907012 -eq 0 ']'
+ echo SRR27386505 completed basic pipeline successfully
+ tee -a SRR27386505.log
SRR27386505 completed basic pipeline successfully
+ echo SRR27386505 Quality trimming
SRR27386505 Quality trimming
+ '[' PE == SE ']'
+ '[' PE == PE ']'
+ /usr/local/bin/skewer -f sanger -l 18 -q 10 -k inf -t 8 -o SRR27386505 SRR27386505_1.fastq SRR27386505_2.fastq
.--. .-.
: .--': :.-.
`. `. : `'.' .--. .-..-..-. .--. .--.
_`, :: . `.' '_.': `; `; :' '_.': ..'
`.__.':_;:_;`.__.'`.__.__.'`.__.':_;
skewer v0.2.2 [April 4, 2016]
Parameters used:
-- 3' end adapter sequence (-x): AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
-- paired 3' end adapter sequence (-y): AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
-- maximum error ratio allowed (-r): 0.100
-- maximum indel error ratio allowed (-d): 0.030
-- end quality threshold (-q): 10
-- minimum read length allowed after trimming (-l): 18
-- file format (-f): Sanger/Illumina 1.8+ FASTQ
-- number of concurrent threads (-t): 8
Sat May 24 13:38:32 2025 >> started
|=================================================>| (100.00%)
Sat May 24 13:39:15 2025 >> done (43.802s)
10628436 read pairs processed; of these:
68 ( 0.00%) short read pairs filtered out after trimming by size control
23 ( 0.00%) empty read pairs filtered out after trimming by size control
10628345 (100.00%) read pairs available; of these:
1762375 (16.58%) trimmed read pairs available after processing
8865970 (83.42%) untrimmed read pairs available after processing
log has been saved to "SRR27386505-trimmed.log".
+ rm SRR27386505_1.fastq SRR27386505_2.fastq
+ FQ1=SRR27386505-trimmed-pair1.fastq
+ FQ2=SRR27386505-trimmed-pair2.fastq
+ '[' '!' -f SRR27386505-trimmed.log ']'
++ grep 'processed; of these:' SRR27386505-trimmed.log
++ awk '{print $1}'
+ READ_CNT_TOTAL=10628436
++ grep 'available; of these:' SRR27386505-trimmed.log
++ awk '{print $1}'
+ READ_CNT_AVAIL=10628345
+ '[' -z 10628345 ']'
+ cat SRR27386505-trimmed.log
+ rm SRR27386505-trimmed.log
+ '[' 10628345 -eq 0 ']'
+ echo 10628345 reads passed initial QC
+ tee -a SRR27386505.log
10628345 reads passed initial QC
+ echo SRR27386505 adapter diagnosis
SRR27386505 adapter diagnosis
+ ADAPTER_THRESHOLD=2
+ '[' PE == SE ']'
+ '[' PE == PE ']'
+ MINION_LOG=SRR27386505-trimmed-pair1.fastq.minion.log
+ /usr/local/bin/minion search-adapter -i SRR27386505-trimmed-pair1.fastq
[minion] reading reads
................................................. 1
................................................. 2
[minion] connected component analysis
[minion] building consensus sequences
++ head SRR27386505-trimmed-pair1.fastq.minion.log
++ grep -m1 sequence=
++ cut -d = -f2
+ ADAPTER1=AACACGTCTTTAGAGCCATCGTCAGGAGTGATGAAGCCGAAGCCTTTGTCAGCGTTGAACCATTTTACGATACCAGTCATTTTACCGGACATAGTGTATTACCTTTAATAATTAAGTGTGCCTTTCGGCGATATGGCGTGCTTTACAGATTTTGAAGCGTTAAAGGAATGTGCACTACGAGGGGTATCAACGATAACTCTTGAAGGGACTTGCCTTACTACACTGCTTTAATGGTCTGTACG
++ head SRR27386505-trimmed-pair1.fastq.minion.log
++ grep -m1 sequence-density=
++ cut -d = -f2
++ /usr/bin/numround -c
+ DENSITY1=1
+ cat SRR27386505-trimmed-pair1.fastq.minion.log
+ tee -a SRR27386505.log
criterion=sequence-density
sequence-density=0.54
sequence-density-rank=1
fanout-score=1.94
fanout-score-rank=44
prefix-density=1.04
prefix-fanout=1.0
sequence=AACACGTCTTTAGAGCCATCGTCAGGAGTGATGAAGCCGAAGCCTTTGTCAGCGTTGAACCATTTTACGATACCAGTCATTTTACCGGACATAGTGTATTACCTTTAATAATTAAGTGTGCCTTTCGGCGATATGGCGTGCTTTACAGATTTTGAAGCGTTAAAGGAATGTGCACTACGAGGGGTATCAACGATAACTCTTGAAGGGACTTGCCTTACTACACTGCTTTAATGGTCTGTACG
criterion=fanout-score
sequence-density=0.08
sequence-density-rank=31
fanout-score=145.06
fanout-score-rank=1
prefix-density=0.90
prefix-fanout=12.2
sequence=TTCAGCCAGAATGGTTGCCGCACGACGAATCGCCTCTTCAGGATCGATTGTGCCGTTGGTTTCCATTTCGATGACCAGCTTGTCCAGGTCGGTACGCTGTTCTACACGCGCTGCTTCAACATTGTAGGCAATACGCTCCACAGGGCTGTAGCATGCGTCGACCAGCAGACGGCCGATTGGGCGCTCATCTTCTTCCGAATGAATTCGGGTAGAAGCCGGCACATAACCACGACCGCGCTGAACTTTGATACGCATGCTAATAGACGCGTTCTCATCGGTCAGGTGGCAGATCACGTGCTGCGGCTTGACGATTTCGACATCACCGTCGTGGGTGATATCGGCTGCAGTCACAGGGCCAATGCCAGATTTATTCAAGGTAAGAATAACTTCATCTTTGCCCTGAACTCTCACCGCCAGCCCTTTCAGGTTGAGCAGGATTTCCAGGATATCTT
+ rm SRR27386505-trimmed-pair1.fastq.minion.log
+ MINION_LOG=SRR27386505-trimmed-pair2.fastq.minion.log
+ /usr/local/bin/minion search-adapter -i SRR27386505-trimmed-pair2.fastq
[minion] reading reads
................................................. 1
................................................. 2
[minion] connected component analysis
[minion] building consensus sequences
++ head SRR27386505-trimmed-pair2.fastq.minion.log
++ grep -m1 sequence=
++ cut -d = -f2
+ ADAPTER2=AGGTTCGAATCCT
++ head SRR27386505-trimmed-pair2.fastq.minion.log
++ grep -m1 sequence-density=
++ cut -d = -f2
++ /usr/bin/numround -c
+ DENSITY2=1
+ cat SRR27386505-trimmed-pair2.fastq.minion.log
+ tee -a SRR27386505.log
criterion=sequence-density
sequence-density=0.38
sequence-density-rank=1
fanout-score=3.64
fanout-score-rank=29
prefix-density=0.40
prefix-fanout=3.5
sequence=AGGTTCGAATCCT
criterion=fanout-score
sequence-density=0.01
sequence-density-rank=34
fanout-score=83.69
fanout-score-rank=1
prefix-density=0.13
prefix-fanout=6.1
sequence=TGGTGGTTGAATACCCGGCGTAATGTTAACCGTCTTGCGATAACAGGTCGCTACGAGTAGAATACTGCCGCTTAACGTCGCGTAAATTGTTTAACACTTTGCGTAACGTACACTGGGATCGCTGAATTAGAGATCGGCGTCCTTTCATTCTATATACTTTGGAGTTTTAAAATGTCTCTAAGTACTGAAGCAACAGCTAAAATCGTTTCTGAGTTTGGTCGTGACGCAAACGACACCGGTTCTACCGAAGTTCAGGTAGCACTGCTGACTGCACAGATC
+ rm SRR27386505-trimmed-pair2.fastq.minion.log
++ echo 1 1
++ awk '{print ($1+$2)/2}'
++ /usr/bin/numround
+ DENSITY=1
+ [[ ! -z 1 ]]
+ '[' 1 -gt 2 ']'
++ echo 10628345 10628436
++ awk '{print $1/$2*100"%"}'
+ QC_PASS_RATE=99.9991%
++ du -s SRR27386505-trimmed-pair1.fastq
++ cut -f1
+ FQSIZE=3836504
+ '[' 3836504 -eq 0 ']'
+ echo SRR27386505 Starting mapping phase
SRR27386505 Starting mapping phase
+ '[' PE == PE ']'
+ echo SRR27386505 testing PE reads STAR mapping to Ensembl genome
+ tee -a SRR27386505.log
SRR27386505 testing PE reads STAR mapping to Ensembl genome
+ head SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
==> SRR27386505-trimmed-pair1.fastq <==
@SRR27386505.1 A00202:1350:HH7CFDSX7:1:1101:26756:1016 length=150
GAAAGAGTCACATCTTAACGGTGAAGCTGAAGTAGAAAAACGTGTTACAGCATCAGTTGGCTCGTGGATCAAGCGACTCAATAGTTGGCTGCGAAAAGAGTTTTAATTTTTATTAGGCCGACGATGATTACGGCCTCAGGCGACAGGC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.2 A00202:1350:HH7CFDSX7:1:1101:27606:1016 length=150
GNGTTGAACCATTTTACGATACCAGTCATTTTACCGGACATAGTGTATTACCTTTAATAATTAAGTGTGCCTTTCGGCGATATGGCGTGCTTTACAGATTTTGAAGCGTTAAAGGAATGTGCACTACGAGGGGTATCAACGATAACTCTT
+
F#FFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.3 A00202:1350:HH7CFDSX7:1:1101:30481:1016 length=150
GNTTGTTCACACCAGATTAATGAACAGTTCTGGGAGAATGCGGTGACCGGAATAATACGATAGTTCATACTGCCCCTGTTTCGTTAAGCAATTACTACCAGTGCCGTGCTGGCCCGGTATCAATATGCACAAAGTTACTACGTGGATAAT
==> SRR27386505-trimmed-pair2.fastq <==
@SRR27386505.1 A00202:1350:HH7CFDSX7:1:1101:26756:1016 length=150
GCCTGTCGCCTGAGGCCGTAATCATCGTCGGCCTCATAAAAATTAAAACTCTTTTCGCAGCCAACTATTGAGTCGCTTGATCCACGAGCCAACTGATGCTGTAACAAGTTTTTCTACATCAGCTTCACCGTTAAGATGTGACTCTTTA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF,FFF:F:FF:FF,:FF:FF:FFFFF,,FFFFF,FF,,F:F,:,F:FFF:::FF,F,FF,F,FFF:F,
@SRR27386505.2 A00202:1350:HH7CFDSX7:1:1101:27606:1016 length=150
CGTACAGACCATTAAAGCAGTGTAGTAAGGCAAGTCCCTTCAAGAGTTATCGTTGATACCCCTCGTAGTGCACATTCCTTTAACGCTTCAAAATCTGTAAAGCACGCCATATCGCCGAAAGGCACACTTAATTATTAAAGGTAATACACT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFF,FFFFFFFFFFFFFFFFF:FFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFF,F,FF:FF,FFFFFFFFFFFF,
@SRR27386505.3 A00202:1350:HH7CFDSX7:1:1101:30481:1016 length=150
GATTTCCATATTGAAGGTATCGCGTTAAGCAATATTCGCAAAGCCGCGTTATCTATGCGCGCAGGTGGTGTAGGATATTATCCACGTAGTAACTTTGTGCATATTGATACCGGGCCAGCACGGCACTGGTAGTAATTGCTTAACGAAACA
+ tail SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
==> SRR27386505-trimmed-pair1.fastq <==
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.10628435 A00202:1350:HH7CFDSX7:1:2678:1579:37043 length=150
GTCGCTTAAAATACGCAGGCCCGTGATTGCCCATTTGGTGCAGCATGATCAGCATATCTTTGCCGTTATTGGCAGCGACAAAGTCATCTAAGCCAACGAGCATACCGACATCGCGGCATTCGTTATAAGGATTGGTGTTGCAGATGGCGT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF:FFFFFFFFFFFFFFFFFF:FFFFF,FFFFFFF::FFFFFFFFFFFFFFFF,
@SRR27386505.10628436 A00202:1350:HH7CFDSX7:1:2678:5014:37043 length=150
GGGAGGAATAAAAAAAACCTTACAATCACTGTAGAAATTCTTTTATACAGCTAATTGATGTGGTCTTTTACTCCTTTCTATAACCTTTTGTCAACTTTAACAAAAGTTTCTTCACATTAGTTTACATAATATCAACACCATTAGCATTTA
+
FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
==> SRR27386505-trimmed-pair2.fastq <==
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFF:F:F,F:FF:FFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.10628435 A00202:1350:HH7CFDSX7:1:2678:1579:37043 length=150
GCGCAATTTGCCGATTATAAATCCGCGACCAACAACGCCATCTGCAACACCAATCCTTATAACGAATGCCGCGATGTCGGTATGCTCGTTGGCTTAGATGACTTTGTCGCTGCCAATAACGGCAAAGATATGCTGATCATGCTGCACCAA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFF
@SRR27386505.10628436 A00202:1350:HH7CFDSX7:1:2678:5014:37043 length=150
GCATTAAATGCTAATGGTGTTGATATTATGTAAACTAATGTGAAGAAACTTTTGTTAAAGTTGACAAAAGGTTATAGAAAGGAGTAAAAGACCACATCAATTAGCTGTATAAAAGAATTTATACAGTGATTGTAAGGTTTTTTTTATTCC
+
FFFFFFFFFFFFFFFF:FFFFF:FFFFFFFFFFFFFFFFFFF:FF,FFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFF,FF:F:FFFFFFFFFFF:FFFFFFFFFF,F
+ head -10000 SRR27386505-trimmed-pair1.fastq
+ head -1000000 SRR27386505-trimmed-pair1.fastq
+ tail -90000
+ head -10000 SRR27386505-trimmed-pair2.fastq
+ head -1000000 SRR27386505-trimmed-pair2.fastq
+ tail -90000
+ /usr/bin/fastx_trimmer -f 5 -m 18 -Q 33 -i test_R1.fq
+ /usr/bin/fastx_trimmer -f 5 -m 18 -Q 33 -i test_R2.fq
+ /usr/bin/fastx_trimmer -f 9 -m 18 -Q 33 -i test_R1.fq
+ /usr/bin/fastx_trimmer -f 9 -m 18 -Q 33 -i test_R2.fq
+ /usr/bin/fastx_trimmer -f 13 -m 18 -Q 33 -i test_R1.fq
+ /usr/bin/fastx_trimmer -f 13 -m 18 -Q 33 -i test_R2.fq
+ /usr/bin/fastx_trimmer -f 21 -m 18 -Q 33 -i test_R1.fq
+ wait
+ /usr/bin/fastx_trimmer -f 21 -m 18 -Q 33 -i test_R2.fq
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R1.fq
/dee2/code/volunteer_pipeline.sh: line 140: 357 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R1.fq > /dev/null 2>&1
++ sed -n 2~4p
++ wc -l
+ R1_RD_CNT=25000
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ cut -f2 ReadsPerGene.out.tab
++ head -1
cut: ReadsPerGene.out.tab: No such file or directory
+ UNMAPPED_CNT=
++ echo 0 25000
++ awk '{print $1/$2*100}'
++ /usr/bin/numround
+ R1_MAP_RATE=0
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R2.fq
/dee2/code/volunteer_pipeline.sh: line 140: 372 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R2.fq > /dev/null 2>&1
++ sed -n 2~4p
++ wc -l
+ R2_RD_CNT=25000
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ cut -f2 ReadsPerGene.out.tab
++ head -1
cut: ReadsPerGene.out.tab: No such file or directory
+ UNMAPPED_CNT=
++ echo 0 25000
++ awk '{print $1/$2*100}'
++ /usr/bin/numround
+ R2_MAP_RATE=0
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R1_clip4.fq
/dee2/code/volunteer_pipeline.sh: line 140: 387 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R1_clip4.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R1_MAP_RATE_CLIP4=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R2_clip4.fq
/dee2/code/volunteer_pipeline.sh: line 140: 396 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R2_clip4.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R2_MAP_RATE_CLIP4=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R1_clip8.fq
/dee2/code/volunteer_pipeline.sh: line 140: 405 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R1_clip8.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R1_MAP_RATE_CLIP8=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R2_clip8.fq
/dee2/code/volunteer_pipeline.sh: line 140: 414 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R2_clip8.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R2_MAP_RATE_CLIP8=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R1_clip12.fq
/dee2/code/volunteer_pipeline.sh: line 140: 423 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R1_clip12.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R1_MAP_RATE_CLIP12=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R2_clip12.fq
/dee2/code/volunteer_pipeline.sh: line 140: 432 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R2_clip12.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R2_MAP_RATE_CLIP12=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R1_clip20.fq
/dee2/code/volunteer_pipeline.sh: line 140: 441 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R1_clip20.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R1_MAP_RATE_CLIP20=-1
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=test_R2_clip20.fq
/dee2/code/volunteer_pipeline.sh: line 140: 450 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=test_R2_clip20.fq > /dev/null 2>&1
++ cut -f2 ReadsPerGene.out.tab
++ tail -n +3
++ /usr/bin/numsum
cut: ReadsPerGene.out.tab: No such file or directory
+ MAPPED_CNT=0
++ echo 0 25000
++ awk '{print ($1/$2*100)-1}'
++ /usr/bin/numround
+ R2_MAP_RATE_CLIP20=-1
+ rm test_R1.fq test_R2.fq test_R1_clip4.fq test_R2_clip4.fq test_R1_clip8.fq test_R2_clip8.fq test_R1_clip12.fq test_R2_clip12.fq test_R1_clip20.fq test_R2_clip20.fq ReadsPerGene.out.tab
rm: cannot remove 'ReadsPerGene.out.tab': No such file or directory
++ echo 0:0 -1:4 -1:8 -1:12 -1:20
++ tr ' ' '\n'
++ sort -gr
++ head -1
++ cut -d : -f2
+ R1_CLIP_NUM=0
++ echo 0:0 -1:4 -1:8 -1:12 -1:20
++ tr ' ' '\n'
++ sort -gr
++ head -1
++ cut -d : -f1
+ R1_MAP_RATE=0
++ echo 0:0 -1:4 -1:8 -1:12 -1:20
++ tr ' ' '\n'
++ sort -gr
++ head -1
++ cut -d : -f2
+ R2_CLIP_NUM=0
++ echo 0:0 -1:4 -1:8 -1:12 -1:20
++ tr ' ' '\n'
++ sort -gr
++ head -1
++ cut -d : -f1
+ R2_MAP_RATE=0
+ [[ 0 -gt 0 ]]
+ [[ 0 -gt 0 ]]
+ R1R2_DIFF=0
+ '[' 0 -lt 40 -a 0 -ge 20 ']'
+ R2R1_DIFF=0
+ '[' 0 -lt 40 -a 0 -ge 20 ']'
+ [[ 0 -gt 15 ]]
+ [[ 0 -gt 15 ]]
+ '[' PE == SE ']'
+ '[' PE == PE ']'
+ head SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
==> SRR27386505-trimmed-pair1.fastq <==
@SRR27386505.1 A00202:1350:HH7CFDSX7:1:1101:26756:1016 length=150
GAAAGAGTCACATCTTAACGGTGAAGCTGAAGTAGAAAAACGTGTTACAGCATCAGTTGGCTCGTGGATCAAGCGACTCAATAGTTGGCTGCGAAAAGAGTTTTAATTTTTATTAGGCCGACGATGATTACGGCCTCAGGCGACAGGC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.2 A00202:1350:HH7CFDSX7:1:1101:27606:1016 length=150
GNGTTGAACCATTTTACGATACCAGTCATTTTACCGGACATAGTGTATTACCTTTAATAATTAAGTGTGCCTTTCGGCGATATGGCGTGCTTTACAGATTTTGAAGCGTTAAAGGAATGTGCACTACGAGGGGTATCAACGATAACTCTT
+
F#FFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.3 A00202:1350:HH7CFDSX7:1:1101:30481:1016 length=150
GNTTGTTCACACCAGATTAATGAACAGTTCTGGGAGAATGCGGTGACCGGAATAATACGATAGTTCATACTGCCCCTGTTTCGTTAAGCAATTACTACCAGTGCCGTGCTGGCCCGGTATCAATATGCACAAAGTTACTACGTGGATAAT
==> SRR27386505-trimmed-pair2.fastq <==
@SRR27386505.1 A00202:1350:HH7CFDSX7:1:1101:26756:1016 length=150
GCCTGTCGCCTGAGGCCGTAATCATCGTCGGCCTCATAAAAATTAAAACTCTTTTCGCAGCCAACTATTGAGTCGCTTGATCCACGAGCCAACTGATGCTGTAACAAGTTTTTCTACATCAGCTTCACCGTTAAGATGTGACTCTTTA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF,FFF:F:FF:FF,:FF:FF:FFFFF,,FFFFF,FF,,F:F,:,F:FFF:::FF,F,FF,F,FFF:F,
@SRR27386505.2 A00202:1350:HH7CFDSX7:1:1101:27606:1016 length=150
CGTACAGACCATTAAAGCAGTGTAGTAAGGCAAGTCCCTTCAAGAGTTATCGTTGATACCCCTCGTAGTGCACATTCCTTTAACGCTTCAAAATCTGTAAAGCACGCCATATCGCCGAAAGGCACACTTAATTATTAAAGGTAATACACT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFF,FFFFFFFFFFFFFFFFF:FFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFF,F,FF:FF,FFFFFFFFFFFF,
@SRR27386505.3 A00202:1350:HH7CFDSX7:1:1101:30481:1016 length=150
GATTTCCATATTGAAGGTATCGCGTTAAGCAATATTCGCAAAGCCGCGTTATCTATGCGCGCAGGTGGTGTAGGATATTATCCACGTAGTAACTTTGTGCATATTGATACCGGGCCAGCACGGCACTGGTAGTAATTGCTTAACGAAACA
+ tail SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
==> SRR27386505-trimmed-pair1.fastq <==
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.10628435 A00202:1350:HH7CFDSX7:1:2678:1579:37043 length=150
GTCGCTTAAAATACGCAGGCCCGTGATTGCCCATTTGGTGCAGCATGATCAGCATATCTTTGCCGTTATTGGCAGCGACAAAGTCATCTAAGCCAACGAGCATACCGACATCGCGGCATTCGTTATAAGGATTGGTGTTGCAGATGGCGT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF:FFFFFFFFFFFFFFFFFF:FFFFF,FFFFFFF::FFFFFFFFFFFFFFFF,
@SRR27386505.10628436 A00202:1350:HH7CFDSX7:1:2678:5014:37043 length=150
GGGAGGAATAAAAAAAACCTTACAATCACTGTAGAAATTCTTTTATACAGCTAATTGATGTGGTCTTTTACTCCTTTCTATAACCTTTTGTCAACTTTAACAAAAGTTTCTTCACATTAGTTTACATAATATCAACACCATTAGCATTTA
+
FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
==> SRR27386505-trimmed-pair2.fastq <==
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFF:F:F,F:FF:FFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR27386505.10628435 A00202:1350:HH7CFDSX7:1:2678:1579:37043 length=150
GCGCAATTTGCCGATTATAAATCCGCGACCAACAACGCCATCTGCAACACCAATCCTTATAACGAATGCCGCGATGTCGGTATGCTCGTTGGCTTAGATGACTTTGTCGCTGCCAATAACGGCAAAGATATGCTGATCATGCTGCACCAA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFF
@SRR27386505.10628436 A00202:1350:HH7CFDSX7:1:2678:5014:37043 length=150
GCATTAAATGCTAATGGTGTTGATATTATGTAAACTAATGTGAAGAAACTTTTGTTAAAGTTGACAAAAGGTTATAGAAAGGAGTAAAAGACCACATCAATTAGCTGTATAAAAGAATTTATACAGTGATTGTAAGGTTTTTTTTATTCC
+
FFFFFFFFFFFFFFFF:FFFFF:FFFFFFFFFFFFFFFFFFF:FF,FFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFF,FF:F:FFFFFFFFFFF:FFFFFFFFFF,F
+ /usr/local/bin/STAR --runThreadN 8 --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir /dee2/ref/ecoli/ensembl//star --readFilesIn=SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
/dee2/code/volunteer_pipeline.sh: line 140: 486 Segmentation fault (core dumped) $STAR --runThreadN $THREADS --quantMode GeneCounts --genomeLoad LoadAndKeep --outSAMtype None --genomeDir $STAR_DIR --readFilesIn=$FQ1 $FQ2
++ grep 'Uniquely mapped reads number' Log.final.out
++ awk '{print $NF}'
grep: Log.final.out: No such file or directory
+ UNIQ_MAPPED_READS=
+ cat Log.final.out
+ tee -a SRR27386505.log
cat: Log.final.out: No such file or directory
+ rm Log.final.out Log.out Log.progress.out SJ.out.tab
rm: cannot remove 'Log.final.out': No such file or directory
rm: cannot remove 'Log.out': No such file or directory
rm: cannot remove 'Log.progress.out': No such file or directory
rm: cannot remove 'SJ.out.tab': No such file or directory
+ head -4 ReadsPerGene.out.tab
+ tee -a SRR27386505.log
head: cannot open 'ReadsPerGene.out.tab' for reading: No such file or directory
+ mv ReadsPerGene.out.tab SRR27386505.se.tsv
mv: cannot stat 'ReadsPerGene.out.tab': No such file or directory
+ echo SRR27386505 diagnose strandedness now
SRR27386505 diagnose strandedness now
++ cut -f2 SRR27386505.se.tsv
++ tail -n +5
++ /usr/bin/numsum
cut: SRR27386505.se.tsv: No such file or directory
+ UNSTRANDED_CNT=0
++ cut -f3 SRR27386505.se.tsv
++ tail -n +5
++ /usr/bin/numsum
cut: SRR27386505.se.tsv: No such file or directory
+ POS_STRAND_CNT=0
++ cut -f4 SRR27386505.se.tsv
++ tail -n +5
++ /usr/bin/numsum
cut: SRR27386505.se.tsv: No such file or directory
+ NEG_STRAND_CNT=0
+ echo 'UnstrandedReadsAssigned:0 PositiveStrandReadsAssigned:0 NegativeStrandReadsAssigned:0'
+ tee -a SRR27386505.log
UnstrandedReadsAssigned:0 PositiveStrandReadsAssigned:0 NegativeStrandReadsAssigned:0
+ '[' 0 -ge 0 ']'
+ STRAND=1
+ STRANDED=PositiveStrand
+ KALLISTO_STRAND_PARAMETER=--fr-stranded
+ echo 'Dataset is classified positive stranded'
+ tee -a SRR27386505.log
Dataset is classified positive stranded
+ echo KALLISTO_STRAND_PARAMETER=--fr-stranded
KALLISTO_STRAND_PARAMETER=--fr-stranded
+ CUTCOL=3
++ cut -f3 SRR27386505.se.tsv
++ head -1
cut: SRR27386505.se.tsv: No such file or directory
+ UNMAPPED_CNT=
++ cut -f3 SRR27386505.se.tsv
++ head -2
++ tail -1
cut: SRR27386505.se.tsv: No such file or directory
+ MULTIMAPPED_CNT=
++ cut -f3 SRR27386505.se.tsv
++ head -3
++ tail -1
cut: SRR27386505.se.tsv: No such file or directory
+ NOFEATURE_CNT=
++ cut -f3 SRR27386505.se.tsv
++ head -4
++ tail -1
cut: SRR27386505.se.tsv: No such file or directory
+ AMBIGUOUS_CNT=
++ cut -f3 SRR27386505.se.tsv
++ tail -n +5
++ /usr/bin/numsum
cut: SRR27386505.se.tsv: No such file or directory
+ ASSIGNED_CNT=0
++ echo 10628345
++ awk '{print $1/$2*100"%"}'
+ UNIQ_MAP_RATE=inf%
++ echo 0 10628345
++ awk '{print $1/$2*100"%"}'
+ ASSIGNED_RATE=0%
+ CUTCOL=3
+ cut -f1,3 SRR27386505.se.tsv
+ tail -n +5
cut: SRR27386505.se.tsv: No such file or directory
+ mv SRR27386505.se.tsv.tmp SRR27386505.se.tsv
+ echo SRR27386505 checking readlengths now for kmer selection
SRR27386505 checking readlengths now for kmer selection
++ sed -n 2~4p SRR27386505-trimmed-pair1.fastq
++ head -1000000
++ awk '{print length}'
++ sort -n
++ awk '{all[NR] = $0} END{print all[int(NR*0.50 - 0.5)]}'
+ MEDIAN_LENGTH=150
++ sed -n 2~4p SRR27386505-trimmed-pair1.fastq
++ head -1000000
++ awk '{print length}'
++ sort -n
++ awk '{all[NR] = $0} END{print all[int(NR*0.20 - 0.5)]}'
+ D20=150
+ KMER=146
++ echo 146
++ awk '{print ($1+1)%2}'
+ ADJUST=1
+ KMER=145
+ '[' 145 -lt 19 ']'
+ echo MeadianReadLen=150 20thPercentileLength=150 echo kmer=145
+ tee -a SRR27386505.log
MeadianReadLen=150 20thPercentileLength=150 echo kmer=145
+ '[' 145 -lt 31 ']'
+ KMER=31
+ echo SRR27386505 running kallisto now
SRR27386505 running kallisto now
+ '[' PE == SE ']'
+ '[' PE == PE ']'
+ echo SRR27386505 Starting Kallisto paired end mapping to ensembl reference transcriptome
+ tee -a SRR27386505.log
SRR27386505 Starting Kallisto paired end mapping to ensembl reference transcriptome
+ /usr/local/bin/kallisto quant --fr-stranded -t 8 -o . -i /dee2/ref/ecoli/ensembl//kallisto/Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.idx SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq
+ tee -a SRR27386505.log
[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 4,322
[index] number of k-mers: 3,915,872
[index] number of equivalence classes: 4,486
[quant] running in paired-end mode
[quant] will process pair 1: SRR27386505-trimmed-pair1.fastq
SRR27386505-trimmed-pair2.fastq
[quant] finding pseudoalignments for the reads ... done
[quant] processed 10,628,345 reads, 829,354 reads pseudoaligned
[quant] estimated average fragment length: 185.15
[ em] quantifying the abundances ... done
[ em] the Expectation-Maximization algorithm ran for 52 rounds
+ mv abundance.tsv SRR27386505.ke.tsv
+ rm abundance.h5
++ grep 'reads pseudoaligned' SRR27386505.log
++ awk '{print $(NF-2)}'
++ tr -d ,
+ PSEUDOMAPPED_CNT=829354
++ echo 829354 10628345
++ awk '{print $1/$2*100"%"}'
+ PSEUDOMAP_RATE=7.80323%
+ rm -rf run_info.json SRR27386505-trimmed-pair1.fastq SRR27386505-trimmed-pair2.fastq _STARgenome
+ wc -l SRR27386505.ke.tsv SRR27386505.se.tsv
+ tee -a SRR27386505.log
4323 SRR27386505.ke.tsv
0 SRR27386505.se.tsv
4323 total
+ head SRR27386505.ke.tsv SRR27386505.se.tsv
+ tee -a SRR27386505.log
==> SRR27386505.ke.tsv <==
target_id length eff_length est_counts tpm
AAC73112 66 10.7059 23 680.888
AAC73113 2463 2278.85 181 25.1729
AAC73114 933 748.85 55 23.2776
AAC73115 1287 1102.85 100 28.7378
AAC73116 297 123.557 86 220.598
AAC73117 777 592.85 50 26.7298
AAC73118 1431 1246.85 195 49.5668
AAC73119 954 769.85 790 325.23
AAC73120 588 404.218 140 109.77
==> SRR27386505.se.tsv <==
++ wc -l
+ SE_NR=0
++ wc -l
+ KE_NR=4323
++ cat /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.36.gtf.cnt
+ SE_CNT=4497
++ cat /dee2/ref/ecoli/ensembl//Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cdna.all.fa.cnt
+ KE_CNT=4322
+ '[' 0 -eq 4497 -a 4323 -eq 4323 ']'
+ echo 'SRR27386505 An error occurred. Count file line numbers don'\''t match the reference.'
+ tee -a SRR27386505.log
SRR27386505 An error occurred. Count file line numbers don't match the reference.
+ exit1
+ rm '*fastq' '*.sra' SRR27386505.ke.tsv SRR27386505.se.tsv
rm: cannot remove '*fastq': No such file or directory
rm: cannot remove '*.sra': No such file or directory
+ return 1
+ return 1
+ cd /dee2/data/ecoli
+ zip -r /dee2/mnt/SRR27386505.ecoli.zip SRR27386505
adding: SRR27386505/ (stored 0%)
adding: SRR27386505/volunteer_pipeline.sh (deflated 77%)
adding: SRR27386505/SRR27386505.log (deflated 72%)
adding: SRR27386505/SRR27386505.attempts.txt (deflated 6%)
+ exit
```