Hi team,
I'm following your documentation at https://huishenlab.github.io/biscuit/ to analyse RRBS data.
The command I'm using is:
biscuit align -t 12 -M -R "@RG\tID:1\tSM:'$BASE'" $REF $FILE | \
samblaster -M | \
samtools sort -o ${BASE}_mdups_sorted.bam -O BAM -
which is adapted from your docs.
However, samblaster exited with an error about read sorting:
samblaster: Loaded 66 header sequence entries.
samblaster: Can't find first and/or second of pair in sam block of length 1 for id: PC140529:356:C3EHVACXX:7:1101:1272:63028
samblaster: At location: *:0
samblaster: Are you sure the input is sorted by read ids?
samblaster: Exiting early, the following stats are for processing preceeding the error
samblaster: Marked 8 of 378 (2.116%) total read ids as duplicates using 1556k memory in 0.001S CPU seconds and 2M4S(124S) wall time.
samblaster: Premature exit (return code 1).
I ran the pipeline step by step and found that the SAM output of biscuit align contains one record whose CIGAR string does not match the read length.
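For anyone trying to reproduce this, a check along the following lines can flag such records. It is only a minimal sketch and not part of biscuit or samblaster: the function names are hypothetical, it expects uncompressed SAM text, and it assumes that M, I, S, = and X are the CIGAR operations that consume read bases, as described in the SAM specification.

```python
import re
import sys

# CIGAR operations that consume bases of the read (per the SAM specification).
QUERY_CONSUMING = set("MIS=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Return the number of read bases a CIGAR string accounts for."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in QUERY_CONSUMING)


def mismatched_records(lines):
    """Yield (read_id, cigar_length, seq_length) for records where the two differ."""
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, cigar, seq = fields[0], fields[5], fields[9]
        if cigar == "*" or seq == "*":
            continue  # nothing to compare (e.g. unmapped read)
        qlen = cigar_query_length(cigar)
        if qlen != len(seq):
            yield qname, qlen, len(seq)


if __name__ == "__main__":
    # Read SAM text from standard input and report each mismatched record.
    for qname, qlen, slen in mismatched_records(sys.stdin):
        print(f"{qname}\tCIGAR={qlen}\tSEQ={slen}")
```

Saved under a hypothetical name such as check_cigar.py, it could be run as `python check_cigar.py < aligned.sam`, where aligned.sam stands for the SAM output of biscuit align written to a file.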
The problematic read is:
@PC140529:356:C3EHVACXX:7:1312:19812:54284 1:N:0:ACTTGA
TGGGTGGAAGTGGGGGGGTGGGTTTAGATTGTTAGTGAGAGGAAGAGGTTT
+
DDCDDDDBDDDDDDCCDDDDDEDDDDBB:DB:0DDJJJHFFHFDFDDFBBB
When I extracted this read and aligned it on its own with biscuit align, the alignment was correct. However, when the read is aligned as part of the full FASTQ file, the alignment is wrong.
Could you please provide some help to fix it?
Thank you!