@jbedo commented May 31, 2022

Previously, the reader detected the case tid == mtid and mapped the mate
sequence name to "=". This causes missing sequence name errors on
decompression. Because the case tid == mtid is already handled when
writing SAM/BAM, this patch simply records the full mate sequence name,
which resolves the matching issues.
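
The following is a minimal Python sketch of the idea, not the code from this patch. It assumes that decompression resolves the stored mate name through a name-to-id table and that the writer collapses the mate name to "=" when tid == mtid. The names `resolve_mate_tid` and `format_rnext` and the table layout are hypothetical and exist only for illustration.

```python
from typing import Dict, List


def resolve_mate_tid(mate_name: str, name_to_tid: Dict[str, int]) -> int:
    """Map a stored mate sequence name back to a reference id.

    Hypothetical decompression step: "*" means no mate reference (-1).
    Any other name must be present in the table, so a stored "=" fails
    with KeyError, mirroring a "missing sequence name" error.
    """
    if mate_name == "*":
        return -1
    return name_to_tid[mate_name]


def format_rnext(tid: int, mtid: int, tid_to_name: List[str]) -> str:
    """Produce the SAM RNEXT field on output.

    Assumed writer behaviour: "=" when the mate is on the same
    reference, "*" when the mate reference is unavailable, and the
    full name otherwise.
    """
    if mtid < 0:
        return "*"
    if mtid == tid:
        return "="
    return tid_to_name[mtid]


# Illustrative reference table with the single sequence from the example read.
tid_to_name = ["SL1344"]
name_to_tid = {name: i for i, name in enumerate(tid_to_name)}

# Storing the full mate name round-trips, and the writer still emits "=".
mtid = resolve_mate_tid("SL1344", name_to_tid)
rnext = format_rnext(0, mtid, tid_to_name)
```

Under these assumptions, storing the full name loses nothing: the "=" shorthand is still produced at write time, while the lookup during decompression always sees a real sequence name.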

Example read after decompression, before the patch:

SL1344_1_530_0:0:0_0:0:0_6c9    163     SL1344  1       60      70M     *       461     530     AGAGATTACGTCTGGTTGCAAGAGATCATGACAGGGGGAATTGGTTGAAAATAAATATATCGCCAGCAGC  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII       MQ:i:60 AS:i:70 RG:Z:mysample1  NM:i:0  MC:Z:70M        MD:Z:70 ms:i:2800       XS:i:0

and after the patch:

SL1344_1_530_0:0:0_0:0:0_6c9    163     SL1344  1       60      70M     =       461     530     AGAGATTACGTCTGGTTGCAAGAGATCATGACAGGGGGAATTGGTTGAAAATAAATATATCGCCAGCAGC  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII       MQ:i:60 AS:i:70 RG:Z:mysample1  NM:i:0  MC:Z:70M        MD:Z:70 ms:i:2800       XS:i:0
