Skip to content
This repository was archived by the owner on Jan 31, 2020. It is now read-only.

Fix multi-threading bug that causes incorrect results#106

Open
jmarshall wants to merge 1 commit intogenome:masterfrom
jmarshall:fix-threads
Open

Fix multi-threading bug that causes incorrect results#106
jmarshall wants to merge 1 commit intogenome:masterfrom
jmarshall:fix-threads

Conversation

@jmarshall
Copy link
Contributor

@jmarshall jmarshall commented May 28, 2019

We have observed that, for a fraction of the events detected in our data files, Pindel output varies with the number of threads Pindel uses. Specifically the per-sample per-strand counts are incorrect — 350+51 = 401 = 277+124, i.e., the total number of supporting reads is unchanged, but some of the reads have been assigned to the wrong strand:

T=1    D 1	…  sampleA 477 477 0 0 0 0	sampleB 443 443 277 277 124 124
T=2    D 1	…  sampleA 477 477 0 0 0 0	sampleB 443 443 277 277 124 124
T=3    D 1	…  sampleA 477 477 0 0 0 0	sampleB 443 443 277 277 124 124
T=4    D 1	…  sampleA 477 477 0 0 0 0	sampleB 443 443 277 277 124 124
…
T=16   D 1	…  sampleA 477 477 0 0 0 0	sampleB 443 443 350 350 51 51

Output was unchanged for low threading settings, but starts to differ at T=7 and by T=16 is dramatically incorrect.

This is fixed (and the output no longer varies with the number of threads used) by correcting ReadBuffer::flush() to maintain the order of m_rawreads[] entries when they are copied into m_filteredReads[] regardless of threading indeterminacy.

I suspect this patch may also fix or affect #26, which appears to be a similar problem.

Having `m_filteredReads.push_back(m_rawreads[i])` in a loop that is
parallel for `i` meant that entries could be pushed onto `m_filteredReads`
in arbitrary order when multiple threads are used.

Instead store pointers in a fixed-size array in the parallel loop,
and squash them sequentially into `m_filteredReads` afterwards.
Fixes a bug whereby incorrect output was produced when multiple
threads were used (differences observed beyond six threads).
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant