Fix array indexing bug in uint8_to_bed_parallel.py #26
bentyeh wants to merge 1 commit into hoffmangroup:master
Conversation
Problem: An IndexError can be raised on line 384 when trying to access an out-of-bounds index of `ar_quant`:

```python
out_link.write(str(ar_quant[each_pos]) + "\n")
```

Fix: this commit both simplifies the logic of the code and fixes the bug.
This looks great and seems correct. It would be best merged not into master but into the last tagged release, to ensure compatibility, however old that release is. As mentioned before, though, this software will likely be sunsetted soon in favour of a new solution. I'll have to talk internally about how we want to support this software, along with possibly expediting its intended successor. Worst case, we will leave this up as a patch for those running into the same problem in the future. We are very grateful for your work.
Thank you for reviewing the pull request. What is the best way to get this into the tagged release?
Bumping this
Sorry for the delay, I was on vacation. As mentioned earlier, support for this software in general is lower priority, and a better solution is imminent. I will update this issue when it's available, which ideally should make this issue moot. If it doesn't, I would more than welcome further discussion.
Overview
Problem: An IndexError can be raised on line 384 of uint8_to_bed_parallel.py when trying to access an out-of-bounds index of `ar_quant` in `out_link.write(str(ar_quant[each_pos]) + "\n")`.

Fix: this commit both simplifies the logic of the code and fixes the bug.
Details
Relevant variables
- `uniquely_mappable`: single-read mappability array of length N
- `ar_quant`: multi-read mappability array of length N (length of chromosome)
- `unimap_diff`: array of length N - 1
- `poses_start`: start positions of non-zero "runs"; initially (line 367) calculated as a value 1 less than the desired 0-based (start-open, end-closed) index
- `poses_end`: end positions of non-zero "runs"; initially (line 368) calculated as a value 1 less than the desired 0-based (start-open, end-closed) index

Here, I annotate the original code to clarify its logic.
Note that whether the arrays `poses_end` and `poses_start` have the same length is ultimately not relevant. We simply need to add a start position of 0 if the first element of `ar_quant` is non-zero, and add an end position of `len(uniquely_mappable) - 1` if the last element of `ar_quant` is non-zero. This entire code block can therefore be simplified as follows:

Example for a kmer size of 1
- `uniquely_mappable = [0, 0, 1, 1, 1, 0, 0, 1, 1]`
- `ar_quant = [0, 0, 1, 1, 1, 0, 0, 1, 1]`
- `unimap_diff = [0, 1, 0, 0, -1, 0, 1, 0]`
- `poses_start = [1, 6]`
- `poses_end = [4]` --> extended to `poses_end = [4, 8]`

Desired wiggle output
Working backwards, we need `pos_st = [2, 7]` and `pos_end = [5, 9]` for the following loop (lines 383-384) to work as intended:

Lines 378 and 379 add a value of 1 to each of the positions from `poses_start` and `poses_end`, therefore giving the desired result.
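
The worked example can be checked with a short NumPy sketch. This is not the code from the commit: `run_bounds` is a hypothetical helper written for illustration, and it assumes `uniquely_mappable` is effectively a 0/1 array. It also applies the +1 shift before adding the boundary positions, so a run touching the left edge gets a start of 0 and a run touching the right edge gets an end of `len(uniquely_mappable)`.

```python
import numpy as np


def run_bounds(uniquely_mappable):
    """Return (pos_st, pos_end), half-open [start, end) bounds of non-zero runs.

    Hypothetical helper for illustration; not taken from uint8_to_bed_parallel.py.
    """
    # Assumption: any non-zero entry counts as mappable.
    mappable = (np.asarray(uniquely_mappable) != 0).astype(int)
    unimap_diff = np.diff(mappable)  # length N - 1

    # A +1 step at index i means a run starts at i + 1.
    # A -1 step at index i means the run's last element is i,
    # so its exclusive end is i + 1.
    pos_st = np.where(unimap_diff == 1)[0] + 1
    pos_end = np.where(unimap_diff == -1)[0] + 1

    # Runs touching either edge of the array produce no step in unimap_diff,
    # so their boundary has to be added explicitly.
    if mappable.size and mappable[0] != 0:
        pos_st = np.insert(pos_st, 0, 0)
    if mappable.size and mappable[-1] != 0:
        pos_end = np.append(pos_end, mappable.size)
    return pos_st, pos_end


uniquely_mappable = [0, 0, 1, 1, 1, 0, 0, 1, 1]
ar_quant = np.array([0, 0, 1, 1, 1, 0, 0, 1, 1])

pos_st, pos_end = run_bounds(uniquely_mappable)

# Collect the values a per-position write loop would emit for each run.
values = []
for start, end in zip(pos_st, pos_end):
    for each_pos in range(start, end):
        values.append(int(ar_quant[each_pos]))  # never indexes past N - 1

print(pos_st.tolist(), pos_end.tolist())  # [2, 7] [5, 9]
print(values)                             # [1, 1, 1, 1, 1]
```

Because every end position is at most `len(ar_quant)` and is used as an exclusive bound, the inner loop cannot index past the last element, which is the failure mode the problem statement describes.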