@rsarky rsarky commented Aug 20, 2020

There is scope for optimising the analysis process for differential analyses, i.e. when PaStA already has results from an earlier analysis and new patches arrive to be analysed.
Currently, PaStA assigns each new patch to its own single-element cluster and then runs the complete analysis again. This results in a lot of redundant comparisons. Example:

Consider the following existing clusters in PaStA. I have numbered each cluster for illustration purposes:

1. 1 2 3
2. 4 5 7
3. 6 8

PaStA performed around 8*8 comparisons to get here (ignoring, for now, the other thresholds that PaStA applies). For further comparisons PaStA uses the representative of each cluster. Let's take the first element of each cluster above to be its representative, i.e. repr(1 2 3) = 1.

Now consider that patches 9 and 10 arrive. Each is assigned to its own single-element cluster, i.e.:

1. 1 2 3
2. 4 5 7
3. 6 8
4. 9
5. 10

In the current situation PaStA performs 5x5 comparisons (the representative of each cluster is compared against every other).
But we can reduce this by comparing only the representatives of existing clusters with the newly arrived patches, as the other comparisons have already been done in the previous step, i.e. we reduce the comparisons to 3x2. Additionally, we need to compare the new patches against each other, which is a further 2x2 comparisons. Combined, that is a total of 5x2 comparisons, which is still much less than the naive way.

This can be written in a crude mathematical way as follows:

evaluation_result = new_patches X existing_patches + new_patches X new_patches [note that existing_patches X existing_patches has already been done and its result exists in the patch groups file]
Thus, evaluation_result = new_patches X (existing_patches + new_patches)
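
To make the counting above concrete, here is a minimal Python sketch of the pair selection. It is only an illustration of the formula, not PaStA's actual code: the function names `naive_pairs` and `differential_pairs` are hypothetical, and it uses the same crude counting as the text (self-pairs and both orderings are included).

```python
from itertools import product


def naive_pairs(existing_reprs, new_patches):
    """All-against-all: what a full re-run would compare (hypothetical helper)."""
    everything = list(existing_reprs) + list(new_patches)
    return list(product(everything, everything))


def differential_pairs(existing_reprs, new_patches):
    """new_patches X (existing_patches + new_patches) (hypothetical helper)."""
    victims = list(existing_reprs) + list(new_patches)
    return list(product(new_patches, victims))


# Clusters from the example; the first element is taken as the representative.
clusters = [[1, 2, 3], [4, 5, 7], [6, 8]]
existing_reprs = [cluster[0] for cluster in clusters]
new_patches = [9, 10]

naive = naive_pairs(existing_reprs, new_patches)
diff = differential_pairs(existing_reprs, new_patches)
print(len(naive), len(diff))  # 25 10
```

Every pair in `diff` has a new patch on its left-hand side, so no existing representative is compared against another existing representative again.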

Things to consider

  • What if we lose the patch groups file at some point? In that case the evaluation will have to be carried out from scratch again, as
    the cached evaluation results do not contain all the information.

Previously, analyse compared all patches under consideration,
disregarding previous evaluation results.
This patch adds a new differential flag that utilises the existing
evaluation results and only compares the newly added patches to the
existing ones, reducing the number of comparisons.

The differential evaluation process can be explained as follows:

result = new_patches X existing_patches + new_patches X new_patches
= new_patches X (new_patches + existing_patches)
= new_patches X victims

Signed-off-by: Rohit Sarkar <rohitsarkar5398@gmail.com>
@rsarky rsarky force-pushed the differential-analyses branch from b6ada56 to 5f9d088 Compare August 22, 2020 06:25
@rralf rralf force-pushed the next branch 3 times, most recently from d931c43 to f7b9a6f Compare January 19, 2021 13:56
@rralf rralf force-pushed the next branch 2 times, most recently from 6e20a89 to 3006c20 Compare May 12, 2021 14:57
@rralf rralf force-pushed the next branch 6 times, most recently from f6992dd to 5499f20 Compare May 29, 2021 10:57
@rralf rralf force-pushed the next branch 4 times, most recently from eba830f to 483fccf Compare June 9, 2021 12:37
@rralf rralf force-pushed the next branch 2 times, most recently from 56b12df to 96f2fd9 Compare March 2, 2022 14:30