-
Notifications
You must be signed in to change notification settings - Fork 0
Faster merge functions #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
BeforeAfter |
|
Oh very nice! |
|
I implemented one optimisation here, which reduces the number of memory accesses. Before the patch, the
→ we now perform one memory access per recursive call |
We achieve this by using a representation of the form (beginning index, ending index) for the slices rather than (beginning index, length). This mostly reduces the number of arithmetic operations in merge_hi, while merge_lo remains more or less untouched. This has a noticeable impact on the in the benchmark!
|
And with the last commit: |
Niols
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments but I approve the changes! Don't we want to update merge as well to use two offsets? Like where is it clever to go from ben/len to ben/end?
| src1 ofs1 len1 | ||
| dest beg | ||
| src0 beg0 end0 x0 | ||
| src1 beg1 end1 x1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be worth having comments like (* x0 = src0.(beg0) *) already at this point. It's readable in the assertions below but I think it's better to have it here!
| assert (end0 >= beg0); | ||
| assert (end1 >= beg1); | ||
|
|
||
| (* This is used to optimise the case len0 = 1 below. *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment does not make sense to me. I think it would deserve more text!
| src1 ofs1 len1 | ||
| dest end_ | ||
| src0 beg0 end0 x0 (* run0 *) | ||
| src1 beg1 end1 x1 (* run1 *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cf before
| assert (x1 = src1.(end1)); | ||
| assert (end0 >= beg0); | ||
| assert (end1 >= beg1); | ||
| (* This is used to optimise the case len1 = 1 below. *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cf before (+ missing space?)
| merge_hi | ||
| cmp | ||
| dest (end_ - 1) | ||
| src0 beg0 (end0 - 1) src0.(end0 - 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use end0 - 1 twice here, is it worth having an intermediary value? (Same in other branch; same in merge_lo.)
Some optimisations on the
merge_{hi,lo}functions