Improve funkcia algorithm to avoid factorization #1
I am not gonna lie, I don't understand it completely, I would have to go through the math, especially towards the end of funkcia, but thank you so much for this. Looks great, just some tiny things.
I also ran a few bigger programs from my Advent of Code solutions (here), and it looks like there is somehow a significant perf regression on usual programs. My first guess was that we no longer have a special case for the common inputs (a==b), but we do. My second guess was that we now inline a very long function, and that might have an impact. But I tried to avoid that by handling the special cases outside, and perf did not change. So I'm not sure what the reason is right now; hopefully it is not different Rust versions used to compile it. I might try investigating more later.
I saw the same behavior on all programs. One example: the day 3 part 2 program was 4% slower than before (a very dumb way of testing, but I repeated it multiple times and it was always clearly slower).
# old version
❯ for x in (seq 10); ksplang --stats ksplang/3-2.ksplang --text-input < inputs/3.txt; end
Execution time: 1.214007277s
Instructions executed: 119309701 (98.3M/s)
Execution time: 1.228749622s
Instructions executed: 119309701 (97.1M/s)
Execution time: 1.203303312s
Instructions executed: 119309701 (99.2M/s)
Execution time: 1.238919269s
Instructions executed: 119309701 (96.3M/s)
Execution time: 1.230999355s
Instructions executed: 119309701 (96.9M/s)
Execution time: 1.219730683s
Instructions executed: 119309701 (97.8M/s)
Execution time: 1.233437634s
Instructions executed: 119309701 (96.7M/s)
Execution time: 1.235833891s
Instructions executed: 119309701 (96.5M/s)
Execution time: 1.233339841s
Instructions executed: 119309701 (96.7M/s)
Execution time: 1.211065025s
Instructions executed: 119309701 (98.5M/s)
# new version
❯ for x in (seq 10); ksplang/target/release/ksplang-cli --stats ksplang/3-2.ksplang --text-input < inputs/3.txt; end
Execution time: 1.275257566s
Instructions executed: 119309701 (93.6M/s)
Execution time: 1.269184171s
Instructions executed: 119309701 (94.0M/s)
Execution time: 1.249723653s
Instructions executed: 119309701 (95.5M/s)
Execution time: 1.269717628s
Instructions executed: 119309701 (94.0M/s)
Execution time: 1.292459461s
Instructions executed: 119309701 (92.3M/s)
Execution time: 1.281289692s
Instructions executed: 119309701 (93.1M/s)
Execution time: 1.297168222s
Instructions executed: 119309701 (92.0M/s)
Execution time: 1.286744218s
Instructions executed: 119309701 (92.7M/s)
Execution time: 1.262160706s
Instructions executed: 119309701 (94.5M/s)
Execution time: 1.273051297s
Instructions executed: 119309701 (93.7M/s)
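As a rough sanity check on the 4% figure, the ten timings per version listed above can be averaged and compared. The Rust sketch below is only an illustration: the function and constant names are made up for this example and are not part of ksplang, and a plain mean says nothing about noise or variance.

```rust
/// Execution times in seconds, copied from the "old version" runs above.
const OLD_SECS: [f64; 10] = [
    1.214007277, 1.228749622, 1.203303312, 1.238919269, 1.230999355,
    1.219730683, 1.233437634, 1.235833891, 1.233339841, 1.211065025,
];

/// Execution times in seconds, copied from the "new version" runs above.
const NEW_SECS: [f64; 10] = [
    1.275257566, 1.269184171, 1.249723653, 1.269717628, 1.292459461,
    1.281289692, 1.297168222, 1.286744218, 1.262160706, 1.273051297,
];

/// Arithmetic mean of a non-empty slice of samples.
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

/// Relative slowdown of `new` versus `old`, in percent (positive means slower).
fn slowdown_percent(old: &[f64], new: &[f64]) -> f64 {
    (mean(new) / mean(old) - 1.0) * 100.0
}
```

Calling `slowdown_percent(&OLD_SECS, &NEW_SECS)` gives a value of roughly 4%, which matches the estimate quoted for these particular runs.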
So, after making sure I rebuilt the correct project and was not running a months-old binary, I am seeing about a 2-2.5 times speedup for my ksplang programs, so this is really nice.
I added benchmarks made from your programs, and yes, they show a significant speedup. The following run is inverted: the baseline is the new implementation, and the current run uses the original funkcia implementation.
I also added a few comments to the implementation; hopefully they make the algorithm seem less crazy.
The programs are some of Sejsel's AOC24 solutions: see
I also fixed a bug: the previous implementation would return 0 if the result was supposed to be 1 as a result of the modulo operation (e.g. the result is 1_000_000_008, the modulo result is 1, which was replaced, and 0 was returned).
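To make the bug concrete, here is a minimal Rust sketch of the final step only. It is not the code from this PR: `finish_buggy` and `finish_fixed` are hypothetical names, the modulus 1_000_000_007 is inferred from the example above (1_000_000_008 reduces to 1), and the sketch assumes the replacement with 0 is only meant for a value that is 1 before the reduction.

```rust
/// Modulus inferred from the example in the comment (assumption).
const MODULUS: u64 = 1_000_000_007;

/// Hypothetical sketch of the old behaviour: the special case is checked
/// after the reduction, so any value congruent to 1 is turned into 0.
fn finish_buggy(product: u64) -> u64 {
    let reduced = product % MODULUS;
    if reduced == 1 { 0 } else { reduced }
}

/// Hypothetical sketch of the fix: the special case is checked on the
/// unreduced value, so a genuine modulo result of 1 is preserved.
fn finish_fixed(product: u64) -> u64 {
    if product == 1 { 0 } else { product % MODULUS }
}
```

With these definitions, `finish_buggy(1_000_000_008)` returns 0, while `finish_fixed(1_000_000_008)` returns 1.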