Conversation
7ac5a55 to
e4fb0a0
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #51 +/- ##
==========================================
+ Coverage 85.82% 86.14% +0.31%
==========================================
Files 11 11
Lines 1912 1920 +8
==========================================
+ Hits 1641 1654 +13
+ Misses 271 266 -5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
e4fb0a0 to
cacd80b
Compare
|
Interesting, I thought LLVM could optimize it. would you mind improve u64 as well? |
cacd80b to
5912391
Compare
|
I went all in and changed it for all numeric types, as even for 16-bit, there are less assembly instructions (though pobably not measurable: https://godbolt.org/z/ov97Pq43j I did it for all as I'm a fan of code symmetry. If I should only do it for the 64-bit types, I'm happy to change it again. I've also removed the benchmark, as it was for illustration purpose only and I don't think it makes sense to keep it around (less bit rot). |
Copy the bytes into a buffer instead of destructuring them. On my machine it's an about 15% performance improvement for f64. For all types, even if it's 16-bit only, the resulting assembly has less instructions: https://godbolt.org/z/ov97Pq43j
5912391 to
f3ece39
Compare
|
Thank you! |
|
Thanks for merging. I'm still looking into another encoding bottleneck. It could well be that it's not a cbor4ii thing, but related to other parts of the library. I just wanted to mention as you might want to wait for a new release until all the changes I'm currently working on are in. |
|
The other bottleneck was something else. So don't expect any perf related PRs for now. For anyone who's reading this and might be interested what the other performance things was. I was using the BufWriter from cbor4ii. That is slower than the one from the Rust standard library. |
Copy the bytes into a buffer instead of destructuring them. On my machine it's an about 15% performance improvement.
A benchmark was added, which can be run via
I've only optimized the f64 code path as this is what I discovered, when I was switching a project over to cbor4ii. If other encodings should be changed/benchmarked as well, let me know.