Skip to content

optimise ReadBytes#17

Open
jnqnfe wants to merge 1 commit intoruuda:masterfrom
jnqnfe:patch2
Open

optimise ReadBytes#17
jnqnfe wants to merge 1 commit intoruuda:masterfrom
jnqnfe:patch2

Conversation

@jnqnfe
Copy link
Contributor

@jnqnfe jnqnfe commented Oct 19, 2018

Avoids integer expansion, bit shifting, and bit ORing operations. Instead simply reads into an aligned buffer, transmutes, then swaps bytes as necessary.

My original attempted solution was much neater, using a macro to auto-create the type and impl for each integer size, but rust macros aren't up to that yet :/

Feel free to tweak the patch as necessary

Avoids integer expansion, bit shifting, and bit ORing operations

Instead simply reads into an aligned buffer, transmutes, then swaps bytes
as necessary
@jnqnfe
Copy link
Contributor Author

jnqnfe commented Oct 19, 2018

Ah, I see from the travis result that this repr attribute was not stable yet ~1.24

I had wondered why this solution had been overlooked, since the quality of code generally looks pretty good. A Patch for a later time?

@ruuda
Copy link
Owner

ruuda commented Oct 20, 2018

Thank you for taking the time to open this pull request.

I benchmarked this change against master, and I found no strong evidence that it is faster:

	Welch Two Sample t-test

data:  before and after
t = 2.0356, df = 36359, p-value = 0.0418
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.0003841598 0.0203188116
sample estimates:
mean of x mean of y 
 13.84424  13.83389 

These are the mean times per sample in nanoseconds, measured over 18217 blocks (5 songs) that were each decoded 140 times, keeping the minimum time for each block to eliminate noise.

It might be that there is a speedup, and more measurements would unveil it, but from my measurements so far, the speedup would be a 0.07% decrease in decode time, which I don't think is worth the extra complexity.

This result makes sense to me, even if the generated code looks better: the functions you modified are cold, they are only used when reading file metadata and the frame headers, and the cost of this is negligible in comparison to reading the frames themselves.

Did you observe a speedup on your system?

@Shnatsel
Copy link
Contributor

Note that starting from Rust 1.32 this can be done safely via u16::from_le_bytes(). If you choose to proceed with this PR, please add a TODO to switch to that implementation once Rust version 1.32 can be assumed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants