Perf optimisations by alanmcgovern · Pull Request #5 · NUlliiON/QoiSharp

alanmcgovern · 2021-12-23T21:01:33Z

This builds on top of the previous PRs to leverage the benefits of Span in the encoding process.

Before these changes:

// * Summary *

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1415 (21H2)
AMD Ryzen 7 3700X, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.100
  [Host]     : .NET 6.0.0 (6.0.21.52210), X64 RyuJIT
  DefaultJob : .NET 6.0.0 (6.0.21.52210), X64 RyuJIT


|         Method |     Mean |     Error |    StdDev |    Gen 0 |    Gen 1 |    Gen 2 | Allocated |
|--------------- |---------:|----------:|----------:|---------:|---------:|---------:|----------:|
| 'QOI Encoding' | 5.057 ms | 0.0997 ms | 0.2292 ms | 234.3750 | 234.3750 | 234.3750 |      1 MB |

NOTE: I did tweak the Encode method to take a pre-allocated buffer, just like the optimised method does. This means the destination byte[] is allocated once and both optimised/unoptimised methods encode to the same buffer repeatedly. The memory stats would be significantly worse otherwise.

With the optimisations:

// * Summary *

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1415 (21H2)
AMD Ryzen 7 3700X, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.100
  [Host]     : .NET 6.0.0 (6.0.21.52210), X64 RyuJIT
  DefaultJob : .NET 6.0.0 (6.0.21.52210), X64 RyuJIT


|         Method |     Mean |     Error |    StdDev | Allocated |
|--------------- |---------:|----------:|----------:|----------:|
| 'QOI Encoding' | 4.290 ms | 0.0133 ms | 0.0125 ms |      52 B |

Encoding is a zero-allocation operation now (as long as you use a pre-allocated buffer). It's also a fair bit faster.

This is a little more flexible as you can slice the Span and easily decode multiple images from the same underlying byte[]. By using a byte[] parameter you force users to ensure the image they want to decode is available at offset '0' in the array. Additionally, by exposing QoiImage using ReadOnlyMemory<byte> it becomes immutable. The only gotcha is that implicit casting to ReadOnlyMemory<byte> can introduce subtle bugs where the user creates a QoiImage from a byte[], then reuses the byte[] for something else, which then corrupts the QoiImage.

Also fix an unsafe mutable public byte[]. Users can write arbitrary bytes to this which will corrupt the encoder for everyone in that process.

Simplifies things

Remove spaces Use explicit "private" modifier

alanmcgovern added 11 commits December 22, 2021 19:08

Migrate a few more things to BinaryPrimitives

839fae2

Also fix an unsafe mutable public byte[]. Users can write arbitrary bytes to this which will corrupt the encoder for everyone in that process.

perf opt 1

f49afcb

opt2

b005111

opt3

b60e440

opt4

36306d2

greentests

562599f

final_opt

9fa0057

Clamp to the expected number of pixels

4039cba

Pregenerate an RGB copier

94aaf14

Simplifies things

final

ed41a0d

NUlliiON added this to the 2.0.0 milestone Dec 23, 2021

NUlliiON added the performance label Dec 23, 2021

NUlliiON marked this pull request as ready for review December 24, 2021 05:00

NUlliiON marked this pull request as draft December 25, 2021 02:18

NUlliiON marked this pull request as ready for review December 25, 2021 02:19

Refactor

1076132

Remove spaces Use explicit "private" modifier

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Perf optimisations#5

Perf optimisations#5
alanmcgovern wants to merge 12 commits intoNUlliiON:mainfrom
alanmcgovern:perf-optimisations

alanmcgovern commented Dec 23, 2021 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

alanmcgovern commented Dec 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

alanmcgovern commented Dec 23, 2021 •

edited

Loading