Skip to content

Improve performance without breaking API#7

Open
alanmcgovern wants to merge 5 commits intoNUlliiON:mainfrom
alanmcgovern:non-breaking-updates
Open

Improve performance without breaking API#7
alanmcgovern wants to merge 5 commits intoNUlliiON:mainfrom
alanmcgovern:non-breaking-updates

Conversation

@alanmcgovern
Copy link

The original PR was pushed a bit too late and
wasn't reviewed before 1.0 was released. As such,
the changes can't be included now without breaking
API.

Redo some of the changes to gain most of the benefits
without the API breakage.

  1. Nothing in the library uses the public mutable
    state defined in QoiCodec.These properties/fields
    have been replaced with immutable/readonly properties
    instead.

  2. QoiEncoder has been optimised a bit to leverage
    the performance gains from BinaryPrimitives and Span.
    Also, a Span based overload was added so that images
    could be encoded into a reusable buffer.

  3. QoiDecoder has a new overload which receives a
    Span, allowing images to be decoded from anything
    which can create a Span, and also allows for
    multiple images to be decoded from the same underlying
    array.

The original PR was pushed a bit too late and
wasn't reviewed before 1.0 was released. As such,
the changes can't be included now without breaking
API.

Redo some of the changes to gain most of the benefits
without the API breakage.

1. Nothing in the library uses the public mutable
state defined in QoiCodec.These properties/fields
have been replaced with immutable/readonly properties
instead.

2. QoiEncoder has been optimised a bit to leverage
the performance gains from BinaryPrimitives and Span<T>.
Also, a Span<T> based overload was added so that images
could be encoded into a reusable buffer.

3. QoiDecoder has a new overload which receives a
Span<T>, allowing images to be decoded from anything
which can create a Span<byte>, and also allows for
multiple images to be decoded from the same underlying
array.
@alanmcgovern
Copy link
Author

@NUlliiON I closed the other PRs and submitted this instead.

Some of the changes from the older PRs are still needed, but at least this one is API compatible with the current 1.0 release.

@alanmcgovern
Copy link
Author

alanmcgovern commented Dec 29, 2021

Optimised:

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
QoiEncoding 5.273 ms 0.0221 ms 0.0185 ms 164.0625 164.0625 164.0625 2 MB
Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
QoiDecoding 3.064 ms 0.0091 ms 0.0085 ms 496.0938 496.0938 496.0938 2 MB

Original:

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
QoiEncoding 5.524 ms 0.0252 ms 0.0224 ms 171.8750 171.8750 171.8750 2 MB
Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
QoiDecoding 3.978 ms 0.0794 ms 0.1326 ms 496.0938 496.0938 496.0938 2 MB

NOTE: There are further improvements possible but I want to get a few PRs landed before I start adding new API. Specifically, i want the fix-tests branch landed so I don't have to keep modifying benchmarks to compare.

Remove explicit usings
Remove space
Use explicit type for base type (uint)
Remove redundant else
@Terria-K
Copy link

This optimization does have a weird bugs. Comparing these two images, the optimized code has a two extra black pixels while the original code doesn't have one.

Looking at the North West tile textures or the left ledge textures, you can see those two black pixels.

(Original)
image

(Optimized)
image

@bwiklund
Copy link

This optimization does have a weird bugs. Comparing these two images, the optimized code has a two extra black pixels while the original code doesn't have one.

Looking at the North West tile textures or the left ledge textures, you can see those two black pixels.

I don't see any difference, even putting these on top of each other with a "difference" blend in aseprite. (other than the time and moved crate) are you sure these are the right images?

@Terria-K
Copy link

This optimization does have a weird bugs. Comparing these two images, the optimized code has a two extra black pixels while the original code doesn't have one.

Looking at the North West tile textures or the left ledge textures, you can see those two black pixels.

I don't see any difference, even putting these on top of each other with a "difference" blend in aseprite. (other than the time and moved crate) are you sure these are the right images?

if you look closely at the corner, you can see these two black pixels in the optimized version, which shouldn't be appearing.Screenshot_2023-11-15-08-01-57-10_40deb401b9ffe8e1df2f1cc5ba480b12.jpg

@bwiklund
Copy link

ah, that's probably from the changes in the way the lookup table is initialized. seems like an easy fix.

in any case i don't think the author of this repo is maintaining it anymore, someone (@alanmcgovern ?) should fork it. we'd swap from this repo for the loading time boost for sure.

@bwiklund
Copy link

also @alanmcgovern it'd be good to see a benchmark for a large atlas-sized image (like 2k, 4k, 8k)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants