Skip to content

Conversation

@timoxd7
Copy link
Contributor

@timoxd7 timoxd7 commented Apr 30, 2025

Info

Replaced the list in favour of queue as it is perfectly made for this use-case. Should improve performance.

Also, this allows to queue new elements inside a queueDispatch callback.

Btw, thanks for this lib, had exactly this in mind and found it then, saved me a ton of work.

@KyrietS
Copy link
Owner

KyrietS commented May 4, 2025

Hi, huge thanks for the changes! I'll take a look at them next week, promise ☺️

Copy link
Owner

@KyrietS KyrietS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timoxd7
I've measured the performance before and after your change where you replace std::list with std::queue. The code I used for a benchmark was as follows:

	for (int i = 0; i < 2'000; i++) {
		for (int k = 0; k < 2'000; k++) {
			dispatcher.queue(std::string { "123456789123456789123456789123456789123456789" });
		}
		dispatcher.process();
	}

Here are the results from multiple runs:

Before your change

Visual C++ 2022 (Debug)
8198 ms
8109 ms
8204 ms

Visual C++ 2022 (Release)
816 ms
775 ms
827 ms

GCC 13 (Debug)
666 ms
665 ms
668 ms

GCC 13 (Release)
240 ms
243 ms
241 ms

After your change

Visual C++ 2022 (Debug)
8566 ms
8459 ms
8503 ms

Visual C++ 2022 (Release)
277 ms
292 ms
275 ms

GCC 13 (Debug)
552 ms
551 ms
553 ms

GCC 13 (Release)
122 ms
121 ms
124 ms

Your changes seem to bring a lot of performance gain. Thank you!

But for future reference, I highly encourage you to always prove that something you claim "should improve performance" indeed works. Compilers are very complicated beasts nowadays and sometimes relying on your intuition can be very deceptive.

@KyrietS
Copy link
Owner

KyrietS commented May 11, 2025

Also, this allows to queue new elements inside a queueDispatch callback.

Jus by looking at the code I think it's true. It would be nice to have this scenario covered in unit tests though 🙃

@timoxd7
Copy link
Contributor Author

timoxd7 commented May 11, 2025

Also, this allows to queue new elements inside a queueDispatch callback.

Jus by looking at the code I think it's true. It would be nice to have this scenario covered in unit tests though 🙃

Ok, will add a test next week, agree

@timoxd7
Copy link
Contributor Author

timoxd7 commented May 11, 2025

We could also test if the performance might be worse if constantly adding only one element and processing it directly. I think the performance might even be a bit worse than std::list, due to chunk allocation.

The std::deque (used by std::queue by default) allocates in chunks, not having to copy elements like a std::vector if needed and still having better cache awareness than a std::list. So it makes it a good fit for queue operations.

However, you are right, you never know until you tested it. I just added it for my code and thought it might be useful for you, but did not expect double performance (for your test in release).

I like your development style, good testing and you actually measure if something works 😄

@KyrietS
Copy link
Owner

KyrietS commented May 11, 2025

According to your suggestion I modified my benchmark to make a lot of "single queues":

	for (int i = 0; i < 2'000 * 2'0000; i++) {
		for (int k = 0; k < 1; k++) {
			dispatcher.queue(std::string { "123456789123456789123456789123456789123456789" });
		}
		dispatcher.process();
	}

And the results:

Before

Visual C++ 2022 (Debug)
81089 ms
82427 ms
85448 ms
👆😅

Visual C++ 2022 (Release)
6173 ms
5741 ms
5871 ms

GCC 13 (Debug)
677 ms
687 ms
684 ms

GCC 13 (Release)
205 ms
206 ms
202 ms

After

Visual C++ 2022 (Debug)
83678 ms
84795 ms
84459 ms

Visual C++ 2022 (Release)
1745 ms
1746 ms
1763 ms

GCC 13 (Debug)
586 ms
587 ms
588 ms

GCC 13 (Release)
106 ms
102 ms
102 ms

It's possible that one element is not enough to trigger a chunk deallocation so it's still faster with std::queue. Maybe there is a threshold somewhere, where your code becomes less performant but on a general case I like your changes and the results are great ☺️

@timoxd7 timoxd7 force-pushed the queue-container branch from bcb2ebf to 558a723 Compare May 11, 2025 16:50
@timoxd7 timoxd7 force-pushed the queue-container branch 4 times, most recently from 14e6e16 to 2d2eae8 Compare May 11, 2025 17:52
@timoxd7 timoxd7 force-pushed the queue-container branch from 2d2eae8 to 29afef2 Compare May 11, 2025 17:54
@timoxd7
Copy link
Contributor Author

timoxd7 commented May 11, 2025

@KyrietS I just added some commits (sorry for the force pushes, wanted to have a clean history but use GitHub actions for testing).

Running some local tests using clang on an arm device and -Os also had some interesting results with your test code.

Using all changes from me:
Before: ~800ms
After: ~400ms

Only using the change from std::list to std::queue, but without perfect Forward in queue(...) method:
After: ~630ms

So, the queue has less of an impact as I thought, but the forwarding (which I thought would the compiler optimize to anyway) did the bigger improvement. Maybe in real scenarios, where the nodes of a list might be even farther apart in memory than in the tests, the queue might have a bigger positive impact.

Btw, what is going wrong with Visual C++ 2022? 😅

Copy link
Owner

@KyrietS KyrietS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Force-pushing to your own branches is absolutely fine. It's a good sign that you care about git history ☺️

In my benchmarks I intentionally used "heavy" event (std::string) with dynamic data to take advantage of your move semantics 😄 Thank you for measuring the impact of the perfect forwarding on performance. You really seem to understand the nuances of C++ 💪

Also, the tests you wrote are exactly how I would have written them myself.

I'm more than happy to merge your changes 🙌

@KyrietS KyrietS merged commit 0329db2 into KyrietS:master May 12, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants