Switch to long running tasks for the multiplexer #344

Draft
bdraco wants to merge 132 commits into NabuCasa:main from bdraco:long_running_tasks

Conversation

Collaborator

bdraco commented Feb 9, 2025

Performance improvements:

  • This reduces the number of tasks created while accessing the UI by ~96%
  • RangedTimeout reduces the number of TimerHandles created while accessing the UI by ~94%

~2x performance improvement for small packets (the common case): https://codspeed.io/NabuCasa/snitun/branches/bdraco%3Along_running_tasks

Some considerations

  • The timeout is now enforced for the read/write of the message, so if it takes longer than the timeout to read or write the message, it will time out. That's probably not a problem, since the ping/pongs would be blocked and fail the connection anyway.

While these changes are designed for the client side, the combination of these changes is expected to significantly increase the number of active connections a server can handle at one time.

Instead of creating many small tasks, create two long running ones for the reader/writer to avoid flooding the event loop with tasks when the connection is generating many packets.
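
To illustrate the idea (not the actual snitun code), here is a minimal sketch of one connection served by exactly two long-running tasks instead of one task per packet. The names `reader_loop`, `writer_loop`, and `run_connection` are hypothetical, and the `None` sentinel and `upper()` call are stand-ins for real shutdown and packet handling.

```python
import asyncio


async def reader_loop(incoming: asyncio.Queue, outgoing: asyncio.Queue) -> None:
    """Single long-running task: handle every inbound packet in one loop."""
    while True:
        packet = await incoming.get()
        if packet is None:  # sentinel: stop and tell the writer to stop too
            await outgoing.put(None)
            return
        await outgoing.put(packet.upper())  # stand-in for real packet handling


async def writer_loop(outgoing: asyncio.Queue, sent: list) -> None:
    """Single long-running task: drain the outbound queue in order."""
    while True:
        packet = await outgoing.get()
        if packet is None:
            return
        sent.append(packet)  # stand-in for writing to the transport


async def run_connection(packets: list) -> list:
    incoming: asyncio.Queue = asyncio.Queue()
    outgoing: asyncio.Queue = asyncio.Queue()
    sent: list = []
    # Two tasks for the lifetime of the connection, regardless of packet count.
    reader = asyncio.create_task(reader_loop(incoming, outgoing))
    writer = asyncio.create_task(writer_loop(outgoing, sent))
    for packet in packets:
        incoming.put_nowait(packet)
    incoming.put_nowait(None)
    await asyncio.gather(reader, writer)
    return sent
```

The event loop only ever schedules these two tasks, so a burst of packets grows the queues rather than the number of pending tasks.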

This is still a bit of a WIP, but it should be deferred until after a release with #303 so they get separate release cycles.

Known issues

Collaborator Author

bdraco commented Feb 9, 2025

We could probably have a single low-resolution TimerHandle that resets on each loop and has a 10s leeway to re-arm, so we ensure we cancel both read/write tasks within timeout+10s. That way we aren't churning timer handles.

EDIT: with the task overhead gone, that is where all the time gets spent

Collaborator Author

bdraco commented Feb 9, 2025

That test fails locally both before and after this change. I think its intent is to overload the queue, but the code is too fast now, and my local machine is fast enough that the test also fails before the change.

The queue size needs to be patched to be smaller so it can be overrun before it can be processed.
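
A rough sketch of that testing approach, assuming the queue limit lives in a patchable class attribute. `ChannelSketch`, `QUEUE_SIZE`, and `feed` are hypothetical names for illustration and are not taken from snitun.

```python
import asyncio
from unittest.mock import patch


class ChannelSketch:
    """Hypothetical stand-in for a component with a bounded inbound queue."""

    QUEUE_SIZE = 10_000  # hypothetical default, not the real snitun value

    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue(maxsize=self.QUEUE_SIZE)

    def feed(self, packet: bytes) -> bool:
        """Return False when the queue is full and the packet is rejected."""
        try:
            self.queue.put_nowait(packet)
        except asyncio.QueueFull:
            return False
        return True


async def overflow_with_small_queue() -> list:
    # Shrink the limit for the duration of the test so a producer overruns
    # the queue before any consumer gets a chance to run.
    with patch.object(ChannelSketch, "QUEUE_SIZE", 2):
        channel = ChannelSketch()
        return [channel.feed(b"x") for _ in range(3)]
```

With the limit patched down, the overflow no longer depends on how fast the machine or the consumer is.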

bdraco changed the title from "Switch to long runner tasks for the multiplexer" to "DEFER: Switch to long runner tasks for the multiplexer" Feb 9, 2025
Collaborator Author

bdraco commented Feb 9, 2025

Maybe a RangedTimeout that does a callback on timeout, so we can time out between 90 and 100s and only reschedule timer handles if the pending one falls outside that range.
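
One way such a ranged timeout could work, as a sketch under assumptions rather than the implementation in this PR: arm a single `TimerHandle` at `now + max_timeout`, and on each activity keep it as long as it still fires no earlier than `now + min_timeout`. The class name `RangedTimeoutSketch`, its parameters, and the `timers_created` counter are hypothetical.

```python
import asyncio
from typing import Callable, Optional


class RangedTimeoutSketch:
    """Call on_timeout between min_timeout and max_timeout seconds after the last activity."""

    def __init__(
        self, min_timeout: float, max_timeout: float, on_timeout: Callable[[], None]
    ) -> None:
        self._min = min_timeout
        self._max = max_timeout
        self._on_timeout = on_timeout
        self._loop = asyncio.get_running_loop()  # must be built inside a running loop
        self._handle: Optional[asyncio.TimerHandle] = None
        self.timers_created = 0  # only here to make the saving visible

    def reschedule(self) -> None:
        """Record activity; re-arm only if the pending timer would fire too early."""
        now = self._loop.time()
        if self._handle is not None:
            if self._handle.when() >= now + self._min:
                return  # pending timer still lands inside [min, max] from now
            self._handle.cancel()
        self._handle = self._loop.call_at(now + self._max, self._fire)
        self.timers_created += 1

    def cancel(self) -> None:
        if self._handle is not None:
            self._handle.cancel()
            self._handle = None

    def _fire(self) -> None:
        self._handle = None
        self._on_timeout()
```

With a 90s to 100s range, a new timer handle is created at most about once every 10s of continuous activity, instead of once per packet.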
