fix: don't send Disconnect until peer has been removed from peer_map #199
Actionable comments posted: 0
🧹 Nitpick comments (1)
crates/tx5/src/peer.rs (1)
243-248: Drop guard wiring is sound; explicit `drop(drop_guard)` is redundant but fine
Binding the `DropPeer` as `drop_guard` at the top of `task` gives you consistent cleanup on all early returns and on task abort (via `Peer::drop` aborting the join handle). The explicit `drop(drop_guard)` after the recv loop is technically unnecessary, since RAII would drop it on function return, but it does make the cleanup point more explicit in the “normal loop finished” path and doesn’t change semantics.
If you prefer to trim a line, you could rely on natural scope drop and remove the explicit `drop(drop_guard)`, but it’s not required.
Also applies to: 363-368
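A minimal, synchronous sketch of the point made in this nitpick, not the actual tx5 code: a named guard runs its cleanup on an early return as well as on the normal path, and an explicit `drop` at the end of the function has the same effect as letting the guard fall out of scope. `DropGuard`, the `log` vector, and the `fail_early`/`explicit` flags are hypothetical names introduced only for illustration; the task-abort path mentioned above is not modelled here.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical guard: records a cleanup entry when it is dropped.
struct DropGuard {
    log: Arc<Mutex<Vec<&'static str>>>,
}

impl Drop for DropGuard {
    fn drop(&mut self) {
        self.log.lock().unwrap().push("cleanup");
    }
}

/// Stand-in for a task body. `fail_early` simulates an early-return path,
/// `explicit` chooses between `drop(drop_guard)` and plain scope exit.
fn task(log: Arc<Mutex<Vec<&'static str>>>, fail_early: bool, explicit: bool) {
    // Bind the guard to a name so it lives until the function exits.
    let drop_guard = DropGuard { log: log.clone() };

    if fail_early {
        log.lock().unwrap().push("early return");
        return; // the guard is dropped here by scope exit
    }

    log.lock().unwrap().push("loop finished");

    if explicit {
        // Marks the cleanup point; nothing follows it, so the effect
        // is the same as letting the guard go out of scope.
        drop(drop_guard);
    }
}

/// Runs `task` once and returns the recorded sequence of steps.
fn run(fail_early: bool, explicit: bool) -> Vec<&'static str> {
    let log = Arc::new(Mutex::new(Vec::new()));
    task(log.clone(), fail_early, explicit);
    let out = log.lock().unwrap().clone();
    out
}

fn main() {
    println!("{:?}", run(true, false));
    println!("{:?}", run(false, true));
    println!("{:?}", run(false, false));
}
```

In this sketch the explicit and implicit variants record the same sequence, which is the sense in which the explicit call is redundant but harmless.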
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
crates/tx5-connection/src/conn.rs (1 hunks)
crates/tx5/src/peer.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ThetaSinner
Repo: holochain/tx5 PR: 167
File: crates/tx5/tests/tests/flaky_sig.rs:73-84
Timestamp: 2025-08-08T08:41:38.069Z
Learning: Repo holochain/tx5: In test code (crates/tx5/tests/tests/flaky_sig.rs), maintainer (ThetaSinner) prefers not to refactor non-ideal async patterns; leaving block_on in Drop for FlakyRelay is acceptable. Treat similar test-only cleanup suggestions as non-blocking unless they cause flakes/panics.
Applied to files:
crates/tx5-connection/src/conn.rs
crates/tx5/src/peer.rs
🧬 Code graph analysis (2)
crates/tx5-connection/src/conn.rs (1)
crates/tx5/src/peer.rs (2): drop (25-27), drop (180-218)
crates/tx5/src/peer.rs (2)
crates/tx5/src/sig.rs (3): task (171-265), drop (18-20), drop (151-166)
crates/tx5-connection/src/conn.rs (3): drop (49-60), drop (248-250), drop (431-433)
🔇 Additional comments (2)
crates/tx5-connection/src/conn.rs (1)
241-251: RAII guard for `ready` looks correct and closes remaining hang paths
Using `DropReady` to call `ready.close()` when `con_task` exits (including early returns and aborts) is a solid way to ensure `Conn::ready()` futures don’t hang indefinitely if the handshake or WebRTC negotiation fails. The extra close from the guard is consistent with the existing explicit `ready.close()` calls on success/fallback, and doesn’t introduce new blocking or ordering problems as it runs only on task teardown.
Also applies to: 258-261
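The hang this guard avoids can be illustrated with a small thread-based sketch. It is an assumption-laden stand-in, not the tx5 implementation: `Ready` here is a hypothetical one-shot signal built on `Mutex` and `Condvar`, `con_task` is a plain function rather than an async task, and the `handshake_ok` flag simulates a failed negotiation.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical one-shot readiness signal: `wait` blocks until `close` is called.
#[derive(Default)]
struct Ready {
    closed: Mutex<bool>,
    cv: Condvar,
}

impl Ready {
    fn close(&self) {
        *self.closed.lock().unwrap() = true;
        self.cv.notify_all();
    }

    fn wait(&self) {
        let mut closed = self.closed.lock().unwrap();
        while !*closed {
            closed = self.cv.wait(closed).unwrap();
        }
    }
}

/// Guard that closes the signal when the task body exits on any path.
struct DropReady(Arc<Ready>);

impl Drop for DropReady {
    fn drop(&mut self) {
        self.0.close();
    }
}

/// Stand-in for a connection task; `handshake_ok == false` simulates a failure.
fn con_task(ready: Arc<Ready>, handshake_ok: bool) -> Result<(), &'static str> {
    let _guard = DropReady(ready.clone());
    if !handshake_ok {
        return Err("handshake failed"); // the guard still closes `ready`
    }
    ready.close(); // explicit close on success; the guard's later close repeats it harmlessly
    Ok(())
}

/// Runs the task on another thread, waits on `ready`, then returns the task result.
fn run(handshake_ok: bool) -> Result<(), &'static str> {
    let ready = Arc::new(Ready::default());
    let task_ready = ready.clone();
    let handle = thread::spawn(move || con_task(task_ready, handshake_ok));
    ready.wait(); // without the guard this would block forever on the failure path
    handle.join().unwrap()
}

fn main() {
    println!("{:?}", run(true));
    println!("{:?}", run(false));
}
```

Note that `wait` returns the same way on success and on failure in this sketch; the waiter only learns the outcome from the task result, not from the signal itself.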
crates/tx5/src/peer.rs (1)
192-217: Disconnect send via `DropPeer` with timeout matches desired cleanup ordering
Having `DropPeer::drop` (a) call `drop_peer_url` while holding the EpInner lock, (b) mark `ready` failed, and only then (c) spawn a timed attempt to send `EndpointEvent::Disconnected` ensures the endpoint is notified after the peer is removed from internal state, which is exactly what this PR is aiming for. The 10s timeout around `evt_send.send` in the spawned task prevents a slow or stuck event consumer from effectively “leaking” the cleanup task while keeping `DropPeer::drop` itself non-blocking.
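The ordering described in this comment can be sketched with standard-library threads and channels. This is only an illustration under stated assumptions: the `PeerMap` type, the single-variant `Event` enum, the example URL, and the retry loop that emulates a send timeout are all hypothetical, and step (b), marking `ready` failed, is left out. The real code is async and sends `EndpointEvent::Disconnected` through `evt_send`.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum Event {
    Disconnected(String),
}

type PeerMap = Arc<Mutex<HashMap<String, ()>>>;

/// Hypothetical guard: removes the peer first, then reports the disconnect.
struct DropPeer {
    url: String,
    peers: PeerMap,
    evt_send: SyncSender<Event>,
    timeout: Duration,
}

impl Drop for DropPeer {
    fn drop(&mut self) {
        // (a) Remove the peer while holding the map lock.
        self.peers.lock().unwrap().remove(&self.url);

        // (c) Send the event from a separate thread so `drop` itself never blocks.
        let (tx, url, timeout) = (self.evt_send.clone(), self.url.clone(), self.timeout);
        thread::spawn(move || {
            let deadline = Instant::now() + timeout;
            let mut evt = Event::Disconnected(url);
            loop {
                match tx.try_send(evt) {
                    Ok(()) | Err(TrySendError::Disconnected(_)) => return,
                    Err(TrySendError::Full(e)) if Instant::now() < deadline => {
                        evt = e; // consumer is slow: retry until the deadline
                        thread::sleep(Duration::from_millis(5));
                    }
                    Err(TrySendError::Full(_)) => return, // give up after the timeout
                }
            }
        });
    }
}

/// Drops a peer guard and reports whether the peer was still in the map
/// when the consumer observed `Disconnected`.
fn peer_present_on_disconnect() -> bool {
    let peers: PeerMap = Arc::new(Mutex::new(HashMap::new()));
    let url = "ws://example.invalid/peer".to_string();
    peers.lock().unwrap().insert(url.clone(), ());
    let (evt_send, evt_recv) = sync_channel(1);

    drop(DropPeer {
        url: url.clone(),
        peers: peers.clone(),
        evt_send,
        timeout: Duration::from_secs(10),
    });

    let Event::Disconnected(got) = evt_recv.recv().unwrap();
    assert_eq!(got, url);
    let present = peers.lock().unwrap().contains_key(&got);
    present
}

fn main() {
    println!("peer present on disconnect: {}", peer_present_on_disconnect());
}
```

Because the map removal happens before the sender thread is spawned, a consumer that reacts to `Disconnected` by inspecting the map cannot observe the old peer entry in this sketch.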
ThetaSinner
left a comment
I think the change to `peer.rs` makes sense, but I don't think the change to `conn.rs` does. Does one work without the other? Dealing with the `ready` success/failure states is a bigger change.
neonphog
left a comment
I think it's worth spending a little time to attempt making the ready return more semantic results. If that turns out to be a rabbit hole, I'm pretty comfortable with these changes, maybe just adding a comment that closing the ready without somehow better indicating it is an error case is not ideal.
For the sake of not putting too much time into tx5 edge case issues, I'm going to keep in the This significantly reduces test flakes such that it shouldn't disrupt our productivity, so good enough to move forward.
… Disconnect event is always sent after peer is removed from peer_map
✔️ fbc8b30 - Conventional commits check succeeded.
This PR makes 2 changes to reduce flakes in the test `sending_second_message_after_remote_disconnect_succeeds_after_disconnect_event`.
This PR seems to eliminate the flakes completely. On `main` the test consistently failed within 10 test runs. In this PR the failure does not arise within 500 test runs with either backend. The test run that reproduced it was `sending_second_message_after_remote_disconnect_succeeds_after_disconnect_event`.
The changes are explained in inline comments.