Skip to content

Conversation

@kamilsa
Copy link
Collaborator

@kamilsa kamilsa commented Jan 21, 2026

No description provided.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds comprehensive trace logging throughout the gossip protocol implementation to aid in debugging and monitoring. The changes include adding a logger field to the Gossip class and inserting SL_TRACE statements at key points in the protocol flow.

Changes:

  • Added logger member to Gossip class for trace-level logging
  • Fixed race condition where inbound streams could arrive before peer connection events
  • Added extensive trace logging for connection lifecycle, message handling, mesh operations, and heartbeat diagnostics

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
include/libp2p/protocol/gossip/gossip.hpp Added logger member field to Gossip class
src/protocol/gossip/gossip.cpp Added trace logging throughout gossip protocol functions and fixed race condition in stream handling

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


// Handle IHAVE: select a capped subset of unknown IDs and enqueue IWANTs.
if (pb_message.control().ihave().size() > 0) {
SL_TRACE(logger_, "RX IHAVE count={} from={}", pb_message.control().ihave().size(), peer->peer_id_.toBase58());
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line has inconsistent indentation with an extra space. It should align with the surrounding code at the same indentation level.

Suggested change
SL_TRACE(logger_, "RX IHAVE count={} from={}", pb_message.control().ihave().size(), peer->peer_id_.toBase58());
SL_TRACE(logger_, "RX IHAVE count={} from={}", pb_message.control().ihave().size(), peer->peer_id_.toBase58());

Copilot uses AI. Check for mistakes.
Comment on lines 940 to 948
while (true) {
auto self = weak_self.lock();
if (not self) {
break;
}
auto message = qtils::optionTake(peer->batch_);
if (not message) {
break;
}
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The restructured loop now checks weak_self.lock() on every iteration, but the locked 'self' variable is never actually used within the loop body. The splitBatch function is a free function that doesn't require the Gossip instance, and all other operations work with the 'peer' shared_ptr that was already captured. This adds unnecessary overhead on each iteration. Consider removing the weak_self check from inside the loop, or only add it if 'self' is actually needed for some operation.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot open a new pull request to apply changes based on this feedback

// the on_peer_connected event fires (race condition). Create the peer
// entry now to avoid rejecting the stream.
SL_TRACE(logger_, "HANDLE creating missing peer entry for {}", peer_id.toBase58());
peer_it = peers_.emplace(peer_id, std::make_shared<Peer>(peer_id, false)).first;
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When creating a peer entry in handle() due to a race condition, the peer is always marked as inbound (out = false). However, it's possible that the underlying connection is actually outbound (we initiated it), and the inbound stream arrived before on_peer_connected fired. This could lead to incorrect mesh topology decisions since the gossip protocol uses the 'out' flag to maintain minimum outbound connections in the mesh and to make GRAFT acceptance decisions. Consider checking the stream's connection to determine the correct value for 'out', or deferring the peer creation until on_peer_connected if possible.

Suggested change
peer_it = peers_.emplace(peer_id, std::make_shared<Peer>(peer_id, false)).first;
// Determine whether this peer should be marked as outbound based on the
// underlying connection (we may have initiated the connection even
// though this is an inbound stream).
bool out = false;
if (auto conn = stream->connection()) {
// If we initiated the connection, this is an outbound peer.
if (conn->isInitiator()) {
out = true;
}
}
peer_it = peers_.emplace(peer_id, std::make_shared<Peer>(peer_id, out)).first;

Copilot uses AI. Check for mistakes.
Copy link

Copilot AI commented Jan 21, 2026

@kamilsa I've opened a new pull request, #23, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits January 21, 2026 10:44
Co-authored-by: kamilsa <9370151+kamilsa@users.noreply.github.com>
@kamilsa kamilsa enabled auto-merge (squash) January 21, 2026 05:59
@kamilsa kamilsa merged commit 04a68c6 into main Jan 21, 2026
2 checks passed
@kamilsa kamilsa deleted the tmp/gossip-logs branch January 21, 2026 06:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants