-
Notifications
You must be signed in to change notification settings - Fork 127
feat: Upstream Arc-related changes #1375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…or the same value * test: Add tests for mismatched polka certificates * fix: Check that we have a polka certificate whose value matches the re-proposal or precommit * fix: Check value of polka certificate against proposal when determining whether there is a polka for the current round * fix: Check value of polka certificate against proposal when determining whether there is a polka for a previous round * fix: Update doc comments for `multiplex_proposal` * chore: Rename `polka_certificate_at_round_and_value` to `polka_certificate` for consistency with `commit_certificate` method * test: Fix tests to check for expected correct behavior * chore: Rename tests and fix step description Co-authored-by: Daniel <daniel.cason@circle.com> * chore: Adapt `commit_certificate` for consistency * test: Simplify `no_prevote_for_proposal_with_polka_for_different_value` test * test: Simplify `no_prevote_for_invalid_proposal_with_polka_for_different_value` test * chore: Formatting * test: Fix doc for `no_prevote_for_invalid_proposal_with_polka_for_different_value` * chore: Remove unused `polka_certificates` driver method * chore: Improve error description --------- Co-authored-by: Daniel <daniel.cason@circle.com>
…tput on step change * driver: added failing test for multiplex_step_change * driver: also check for polkas in multiplex_step_change
…module Closes: #XXX Also increases the default maximum number of inbound and outbound peers to 50. --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX Add validation checks for incoming value requests to reject invalid or malicious requests early: - Empty ranges - Inverted ranges (start > end) - Ranges beyond our tip height - Ranges exceeding configured batch size Invalid requests now receive an empty response instead of being processed, preventing potential panics, resource exhaustion, or other undefined behavior from malformed peer requests. **Report:** `CS-CIRCLE-MALACHITE-001` **Story:** https://circlepay.atlassian.net/browse/CCHAIN-635 --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…s of peer
Closes: #XXX
```
DEBUG node{moniker=test-0}:network: Connected to peer peer=12D3KooWRp4nPfBam9poFjiN4nrmqeRPHXy5wMBwniQsLr1UaArm connection_id=2 remote_address=/ip4/172.19.0.22/tcp/27000
DEBUG node{moniker=test-0}:network: Connected to peer peer=12D3KooWBowhZZEzjCmVAV8aygVXdHXxaDLvhoYr7Y4G5mZnrjHP connection_id=3 remote_address=/ip4/172.19.0.23/tcp/27000
DEBUG node{moniker=test-0}:network: Connected to peer peer=12D3KooWPh7RweNdso737rpRkWzifbPpKfBDrn5rKdvXyKKRdsJV connection_id=4 remote_address=/ip4/172.19.0.24/tcp/27000
DEBUG node{moniker=test-0}:network: Accepted incoming connection from peer peer=12D3KooWPh7RweNdso737rpRkWzifbPpKfBDrn5rKdvXyKKRdsJV connection_id=6 remote_address=/ip4/172.19.0.24/tcp/36626
DEBUG node{moniker=test-0}:network: Accepted incoming connection from peer peer=12D3KooWBowhZZEzjCmVAV8aygVXdHXxaDLvhoYr7Y4G5mZnrjHP connection_id=5 remote_address=/ip4/172.19.0.23/tcp/42270
DEBUG node{moniker=test-0}:network: Accepted incoming connection from peer peer=12D3KooWRp4nPfBam9poFjiN4nrmqeRPHXy5wMBwniQsLr1UaArm connection_id=7 remote_address=/ip4/172.19.0.22/tcp/57286
DEBUG node{moniker=test-0}:network: Accepted incoming connection from peer peer=12D3KooWDPXmDV2jLxp6mskyQ7BhA9phN7ZYUwaHfjQ1teLtPR46 connection_id=8 remote_address=/ip4/172.19.0.21/tcp/44960
```
---
### PR author checklist
#### For all contributors
- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it
#### For external contributors
- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]
[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
… a polka certificate matching the POL round of a proposal with an invalid value Test units that reveals the following scenario: > Should we vote nil [here](https://github.com/circlefin/malachite/blob/b205f4252f3064d9a74716056f63834ff33f2de9/code/crates/core-driver/src/mux.rs#L281) if the certificate points to an invalid proposal (L32)? Also, realistically this should never happen, right? and the fix, which consist on producing a `RoundInput::InvalidProposalAndPolkaPrevious` consensus state-machine input when receiving a Polka certificate matching the POL round of a proposal with an invalid value. Notice that this is a very unlikely scenario, as this would mean that 2f + 1 processes have accepted the value in a previous round, while the current process rejects it as an invalid value. --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests --------- Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
…(Invalid --> Valid) Closes: #XXX This addressed the case when the application changes its mind about the validity of a value (e.g. because of a software upgrade). If the validity of a value is not `Valid`, and used to be `Invalid`, there are two code paths in `core-consensus` and the driver that won't update the value in the corresponding keeper. As a result, functions that hit those paths (e.g., `on_proposal`) will then give the false impression that a value, with its validity info has been stored. This has hit an `assert!` in the devnet. N.b.: We also removed some methods that prevented persisting ProposedValue inputs (from the application) to the application when duplicated. This had consequences when the validity changed, so -- to be on the safe side -- we are removing those optimizations and persisting every ProposedValue including its validity to the WAL. **Story:** https://circlepay.atlassian.net/browse/CCHAIN-658 --- - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests --------- Co-authored-by: Anca Zamfir <zamfiranca@gmail.com> Co-authored-by: Daniel <daniel.cason@circle.com> Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
… pending inputs Test unit confirming this scenario: > According to the spec, PolkaNil a Precommit for nil should be broadcasted whenever a quorum of nil prevotes are observed. However, `PrecommitAny` might shadow `PolkaNil` [here](https://github.com/circlefin/malachite/blob/b205f4252f3064d9a74716056f63834ff33f2de9/code/crates/core-driver/src/mux.rs#L397) (L44 / L47). This prevents the validator from sending out a nil precommit vote when it should. We currently believe the impact is minimal since the existence of `PrecommitAny` should be observable by all validators under the gossip communication model, even if the current validator skips `PolkaNil` broadcast The solution is not optimal, since it allows the production of multiple `PrecommitAny` inputs, therefore the `timeoutPrecommit` is scheduled multiple times per round, which is annoying but not per se an issue. The solution consist on having `find_non_value_threshold` returning a vector of thresholds, instead of a single one. We allow `PolkaNil` to shadow `PolkaAny`, but none of them is shadowed by `PrecommitAny`. --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests --------- Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
Closes: #XXX This PR introduces batch retrieval of decided values, replacing the single-height `GetDecidedValue` host/app message with a `GetDecidedValues` message that accepts a range of heights. - Replaced `GetDecidedValue` with `GetDecidedValues` throughout the codebase, now taking a range parameter - Added `HeightRangeExt` trait and `HeightRangeInclusiveIterator` for iterating over height ranges with `(height1..=height2).iter_heights()` - Modified the Sync actor to request ranges of decided values in a single call rather than making individual requests per height - Updated `AppMsg` and `HostMsg` to use the new batch API - Updated Starknet host actor to handle batch requests - Updated test app and example implementations to support the new API The previous implementation required one round-trip per decided value during sync operations, which could be inefficient when syncing multiple heights. This change allows fetching multiple decided values in a single request, improving sync performance.⚠️ The consensus engine does NOT validate the requested range. It is the application's responsibility to ensure the range is within valid bounds and handle edge cases appropriately. -- - **Story:** <https://circlepay.atlassian.net/browse/MAL-4> --- - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX Allow the app to choose the capacity of the request channel, whereby the app can send requests to consensus, e.g., to dump its state. **Story:** <https://circlepay.atlassian.net/browse/CCHAIN-504> --- - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…sabled
Closes: #XXX
With discovery disabled all peers were considered inbound, including the
ones we dialed out. Metrics would be something like these:
```
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_core_consensus_connected_peers"
# HELP malachitebft_core_consensus_connected_peers Number of connected peers, ie. for each consensus node, how many peers is it connected to.
# TYPE malachitebft_core_consensus_connected_peers gauge
malachitebft_core_consensus_connected_peers{moniker="app-0"} 16
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_inbound_peers"
# HELP malachitebft_discovery_num_inbound_peers Number of inbound peers.
# TYPE malachitebft_discovery_num_inbound_peers gauge
malachitebft_discovery_num_inbound_peers{moniker="app-0"} 16
```
After the fix:
```
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_core_consensus_connected_peers"
# HELP malachitebft_core_consensus_connected_peers Number of connected peers, ie. for each consensus node, how many peers is it connected to.
# TYPE malachitebft_core_consensus_connected_peers gauge
malachitebft_core_consensus_connected_peers{moniker="app-0"} 16
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_inbound_peers"
# HELP malachitebft_discovery_num_inbound_peers Number of inbound peers.
# TYPE malachitebft_discovery_num_inbound_peers gauge
malachitebft_discovery_num_inbound_peers{moniker="app-0"} 5
```
Notes:
A peer that is boh oubound and inbound will be counted as outbound only
Also fixes `malachitebft_discovery_num_outound_peers`
In addition the inbound/ outbound connection metric counts are wrong as
counting is based on peer classification rather than actual socket
direction.
```
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_outbound_connections"
# HELP malachitebft_discovery_num_outbound_connections Number of outbound connections.
# TYPE malachitebft_discovery_num_outbound_connections gauge
malachitebft_discovery_num_outbound_connections{moniker="app-0"} 29
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_inbound_connections"
# HELP malachitebft_discovery_num_inbound_connections Number of inbound connections.
# TYPE malachitebft_discovery_num_inbound_connections gauge
malachitebft_discovery_num_inbound_connections{moniker="app-0"} 1
```
After the fix:
```
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_inbound_connections"
# HELP malachitebft_discovery_num_inbound_connections Number of inbound connections.
# TYPE malachitebft_discovery_num_inbound_connections gauge
malachitebft_discovery_num_inbound_connections{moniker="app-0"} 15
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_outbound_connections"
# HELP malachitebft_discovery_num_outbound_connections Number of outbound connections.
# TYPE malachitebft_discovery_num_outbound_connections gauge
malachitebft_discovery_num_outbound_connections{moniker="app-0"} 15
```
---
### PR author checklist
#### For all contributors
- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it
#### For external contributors
- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]
[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…cking Closes: #XXX ## Summary Foundation for peer scoring monitoring (CCHAIN-667), circlefin#37 Adds detailed Prometheus metrics for tracking peer discovery, scoring, and Gossipsub mesh membership in the network layer. These metrics provide visibility into: - which peers are discovered and their scores - which peers are in the Gossipsub mesh for each topic - network topology and mesh health Previously, network metrics were limited to basic connection counts. This made it difficult to: - debug mesh formation issues (e.g., validators not connecting to each other) - monitor peer scoring behavior (introduced in PR circlefin#37) - identify network partitions - understand which peers are actively participating in consensus gossip ## Changes ### New Metrics #### `malachitebft_network_discovered_peers` Tracks peer scores for all discovered peers (limited to 100 concurrent peers via slot-based tracking). Scores are currently 0, will be enabled with circlefin#37 **Labels:** - `peer_id` - libP2P peer identifier - `peer_moniker` - human-readable peer name - `address` - peer's listen address - `peer_type` - either "persistent_peer" or "full_node". Will add "validator" in a future PR **Value:** peer score (integer), currently 0 #### 2. `malachitebft_network_peer_mesh_membership` Tracks per-peer, per-topic mesh membership status. **Labels:** - `peer_id` - libP2P peer identifier - `peer_moniker` - human-readable peer name - `topic` - gossipsub topic (e.g., "/consensus", "/proposal_parts", "/liveness") **Value:** `1` if peer is in mesh, `0` if not in mesh #### 3. `malachitebft_network_local_node_info` Identifies the local node's information (info-style metric). **Labels:** - `peer_id` - local node's LibP2P peer ID - `listen_addr` - local node's P2P listen address - `moniker` - local node's moniker **Value:** Always `1` ### Code **Peer information storage:** - created `PeerInfo` struct in `state.rs` to store peer data - created `LocalNodeInfo` struct in `state.rs` - `Metrics` maintains a limited `peer_to_slot` mapping for Prometheus **Notes** - **Slot-based tracking** - limits Prometheus cardinality to 100 peer slots to prevent unbounded metric growth - **Separate mesh membership metric** - avoids label growth by not encoding topics in peer score labels - **State-driven updates** - `State` orchestrates updates to both its internal data and metrics during periodic ticks **Stale data handling:** - disconnected peers retain their last known values - slot is freed when peer disconnects, allowing reuse ### Testing Wrote a rudimentary [test_manager.bash](https://gist.github.com/ancazamfir/c64addc5099e43029b2a212c82ffaf7b#file-test_manager-bash) to create a setup with `v` validators and `n` full nodes, stop, start restart one or more nodes. Also [check_mesh.py](https://gist.github.com/ancazamfir/c64addc5099e43029b2a212c82ffaf7b#file-check_mesh-py) script to analyse the mesh structure. ## Future - expose `PeerInfo` and `LocalNodeInfo` via RPC for programmatic access - add alerts for mesh partition detection --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests --------- Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
## Summary Note: Better reviewed after circlefin#36 is reviewed/ merged Implements application-specific peer scoring for Gossipsub to prioritize nodes based on their types, in mesh formation and maintenance. The scoring mechanism increases the chances that persistent preferentially connect to each other for consensus messages, while still allowing full nodes to participate in the network. Without peer scoring, Gossipsub uses a random mesh formation strategy where all peers have equal priority. This creates problems in mixed networks: - **Validator isolation**: Full nodes can fill meshes before persistent peers/ validators connect to each other - **Suboptimal topology**: Critical consensus messages may route through full nodes unnecessarily, or mesh partitions (eager push falls back to INEED/IWANT pull) Peer scoring solves this by assigning higher scores to persistent peers, making them preferred candidates for mesh membership. ## Implementation Details ### Peer Score Assignment **Score constants** (defined in `code/crates/network/src/lib.rs`): ```rust const PERSISTENT_PEER_SCORE: f64 = 100.0; // Validators: 10,000 after weight const FULL_NODE_SCORE: f64 = 4.0; // Full nodes: 400 after weight const UNKNOWN_PEER_SCORE: f64 = 0.0; // Before Identify completes ``` **Score lifecycle**: 1. **On connection**: peer receives `UNKNOWN_PEER_SCORE = 0.0` immediately to allow initial mesh formation 2. **After Identify**: score upgraded to either: - `PERSISTENT_PEER_SCORE = 100.0` (× 100 weight = 10,000) for validators - `FULL_NODE_SCORE = 4.0` (× 100 weight = 400) for full nodes ### Scoring Parameters **Application-specific scoring** (defined in `code/crates/network/src/behaviour.rs`): ```rust const APP_SPECIFIC_WEIGHT: f64 = 100.0; const OPPORTUNISTIC_GRAFT_THRESHOLD: f64 = 1000.0; ``` **Peer score thresholds**: - `opportunistic_graft_threshold: 1000.0` - peers above this are candidates for opportunistic grafting - `gossip_threshold: -500.0` - peers below this don't receive gossip - `publish_threshold: -1000.0` - peers below this can't publish - `graylist_threshold: -2000.0` - peers below this are completely ignored ### Opportunistic Grafting **Configuration**: ```rust .opportunistic_graft_ticks(3) // try every 3 heartbeats (3 seconds) .opportunistic_graft_peers(2) // graft 2 high-scored peers per cycle ``` **How it works**: - every 3 seconds, Gossipsub checks if the mesh can be imporved - if mesh is full (`mesh_size >= mesh_n`) and low-scored peers exist: - identify peers with scores below `opportunistic_graft_threshold` (1000) - find high-scored peers (>= 1000) not in mesh - prune low-scored peers and graft high-scored ones (max 2 per cycle) **Result**: over time, full nodes (score 400) are gradually replaced by persistent peers/ validators (score 10,000) ### Mesh Formation Behavior **Initial mesh (mesh_size < mesh_n)**: - no score checking - peers join in connection order - full nodes can fill validator meshes if they connect first **Full mesh (mesh_size >= mesh_n)**: - score checking active - opportunistic grafting replaces low-scored peers - nodes preferentially graft other high-scored peers Note: initial mesh formation is fast but potentially suboptiml. Opportunistic grafting gradually optimizes the mesh over time. ### Enable/Disable Peer Scoring **Config file** (`config.toml`): ```toml [consensus.p2p.protocol] type = "gossipsub" mesh_n = 6 mesh_n_high = 8 mesh_n_low = 4 mesh_outbound_min = 2 enable_peer_scoring = false # Default: false ``` overwrite with: ```bash export MALACHITE__CONSENSUS__P2P__PROTOCOL__ENABLE_PEER_SCORING=true ``` default is: `false` (disabled) ## Future Collect and analyse data from different setups. Understand behavior and issues better. More investigation into: - **Gossipsub Spec**: https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md - **Peer Scoring Spec**: https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md#peer-scoring - **libp2p-gossipsub Implementation**: https://docs.rs/libp2p-gossipsub/latest/libp2p_gossipsub/ --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests --------- Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
Closes: #XXX Signing and verifying proposal parts is the responsibility of the application, and therefore need not be enforced by Malachite Core. The `SignedProposalPart` and `sign_`/`verify_proposal_part` are not used anywhere in the core and can therefore be removed. --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…` trait Closes: #XXX We need this in Arc to sign over arbitrary bytestrings, (eg. block hash) when working with an abstract `SigningProvider`. --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX
Adds an INFO log every 5 seconds with the network peer info, hopefully
easy to srape
```
2025-12-04T12:52:42.354052Z INFO node{moniker=app-0}:network: Network peer state
Address, Moniker, PeerId, Mesh, Dir, Type, Score
/ip4/127.0.0.1/tcp/27000, app-0, 12D3KooWSzB5AAgv5NJ8mJhJMGSC9t1P4akqhGDeYfSDzW24Ao2i, [/consensus,/liveness,/proposal_parts], me
/ip4/127.0.0.1/tcp/27001, app-1, 12D3KooWNdxorf26eVJyBUZWHJSzBqfEGi7fkuw4u5grsYEtWti3, [/consensus,/liveness,/proposal_parts], outbound, persistent_peer, 0
/ip4/127.0.0.1/tcp/27002, app-2, 12D3KooWBjkfAYHwmqfgauprXn49UEDAaDpjWjfcnLL3rYmTsZm1, [/consensus,/liveness,/proposal_parts], outbound, persistent_peer, 0
/ip4/127.0.0.1/tcp/27003, app-3, 12D3KooWSSYmeYrWT9B51RrHcJZCVusEu1fx5xPFdVo6njSroi8u, [/consensus,/liveness,/proposal_parts], outbound, persistent_peer, 0
/ip4/127.0.0.1/tcp/27004, app-4, 12D3KooWSLSW7ULBgDJfMxLPvySrqKWdRjPCG3FyRUMYyUjXDC9u, [/consensus,/liveness,/proposal_parts], outbound, persistent_peer, 0
/ip4/127.0.0.1/tcp/27005, app-5, 12D3KooWA48ALMPrWsC2QyyBTQjA8urBgsybcyML9F9A6ZtwKSPi, [/consensus,/liveness,/proposal_parts], inbound, full_node, 0
/ip4/127.0.0.1/tcp/27006, app-6, 12D3KooWQ86ffXbt3ufWF7AwHCTssLArS95fwpCLpW5CGZyV3Atf, [/consensus,/liveness,/proposal_parts], inbound, full_node, 0
/ip4/127.0.0.1/tcp/27007, app-7, 12D3KooWLojPb3nva7Y7nPHfJZNDaAV2Tc4uwu6ZXyUf1EmnJC8b, [/consensus,/liveness,/proposal_parts], inbound, full_node, 0
/ip4/127.0.0.1/tcp/27008, app-8, 12D3KooWGAbZK4AotLRtBRK61Ve9Nb5RJCE7tGZVSjJ8AQcHq7QM, [/consensus,/liveness,/proposal_parts], inbound, full_node, 0
/ip4/127.0.0.1/tcp/27009, app-9, 12D3KooWJCMyDBVtDjqWQoMtCfHGGK2H5etoYM1ePZQFimLWHw52, [], inbound, full_node, 0
```
---
Added a check to reject a Sync config that * is enabled * sets the batch size to 0 This is coming from [this comment](https://github.com/circlefin/circle-chain-reth/pull/719/files#r2580390373) Closes: #XXX --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Story: https://circlepay.atlassian.net/browse/CCHAIN-775 Simple implementation: test infra revealing stale data, then fix on `move_to_height` method, clearing all needed fields. An alternative solution is implemented in circlefin#78: merge either this or that one!
When `true`, the node will ONLY connect to persistent peers and accept connection(s) from persistent peers. This option mimics Reth's `--trusted-only` CLI/config option. --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [x] Ensure the PR title follows the [conventional commits][conv-commits] spec - [x] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [x] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX --- - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…based application This is a pre-requisite for splitting the network and consensus identities. Best reviewed commit by commit. Story: https://circlepay.atlassian.net/browse/CCHAIN-802
Closes: #XXX
##
### Summary
Fix `unwrap()` in PeerId borsh deserialization, replacing with proper
error handling in `PeerId::deserialize_reader` to avoid panics when
deserializing malformed peer IDs.
### Error Propagation Flow
```
PeerId::deserialize_reader (peer/src/ser/borsh.rs)
│ returns borsh::io::Error with InvalidData kind
v
Status::deserialize_reader (sync/src/ser/borsh.rs:26)
│ propagates via ?
v
Codec::decode
│ returns codec error
v
Network actor (engine/src/network.rs:442-447)
│ catches error, logs, and drops message
v
Node continues running
```
### Security Impact
This prevents a malicious peer from crashing a node by sending an
invalid PeerId in sync messages. The malformed message is logged and
dropped while the node continues to operate.
---
### PR author checklist
#### For all contributors
- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it
#### For external contributors
- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]
[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX Fix `unwrap()` in Kademlia address registration. Skip adding peer to Kademlia routing table if listen_addrs is empty, instead of panicking. --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX
## Summary
This PR adds libp2p connection limits as a defense-in-depth layer for
resource protection. These limits are enforced at the libp2p layer,
before application-level discovery logic runs.
## Problem
The current discovery-based connection management only enforces limits
**after the Identify protocol completes**. This means an attacker can
consume resources during the connection establishment phase without
being limited.
### Potential Attack Scenarios
These limits help mitigate common connection-based attacks:
- **Handshake Flood**: Open many TCP connections, never complete Noise
handshake → `max_pending_incoming`
- **Slow Handshake**: Complete TCP, send Noise handshake bytes slowly →
`max_pending_incoming` + timeout
- **Pre-Identify Hold**: Complete handshake, never send Identify →
`max_established_incoming`
- **Rapid Reconnect**: Reconnect immediately after being closed →
`max_established_per_peer`
### Connection Lifecycle
```
TCP connect → Noise handshake → Yamux setup → Identify → Discovery logic
↑ ↑ ↑
│ │ └── Current enforcement (discovery)
└── connection_limits enforces here ──────┘
(pending) (established)
```
## Solution
Add `libp2p::connection_limits::Behaviour` to the network stack
The discovery crate's `num_inbound_peers` and `num_outbound_peers`
settings (default 20) remain the primary connection management
mechanism. The new connection limits serve as a safety net:
- Discovery limits: Application-level peer management, enforced after
Identify
- Connection limits: Transport-level protection, enforced during
handshake
The connection limits should be set higher than discovery limits to
allow headroom for ephemeral connections, persistent peers, and
connection churn.
## Testing
- Added unit test `connection_flood_attack()`
---
### PR author checklist
#### For all contributors
- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it
#### For external contributors
- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]
[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX `connect_request.rs::can_connect_request()`calls `controller.peer_request` instead of `controller.connect_request` --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Following similar approach to the one for dumping consensus state. Closes: #XXX --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests --------- Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
…P address Closes: #XXX ## Summary Adding a configurable `max_connections_per_ip` setting to limit the number of inbound connections from a single IP address. This helps prevent DoS attacks where an attacker generates many PeerIds from the same IP. ## Changes ### Configuration Added `max_connections_per_ip` to `DiscoveryConfig`: ```toml [consensus.p2p.discovery] # Maximum connections allowed per IP address. # Limits inbound connections from the same IP to prevent DoS attacks # where an attacker generates many PeerIds from the same IP. # Defaults to num_inbound_peers (effectively disabled). # max_connections_per_ip = 20 ``` **Environment variable**: `MALACHITE__CONSENSUS__P2P__DISCOVERY__MAX_CONNECTIONS_PER_IP` ### Default Behavior - Defaults to `num_inbound_peers`, which effectively disables the per-IP limit - This ensures backward compatibility so existing config files work without changes - To enable the limit, set a value lower than `num_inbound_peers` ### Implementation The enforcement is done in `crates/network/src/ip_limits.rs` via a custom libp2p `NetworkBehaviour`: - Tracks pending and established inbound connections per IP address - Rejects new inbound connections from an IP when the limit is reached - Only limits inbound connections, outbound connections are not tracked or limited ### Tests - Added `crates/network/test/tests/ip_limits.rs` with `per_ip_connection_limit` test - Before adding IP limits, in ab2d2ddf3bf361a93d03daa83a2d8ef29e400794, test fails with: ``` 2025-12-17T18:43:51.651163Z INFO ip_limits: Connected peers: 5 (expected max: 2, attempted: 5) thread 'same_ip_connection_attack' (22093699) panicked at crates/network/test/tests/ip_limits.rs:119:5: Per-IP limit not enforced: 5 connections exceeds limit of 2 ``` - After adding IP limits, in 94456495fdb009a7a758c12831d7d0b0f68fe670, test passes: ``` 2025-12-17T20:16:21.732310Z INFO ip_limits: Connected peers: 2 (expected max: 2, attempted: 5) 2025-12-17T20:16:21.732377Z INFO ip_limits: ✓ Per-IP limit enforced: 2 <= 2 test same_ip_connection_attack ... ok test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.72s ``` - Also tested on local setup Note: Updated `connection_flood.rs` test to explicitly set high IP limit and avoid interference ## Notes - Only **inbound** connections are limited; outbound connections are not affected - This is useful for preventing Sybil attacks from a single IP - Validators behind NAT sharing the same public IP may need a higher limit or explicit configuration - similarly, in local setups that use same IP address for each node, using a higher limit or default is recommended
…cle-chain-reth` These changes are a follow up of circlefin#99, and are coming from addressing some comments in circlefin/circle-chain-reth#768 Closes: #XXX --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…nnection from same peer
Closes: #XXX
## Summary
When receiving an inbound connection from a peer, the old code would
incorrectly cancel all pending outgoing dials to that same peer.
The logic was:
```rust
if self.controller.dial.remove_in_progress(&connection_id).is_none() {
// Intended as defensive cleanup, but triggered on every inbound connection
self.controller.dial_remove_matching_in_progress_connections(&peer_id);
}
```
For inbound connections, `connection_id` is never in our dial tracking
(we didn't initiate the dial), so `remove_in_progress` always returns
`None`, triggering the outgoing dial cleanup. This casues:
- Lost dial tracking: outgoing dial attempts are removed from
in_progress tracking
- Broken retries: if an outgoing dial later fails, retry logic can't
find the dial data
Instead we should only clean up the specific `connection_id` that
completed. For inbound connections, `remove_in_progress` returns `None`
and we don't need to do anything.
---
### PR author checklist
#### For all contributors
- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it
#### For external contributors
- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]
[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…nabled Closes: #XXX ## Summary Prevent address poisoning when discovery is enabled ## Problem [PR circlefin#1060](circlefin#1060) ("Connection upgrades and multiaddr issues") introduced a vulnerability where all addresses from `dial_data.listen_addrs` are registered in `done_on` when dialing a peer. A malicious peer can exploit this by: 1. Claiming victim addresses in their peer exchange response 2. We register those addresses as "done" when processing the malicious peer's info 3. Later, when real victim info arrives, their addresses are already in `done_on` 4. Victim becomes unreachable This only affects discovery-enabled nodes. When discovery is disabled, peer info comes from trusted config only. ## Fix For discovered peers (untrusted), only register `peer_id` in `done_on`, not addresses: Note: Peer-ID poisoning is still possible: attacker sends fake `(victim_peer_id, [garbage_addr])`. The victim's peer_id gets marked as "done" before we realize the info was wrong. This will be addressed in a future PR using signed peer records, where peer info is cryptographically signed by the peer's private key, preventing impersonation of other peer IDs. ## Testing ``` cargo test -p informalsystems-malachitebft-discovery -- controller::tests ``` Note: Unit tests verify the fix at the controller level. Proper end-to-end tests (simulating a malicious node injecting poisoned peer info, addresses or peer-id) require extending the network test framework to support byzantine fault injection at the discovery/peer-exchange layer. We will consider doing this in a future PR. --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…atus ticker The preliminary study on CCHAIN-491 (see Jira ticket) shows that the sending of status messages are _clustered_ even after 20 mins of testnet startup time. Therefore, we need to do something to change that situation. This PR contains the least invasive solution I could think of. Let us consider the the _ticker_ that controls the sending of the "Broadcasting status" messages. Rather than applying some (random?) jitter every time the ticker sleeps between two consecutive messages, we simply apply a **one-time** adaptation to the ticker's period, which we then keep constant for the rest of the process' lifetime. The adjustment is random per-node: it's a uniform random distribution in the interval [-ADJ, +ADJ]. TODO: Finish the measurements allowing to fine-tune the amount of the adjustment ADJ. Closes: #XXX --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests --------- Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
This PR prevents address spoofing by using actual connection addresses instead of self-reported addresses from the identify protocol. Problem Previously, `is_persistent_peer_by_address` used addresses from `identify::Info.listen_addrs`, which are self-reported by the remote peer. A malicious node could claim to be listening on a persistent peer's address. Solution Use the actual remote address from the established connection, which cannot be spoofed: - outbound connections: we initiate the dial to a known address, so the remote address is what we dial - inbound connections: the remote address is the actual source of the connection (may include ephemeral port even with port reuse) IMPORTANT The proper long-term solution is to configure persistent peers by PeerId rather than address. PeerId is cryptographically tied to the peer's identity and cannot be spoofed without stealing the private key. NOTE For discovered peers, address spoofing at source is already handled - we dial with `DialOpts::peer_id(peer_id)` which verifies the peer at the address has the expected PeerId. Relay attacks (third party advertising false addresses for a peer) are addressed by the peer records PR which requires signed address records. Persistent peer status Persistent peer status is granted when the connection's remote address matches a configured persistent peer address: - Outbound: We dial the configured address, so match is guaranteed for legitimate peers - Inbound with port reuse: Legitimate peer's source port matches their listen port, so match succeeds - Inbound without port reuse: Ephemeral source port does not match config, peer gets (re)classified as persistent on our subsequent outbound dial to the configured listen address Metric Duplicate Fix The changes above exposed a latent bug where inbound connections (ephemeral port) produce `is_persistent=false` while outbound connections produce `is_persistent=true` for the same validator peer. Both `PeerType` values have different `Hash` but produce the same Prometheus label "validator", causing duplicate metric entries. In `main` the classification happens immediately for peers that dial first (inbound first). Fixed by implementing `Hash` and `PartialEq` based on `primary_type_str()` to match `EncodeLabelValue` behavior.
…on and retry logic Prevent address spoofing vulnerabilities in bootstrap peer detection. - `update_bootstrap_node_peer_id` uses self-reported `identify::Info.listen_addrs` for inbound connections, allowing an attacker to claim they are listening on a bootstrap node's address and get their PeerId registered as that bootstrap node. - `handle_failed_connection` uses `dial_data.listen_addrs()` (which could come from discovery peer exchange) to determine if a failed dial was to a bootstrap node, potentially granting unlimited retries to malicious peers. - In `update_bootstrap_node_peer_id`, for inbound connections we now use the connection remote address from `self.connections` (TCP layer) instead of self-reported identify addresses. - for Outbound, use dial_data addresses (trusted since we initiated the connection) - for Inbound, use the remote address (trusted since it comes from the TCP layer) - Use explicit `is_bootstrap` flag for retry logic. Added `is_bootstrap` field to `DialData` that is set at creation time when dialing configured bootstrap nodes. `handle_failed_connection` now checks this flag instead of comparing addresses. --- - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
ams9198
previously approved these changes
Jan 20, 2026
Test around 500k values for the validator set's total power, and verifies whether the 1/3+ and 2/3+ thresholds are computed properly on a range of numbers around (total power/3). We verify the correctness of the threshold computations using different arithmetics, in particular for 2/3+, see details in code comments. For reviewers: the first commit allows for longer (~90s) pseudo-random tests.
…on any WAL-related error
…equiring signed proposal Summary When a node syncs a value from another peer, it receives a commit certificate and the proposed value. Previously, the consensus layer required a signed proposal to complete the decision flow, which meant we had to sign a proposal just to satisfy this requirement. This was wasteful (need to generate signatures for every single sync height) and unnecessary since the commit certificate already proves the value was decided by the network. This PR introduces a new `DecideOnSync` input path that allows sync-originated values to trigger consensus decisions without requiring a signed proposal. Changes The sync decision path now works as follows: 1. The commit certificate is verified and stored when the sync response is received, same as before 2. When the proposed value arrives with sync origin, an unsigned proposal is created internally 3. The driver processes `DecideOnSync` which routes the proposal through the state machine using `ProposalAndPrecommitValue` 4. The `decide()` function detects sync decisions by checking for a pre-existing commit certificate and skips the signed proposal lookups The presence of a commit certificate before `decide()` is called serves as the indicator that this is a sync decision, since in normal consensus the certificate is built from precommits inside `decide()` itself. --------- Co-authored-by: Daniel <daniel.cason@circle.com> Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
Move the inline code in the `Input::ValueResponse` match case to a dedicated function.
bb5d660 to
525a98c
Compare
…n peer exchange (circlefin#105) Closes: #XXX ## Summary The discovery peer exchange protocol is vulnerable to peer ID spoofing. Use signed peer records instead. ## Problem A malicious node could advertise fake `(peer_id, addresses)` tuples, causing honest nodes to: - Associate wrong addresses with legitimate peer IDs - Waste resources dialing unreachable addresses - Mark legitimate peers as "done" before reaching them ## Solution Use libp2p's signed peer records for peer exchange in discovery. Each peer signs their addresses with their private key, cryptographically binding peer ID to addresses. Note: This PR does not address spoofing for peer exchange. A peer can still sign addresses they don't control. Dial attempts to fake addresses fail naturally (`WrongPeerId` or timeout). Combined with peer scoring, repeat offenders could be penalized. This will be addressed in a separate PR. ## Notes on upgrade This in principle is a breaking change as the wire format is incompatible with the previous unsigned peer exchange. The recommended procedure: 1. Disable discovery on all nodes (`discovery.enabled = false`) - This is already the case in testnet, discovery was never enabled 2. Rolling upgrade all nodes to the new version 3. Enable discovery only after all nodes are upgraded --- ### PR author checklist #### For all contributors - [ ] Reference a GitHub issue - [ ] Ensure the PR title follows the [conventional commits][conv-commits] spec - [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if the change warrants it - [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if the change warrants it #### For external contributors - [ ] Maintainers of Malachite are [allowed to push changes to the branch][gh-edit-branch] [conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary [gh-edit-branch]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
ancazamfir
reviewed
Jan 22, 2026
Contributor
ancazamfir
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥳
ancazamfir
approved these changes
Jan 22, 2026
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Closes: #1364
Important
This PR must be merged and NOT squashed