Skip to content

Conversation

@romac
Copy link
Contributor

@romac romac commented Jan 20, 2026

Closes: #1364

Important

This PR must be merged and NOT squashed

romac and others added 30 commits January 20, 2026 08:35
…or the same value

* test: Add tests for mismatched polka certificates

* fix: Check that we have a polka certificate whose value matches the re-proposal or precommit

* fix: Check value of polka certificate against proposal when determining whether there is a polka for the current round

* fix: Check value of polka certificate against proposal when determining whether there is a polka for a previous round

* fix: Update doc comments for `multiplex_proposal`

* chore: Rename `polka_certificate_at_round_and_value` to `polka_certificate` for consistency with `commit_certificate` method

* test: Fix tests to check for expected correct behavior

* chore: Rename tests and fix step description

Co-authored-by: Daniel <daniel.cason@circle.com>

* chore: Adapt `commit_certificate` for consistency

* test: Simplify `no_prevote_for_proposal_with_polka_for_different_value` test

* test: Simplify `no_prevote_for_invalid_proposal_with_polka_for_different_value` test

* chore: Formatting

* test: Fix doc for `no_prevote_for_invalid_proposal_with_polka_for_different_value`

* chore: Remove unused `polka_certificates` driver method

* chore: Improve error description

---------

Co-authored-by: Daniel <daniel.cason@circle.com>
…tput on step change

* driver: added failing test for multiplex_step_change

* driver: also check for polkas in multiplex_step_change
…module

Closes: #XXX

Also increases the default maximum number of inbound and outbound peers
to 50.

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX

Add validation checks for incoming value requests to reject invalid or
malicious requests early:

- Empty ranges
- Inverted ranges (start > end)
- Ranges beyond our tip height
- Ranges exceeding configured batch size

Invalid requests now receive an empty response instead of being
processed, preventing potential panics, resource exhaustion, or other
undefined behavior from malformed peer requests.

**Report:** `CS-CIRCLE-MALACHITE-001`

**Story:** https://circlepay.atlassian.net/browse/CCHAIN-635

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…s of peer

Closes: #XXX

```
DEBUG node{moniker=test-0}:network: Connected to peer peer=12D3KooWRp4nPfBam9poFjiN4nrmqeRPHXy5wMBwniQsLr1UaArm connection_id=2 remote_address=/ip4/172.19.0.22/tcp/27000
DEBUG node{moniker=test-0}:network: Connected to peer peer=12D3KooWBowhZZEzjCmVAV8aygVXdHXxaDLvhoYr7Y4G5mZnrjHP connection_id=3 remote_address=/ip4/172.19.0.23/tcp/27000
DEBUG node{moniker=test-0}:network: Connected to peer peer=12D3KooWPh7RweNdso737rpRkWzifbPpKfBDrn5rKdvXyKKRdsJV connection_id=4 remote_address=/ip4/172.19.0.24/tcp/27000
DEBUG node{moniker=test-0}:network: Accepted incoming connection from peer peer=12D3KooWPh7RweNdso737rpRkWzifbPpKfBDrn5rKdvXyKKRdsJV connection_id=6 remote_address=/ip4/172.19.0.24/tcp/36626
DEBUG node{moniker=test-0}:network: Accepted incoming connection from peer peer=12D3KooWBowhZZEzjCmVAV8aygVXdHXxaDLvhoYr7Y4G5mZnrjHP connection_id=5 remote_address=/ip4/172.19.0.23/tcp/42270
DEBUG node{moniker=test-0}:network: Accepted incoming connection from peer peer=12D3KooWRp4nPfBam9poFjiN4nrmqeRPHXy5wMBwniQsLr1UaArm connection_id=7 remote_address=/ip4/172.19.0.22/tcp/57286
DEBUG node{moniker=test-0}:network: Accepted incoming connection from peer peer=12D3KooWDPXmDV2jLxp6mskyQ7BhA9phN7ZYUwaHfjQ1teLtPR46 connection_id=8 remote_address=/ip4/172.19.0.21/tcp/44960
```

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
… a polka certificate matching the POL round of a proposal with an invalid value

Test units that reveals the following scenario:

> Should we vote nil
[here](https://github.com/circlefin/malachite/blob/b205f4252f3064d9a74716056f63834ff33f2de9/code/crates/core-driver/src/mux.rs#L281)
if the certificate points to an invalid proposal (L32)? Also,
realistically this should never happen, right?

and the fix, which consist on producing a
`RoundInput::InvalidProposalAndPolkaPrevious` consensus state-machine
input when receiving a Polka certificate matching the POL round of a
proposal with an invalid value.

Notice that this is a very unlikely scenario, as this would mean that 2f
+ 1 processes have accepted the value in a previous round, while the
current process rejects it as an invalid value.

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests

---------

Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
…(Invalid --> Valid)

Closes: #XXX

This addressed the case when the application changes its mind about the
validity of a value (e.g. because of a software upgrade).

If the validity of a value is not `Valid`, and used to be `Invalid`,
there are two code paths in `core-consensus` and the driver that won't
update the value in the corresponding keeper.
As a result, functions that hit those paths (e.g., `on_proposal`) will
then give the false impression that a value, with its validity info has
been stored.

This has hit an `assert!` in the devnet.

N.b.: We also removed some methods that prevented persisting
ProposedValue inputs (from the application) to the application when
duplicated.
This had consequences when the validity changed, so -- to be on the safe
side -- we are removing those optimizations and persisting every
ProposedValue including its validity to the WAL.

**Story:** https://circlepay.atlassian.net/browse/CCHAIN-658

---

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests

---------

Co-authored-by: Anca Zamfir <zamfiranca@gmail.com>
Co-authored-by: Daniel <daniel.cason@circle.com>
Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
… pending inputs

Test unit confirming this scenario:

> According to the spec, PolkaNil a Precommit for nil should be
broadcasted whenever a quorum of nil prevotes are observed. However,
`PrecommitAny` might shadow `PolkaNil`
[here](https://github.com/circlefin/malachite/blob/b205f4252f3064d9a74716056f63834ff33f2de9/code/crates/core-driver/src/mux.rs#L397)
(L44 / L47). This prevents the validator from sending out a nil
precommit vote when it should. We currently believe the impact is
minimal since the existence of `PrecommitAny` should be observable by
all validators under the gossip communication model, even if the current
validator skips `PolkaNil` broadcast

The solution is not optimal, since it allows the production of multiple
`PrecommitAny` inputs, therefore the `timeoutPrecommit` is scheduled
multiple times per round, which is annoying but not per se an issue. The
solution consist on having `find_non_value_threshold` returning a vector
of thresholds, instead of a single one. We allow `PolkaNil` to shadow
`PolkaAny`, but none of them is shadowed by `PrecommitAny`.

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests

---------

Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
Closes: #XXX

This PR introduces batch retrieval of decided values, replacing the
single-height `GetDecidedValue` host/app message with a
`GetDecidedValues` message that accepts a range of heights.

- Replaced `GetDecidedValue` with `GetDecidedValues` throughout the
codebase, now taking a range parameter
- Added `HeightRangeExt` trait and `HeightRangeInclusiveIterator` for
iterating over height ranges with `(height1..=height2).iter_heights()`
- Modified the Sync actor to request ranges of decided values in a
single call rather than making individual requests per height

- Updated `AppMsg` and `HostMsg` to use the new batch API
- Updated Starknet host actor to handle batch requests
- Updated test app and example implementations to support the new API

The previous implementation required one round-trip per decided value
during sync operations, which could be inefficient when syncing multiple
heights. This change allows fetching multiple decided values in a single
request, improving sync performance.

⚠️ The consensus engine does NOT validate the requested range. It is the
application's responsibility to ensure the range is within valid bounds
and handle edge cases appropriately.

--

- **Story:** <https://circlepay.atlassian.net/browse/MAL-4>

---

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX

Allow the app to choose the capacity of the request channel, whereby the
app can send requests to consensus, e.g., to dump its state.

**Story:** <https://circlepay.atlassian.net/browse/CCHAIN-504>

---

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…sabled

Closes: #XXX

With discovery disabled all peers were considered inbound, including the
ones we dialed out. Metrics would be something like these:
```
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_core_consensus_connected_peers"
# HELP malachitebft_core_consensus_connected_peers Number of connected peers, ie. for each consensus node, how many peers is it connected to.
# TYPE malachitebft_core_consensus_connected_peers gauge
malachitebft_core_consensus_connected_peers{moniker="app-0"} 16

% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_inbound_peers"
# HELP malachitebft_discovery_num_inbound_peers Number of inbound peers.
# TYPE malachitebft_discovery_num_inbound_peers gauge
malachitebft_discovery_num_inbound_peers{moniker="app-0"} 16
```

After the fix:
```
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_core_consensus_connected_peers"
# HELP malachitebft_core_consensus_connected_peers Number of connected peers, ie. for each consensus node, how many peers is it connected to.
# TYPE malachitebft_core_consensus_connected_peers gauge
malachitebft_core_consensus_connected_peers{moniker="app-0"} 16

% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_inbound_peers"
# HELP malachitebft_discovery_num_inbound_peers Number of inbound peers.
# TYPE malachitebft_discovery_num_inbound_peers gauge
malachitebft_discovery_num_inbound_peers{moniker="app-0"} 5
```
Notes:
A peer that is boh oubound and inbound will be counted as outbound only
Also fixes `malachitebft_discovery_num_outound_peers`

In addition the inbound/ outbound connection metric counts are wrong as
counting is based on peer classification rather than actual socket
direction.

```
% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_outbound_connections"
# HELP malachitebft_discovery_num_outbound_connections Number of outbound connections.
# TYPE malachitebft_discovery_num_outbound_connections gauge
malachitebft_discovery_num_outbound_connections{moniker="app-0"} 29

% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_inbound_connections"
# HELP malachitebft_discovery_num_inbound_connections Number of inbound connections.
# TYPE malachitebft_discovery_num_inbound_connections gauge
malachitebft_discovery_num_inbound_connections{moniker="app-0"} 1
```
After the fix:
```
 % curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_inbound_connections"
# HELP malachitebft_discovery_num_inbound_connections Number of inbound connections.
# TYPE malachitebft_discovery_num_inbound_connections gauge
malachitebft_discovery_num_inbound_connections{moniker="app-0"} 15

% curl -s http://localhost:29000/metrics 2>/dev/null | grep "malachitebft_discovery_num_outbound_connections"
# HELP malachitebft_discovery_num_outbound_connections Number of outbound connections.
# TYPE malachitebft_discovery_num_outbound_connections gauge
malachitebft_discovery_num_outbound_connections{moniker="app-0"} 15
```
---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…cking

Closes: #XXX

## Summary
Foundation for peer scoring monitoring (CCHAIN-667), circlefin#37
Adds detailed Prometheus metrics for tracking peer discovery, scoring,
and Gossipsub mesh membership in the network layer. These metrics
provide visibility into:
- which peers are discovered and their scores
- which peers are in the Gossipsub mesh for each topic
- network topology and mesh health

Previously, network metrics were limited to basic connection counts.
This made it difficult to:
- debug mesh formation issues (e.g., validators not connecting to each
other)
- monitor peer scoring behavior (introduced in PR circlefin#37)
- identify network partitions
- understand which peers are actively participating in consensus gossip

## Changes

### New Metrics

#### `malachitebft_network_discovered_peers`
Tracks peer scores for all discovered peers (limited to 100 concurrent
peers via slot-based tracking). Scores are currently 0, will be enabled
with circlefin#37

**Labels:**
- `peer_id` - libP2P peer identifier
- `peer_moniker` - human-readable peer name
- `address` - peer's listen address
- `peer_type` - either "persistent_peer" or "full_node". Will add
"validator" in a future PR

**Value:** peer score (integer), currently 0

#### 2. `malachitebft_network_peer_mesh_membership`
Tracks per-peer, per-topic mesh membership status.

**Labels:**
- `peer_id` - libP2P peer identifier
- `peer_moniker` - human-readable peer name
- `topic` - gossipsub topic (e.g., "/consensus", "/proposal_parts",
"/liveness")

**Value:** `1` if peer is in mesh, `0` if not in mesh

#### 3. `malachitebft_network_local_node_info`
Identifies the local node's information (info-style metric).

**Labels:**
- `peer_id` - local node's LibP2P peer ID
- `listen_addr` - local node's P2P listen address
- `moniker` - local node's moniker

**Value:** Always `1`

### Code

**Peer information storage:**
- created `PeerInfo` struct in `state.rs` to store peer data
- created `LocalNodeInfo` struct in `state.rs`
- `Metrics` maintains a limited `peer_to_slot` mapping for Prometheus

**Notes**
- **Slot-based tracking** - limits Prometheus cardinality to 100 peer
slots to prevent unbounded metric growth
- **Separate mesh membership metric** - avoids label growth by not
encoding topics in peer score labels
- **State-driven updates** - `State` orchestrates updates to both its
internal data and metrics during periodic ticks

**Stale data handling:**
- disconnected peers retain their last known values
- slot is freed when peer disconnects, allowing reuse

### Testing
Wrote a rudimentary
[test_manager.bash](https://gist.github.com/ancazamfir/c64addc5099e43029b2a212c82ffaf7b#file-test_manager-bash)
to create a setup with `v` validators and `n` full nodes, stop, start
restart one or more nodes.

Also
[check_mesh.py](https://gist.github.com/ancazamfir/c64addc5099e43029b2a212c82ffaf7b#file-check_mesh-py)
script to analyse the mesh structure.

## Future

- expose `PeerInfo` and `LocalNodeInfo` via RPC for programmatic access
- add alerts for mesh partition detection

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests

---------

Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
## Summary

Note: Better reviewed after circlefin#36 is reviewed/ merged

Implements application-specific peer scoring for Gossipsub to prioritize
nodes based on their types, in mesh formation and maintenance. The
scoring mechanism increases the chances that persistent preferentially
connect to each other for consensus messages, while still allowing full
nodes to participate in the network.

Without peer scoring, Gossipsub uses a random mesh formation strategy
where all peers have equal priority. This creates problems in mixed
networks:

- **Validator isolation**: Full nodes can fill meshes before persistent
peers/ validators connect to each other
- **Suboptimal topology**: Critical consensus messages may route through
full nodes unnecessarily, or mesh partitions (eager push falls back to
INEED/IWANT pull)

Peer scoring solves this by assigning higher scores to persistent peers,
making them preferred candidates for mesh membership.

## Implementation Details

### Peer Score Assignment

**Score constants** (defined in `code/crates/network/src/lib.rs`):
```rust
const PERSISTENT_PEER_SCORE: f64 = 100.0;  // Validators: 10,000 after weight
const FULL_NODE_SCORE: f64 = 4.0;           // Full nodes: 400 after weight
const UNKNOWN_PEER_SCORE: f64 = 0.0;        // Before Identify completes
```

**Score lifecycle**:
1. **On connection**: peer receives `UNKNOWN_PEER_SCORE = 0.0`
immediately to allow initial mesh formation
2. **After Identify**: score upgraded to either:
- `PERSISTENT_PEER_SCORE = 100.0` (× 100 weight = 10,000) for validators
   - `FULL_NODE_SCORE = 4.0` (× 100 weight = 400) for full nodes

### Scoring Parameters

**Application-specific scoring** (defined in
`code/crates/network/src/behaviour.rs`):
```rust
const APP_SPECIFIC_WEIGHT: f64 = 100.0;
const OPPORTUNISTIC_GRAFT_THRESHOLD: f64 = 1000.0;
```

**Peer score thresholds**:
- `opportunistic_graft_threshold: 1000.0` - peers above this are
candidates for opportunistic grafting
- `gossip_threshold: -500.0` - peers below this don't receive gossip
- `publish_threshold: -1000.0` - peers below this can't publish
- `graylist_threshold: -2000.0` - peers below this are completely
ignored

### Opportunistic Grafting

**Configuration**:
```rust
.opportunistic_graft_ticks(3)  // try every 3 heartbeats (3 seconds)
.opportunistic_graft_peers(2)   // graft 2 high-scored peers per cycle
```

**How it works**:
- every 3 seconds, Gossipsub checks if the mesh can be imporved
- if mesh is full (`mesh_size >= mesh_n`) and low-scored peers exist:
- identify peers with scores below `opportunistic_graft_threshold`
(1000)
   - find high-scored peers (>= 1000) not in mesh
   - prune low-scored peers and graft high-scored ones (max 2 per cycle)

**Result**: over time, full nodes (score 400) are gradually replaced by
persistent peers/ validators (score 10,000)

### Mesh Formation Behavior

**Initial mesh (mesh_size < mesh_n)**:
- no score checking
- peers join in connection order
- full nodes can fill validator meshes if they connect first

**Full mesh (mesh_size >= mesh_n)**:
- score checking active
- opportunistic grafting replaces low-scored peers
- nodes preferentially graft other high-scored peers

Note: initial mesh formation is fast but potentially suboptiml.
Opportunistic grafting gradually optimizes the mesh over time.

### Enable/Disable Peer Scoring

**Config file** (`config.toml`):
```toml
[consensus.p2p.protocol]
type = "gossipsub"
mesh_n = 6
mesh_n_high = 8
mesh_n_low = 4
mesh_outbound_min = 2
enable_peer_scoring = false  # Default: false
```
overwrite with:
```bash
export MALACHITE__CONSENSUS__P2P__PROTOCOL__ENABLE_PEER_SCORING=true
```
default is: `false` (disabled)

## Future
Collect and analyse data from different setups. Understand behavior and
issues better.
More investigation into:
- **Gossipsub Spec**:
https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md
- **Peer Scoring Spec**:
https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md#peer-scoring
- **libp2p-gossipsub Implementation**:
https://docs.rs/libp2p-gossipsub/latest/libp2p_gossipsub/

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests

---------

Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
Closes: #XXX

Signing and verifying proposal parts is the responsibility of the
application, and therefore need not be enforced by Malachite Core. The
`SignedProposalPart` and `sign_`/`verify_proposal_part` are not used
anywhere in the core and can therefore be removed.

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…` trait

Closes: #XXX

We need this in Arc to sign over arbitrary bytestrings, (eg. block hash)
when working with an abstract `SigningProvider`.

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX
Adds an INFO log every 5 seconds with the network peer info, hopefully
easy to srape

```
2025-12-04T12:52:42.354052Z  INFO node{moniker=app-0}:network: Network peer state
Address, Moniker, PeerId, Mesh, Dir, Type, Score
/ip4/127.0.0.1/tcp/27000, app-0, 12D3KooWSzB5AAgv5NJ8mJhJMGSC9t1P4akqhGDeYfSDzW24Ao2i, [/consensus,/liveness,/proposal_parts], me
/ip4/127.0.0.1/tcp/27001, app-1, 12D3KooWNdxorf26eVJyBUZWHJSzBqfEGi7fkuw4u5grsYEtWti3, [/consensus,/liveness,/proposal_parts], outbound, persistent_peer, 0
/ip4/127.0.0.1/tcp/27002, app-2, 12D3KooWBjkfAYHwmqfgauprXn49UEDAaDpjWjfcnLL3rYmTsZm1, [/consensus,/liveness,/proposal_parts], outbound, persistent_peer, 0
/ip4/127.0.0.1/tcp/27003, app-3, 12D3KooWSSYmeYrWT9B51RrHcJZCVusEu1fx5xPFdVo6njSroi8u, [/consensus,/liveness,/proposal_parts], outbound, persistent_peer, 0
/ip4/127.0.0.1/tcp/27004, app-4, 12D3KooWSLSW7ULBgDJfMxLPvySrqKWdRjPCG3FyRUMYyUjXDC9u, [/consensus,/liveness,/proposal_parts], outbound, persistent_peer, 0
/ip4/127.0.0.1/tcp/27005, app-5, 12D3KooWA48ALMPrWsC2QyyBTQjA8urBgsybcyML9F9A6ZtwKSPi, [/consensus,/liveness,/proposal_parts], inbound, full_node, 0
/ip4/127.0.0.1/tcp/27006, app-6, 12D3KooWQ86ffXbt3ufWF7AwHCTssLArS95fwpCLpW5CGZyV3Atf, [/consensus,/liveness,/proposal_parts], inbound, full_node, 0
/ip4/127.0.0.1/tcp/27007, app-7, 12D3KooWLojPb3nva7Y7nPHfJZNDaAV2Tc4uwu6ZXyUf1EmnJC8b, [/consensus,/liveness,/proposal_parts], inbound, full_node, 0
/ip4/127.0.0.1/tcp/27008, app-8, 12D3KooWGAbZK4AotLRtBRK61Ve9Nb5RJCE7tGZVSjJ8AQcHq7QM, [/consensus,/liveness,/proposal_parts], inbound, full_node, 0
/ip4/127.0.0.1/tcp/27009, app-9, 12D3KooWJCMyDBVtDjqWQoMtCfHGGK2H5etoYM1ePZQFimLWHw52, [], inbound, full_node, 0
```
---
Added a check to reject a Sync config that
* is enabled
* sets the batch size to 0

This is coming from [this
comment](https://github.com/circlefin/circle-chain-reth/pull/719/files#r2580390373)

Closes: #XXX

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Story: https://circlepay.atlassian.net/browse/CCHAIN-775

Simple implementation: test infra revealing stale data, then fix on
`move_to_height` method, clearing all needed fields.

An alternative solution is implemented in circlefin#78: merge either this or that
one!
When `true`, the node will ONLY connect to persistent peers and accept
connection(s) from persistent peers.

This option mimics Reth's `--trusted-only` CLI/config option.

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [x] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [x] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [x] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX

---

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…based application

This is a pre-requisite for splitting the network and consensus
identities.

Best reviewed commit by commit.

Story: https://circlepay.atlassian.net/browse/CCHAIN-802
Closes: #XXX

##

### Summary
Fix `unwrap()` in PeerId borsh deserialization, replacing with proper
error handling in `PeerId::deserialize_reader` to avoid panics when
deserializing malformed peer IDs.

### Error Propagation Flow

```
PeerId::deserialize_reader (peer/src/ser/borsh.rs)
    │ returns borsh::io::Error with InvalidData kind
    v
Status::deserialize_reader (sync/src/ser/borsh.rs:26)
    │ propagates via ?
    v
Codec::decode
    │ returns codec error
    v
Network actor (engine/src/network.rs:442-447)
    │ catches error, logs, and drops message
    v
Node continues running
```
### Security Impact
This prevents a malicious peer from crashing a node by sending an
invalid PeerId in sync messages. The malformed message is logged and
dropped while the node continues to operate.

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX

Fix `unwrap()` in Kademlia address registration. Skip adding peer to
Kademlia routing table if listen_addrs is empty, instead of panicking.

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX

## Summary

This PR adds libp2p connection limits as a defense-in-depth layer for
resource protection. These limits are enforced at the libp2p layer,
before application-level discovery logic runs.

## Problem

The current discovery-based connection management only enforces limits
**after the Identify protocol completes**. This means an attacker can
consume resources during the connection establishment phase without
being limited.

### Potential Attack Scenarios

These limits help mitigate common connection-based attacks:

- **Handshake Flood**: Open many TCP connections, never complete Noise
handshake → `max_pending_incoming`
- **Slow Handshake**: Complete TCP, send Noise handshake bytes slowly →
`max_pending_incoming` + timeout
- **Pre-Identify Hold**: Complete handshake, never send Identify →
`max_established_incoming`
- **Rapid Reconnect**: Reconnect immediately after being closed →
`max_established_per_peer`

### Connection Lifecycle

```
TCP connect → Noise handshake → Yamux setup → Identify → Discovery logic
              ↑                              ↑          ↑
              │                              │          └── Current enforcement (discovery)
              └── connection_limits enforces here ──────┘
                  (pending)                  (established)
```

## Solution

Add `libp2p::connection_limits::Behaviour` to the network stack

The discovery crate's `num_inbound_peers` and `num_outbound_peers`
settings (default 20) remain the primary connection management
mechanism. The new connection limits serve as a safety net:

- Discovery limits: Application-level peer management, enforced after
Identify
- Connection limits: Transport-level protection, enforced during
handshake

The connection limits should be set higher than discovery limits to
allow headroom for ephemeral connections, persistent peers, and
connection churn.

## Testing
- Added unit test `connection_flood_attack()`

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Closes: #XXX

`connect_request.rs::can_connect_request()`calls
`controller.peer_request` instead of `controller.connect_request`

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
sergio-mena and others added 13 commits January 20, 2026 09:56
Following similar approach to the one for dumping consensus state.

Closes: #XXX

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests

---------

Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
…P address

Closes: #XXX

## Summary
Adding a configurable `max_connections_per_ip` setting to limit the
number of inbound connections from a single IP address. This helps
prevent DoS attacks where an attacker generates many PeerIds from the
same IP.

## Changes

### Configuration

Added `max_connections_per_ip` to `DiscoveryConfig`:

```toml
[consensus.p2p.discovery]
# Maximum connections allowed per IP address.
# Limits inbound connections from the same IP to prevent DoS attacks
# where an attacker generates many PeerIds from the same IP.
# Defaults to num_inbound_peers (effectively disabled).
# max_connections_per_ip = 20
```

**Environment variable**:
`MALACHITE__CONSENSUS__P2P__DISCOVERY__MAX_CONNECTIONS_PER_IP`

### Default Behavior

- Defaults to `num_inbound_peers`, which effectively disables the per-IP
limit
- This ensures backward compatibility so existing config files work
without changes
- To enable the limit, set a value lower than `num_inbound_peers`

### Implementation

The enforcement is done in `crates/network/src/ip_limits.rs` via a
custom libp2p `NetworkBehaviour`:
- Tracks pending and established inbound connections per IP address
- Rejects new inbound connections from an IP when the limit is reached
- Only limits inbound connections, outbound connections are not tracked
or limited

### Tests

- Added `crates/network/test/tests/ip_limits.rs` with
`per_ip_connection_limit` test
- Before adding IP limits, in ab2d2ddf3bf361a93d03daa83a2d8ef29e400794,
test fails with:
```
2025-12-17T18:43:51.651163Z  INFO ip_limits: Connected peers: 5 (expected max: 2, attempted: 5)

thread 'same_ip_connection_attack' (22093699) panicked at crates/network/test/tests/ip_limits.rs:119:5:
Per-IP limit not enforced: 5 connections exceeds limit of 2
```

- After adding IP limits, in 94456495fdb009a7a758c12831d7d0b0f68fe670,
test passes:

```
  2025-12-17T20:16:21.732310Z  INFO ip_limits: Connected peers: 2 (expected max: 2, attempted: 5)
2025-12-17T20:16:21.732377Z  INFO ip_limits: ✓ Per-IP limit enforced: 2 <= 2
test same_ip_connection_attack ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.72s
```
- Also tested on local setup

Note: Updated `connection_flood.rs` test to explicitly set high IP limit
and avoid interference

## Notes

- Only **inbound** connections are limited; outbound connections are not
affected
- This is useful for preventing Sybil attacks from a single IP
- Validators behind NAT sharing the same public IP may need a higher
limit or explicit configuration
- similarly, in local setups that use same IP address for each node,
using a higher limit or default is recommended
…cle-chain-reth`

These changes are a follow up of circlefin#99, and are coming from addressing
some comments in circlefin/circle-chain-reth#768

Closes: #XXX

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…nnection from same peer

Closes: #XXX

## Summary

When receiving an inbound connection from a peer, the old code would
incorrectly cancel all pending outgoing dials to that same peer.

The logic was:

```rust
if self.controller.dial.remove_in_progress(&connection_id).is_none() {
    // Intended as defensive cleanup, but triggered on every inbound connection
    self.controller.dial_remove_matching_in_progress_connections(&peer_id);
}
```

For inbound connections, `connection_id` is never in our dial tracking
(we didn't initiate the dial), so `remove_in_progress` always returns
`None`, triggering the outgoing dial cleanup. This casues:

- Lost dial tracking: outgoing dial attempts are removed from
in_progress tracking
- Broken retries: if an outgoing dial later fails, retry logic can't
find the dial data

Instead we should only clean up the specific `connection_id` that
completed. For inbound connections, `remove_in_progress` returns `None`
and we don't need to do anything.

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…nabled

Closes: #XXX

## Summary

Prevent address poisoning when discovery is enabled

## Problem

[PR circlefin#1060](circlefin#1060)
("Connection upgrades and multiaddr issues") introduced a vulnerability
where all addresses from `dial_data.listen_addrs` are registered in
`done_on` when dialing a peer.

A malicious peer can exploit this by:
1. Claiming victim addresses in their peer exchange response
2. We register those addresses as "done" when processing the malicious
peer's info
3. Later, when real victim info arrives, their addresses are already in
`done_on`
4. Victim becomes unreachable

This only affects discovery-enabled nodes. When discovery is disabled,
peer info comes from trusted config only.

## Fix

For discovered peers (untrusted), only register `peer_id` in `done_on`,
not addresses:

Note:
Peer-ID poisoning is still possible: attacker sends fake
`(victim_peer_id, [garbage_addr])`. The victim's peer_id gets marked as
"done" before we realize the info was wrong.

This will be addressed in a future PR using signed peer records, where
peer info is cryptographically signed by the peer's private key,
preventing impersonation of other peer IDs.

## Testing

```
cargo test -p informalsystems-malachitebft-discovery -- controller::tests
```

Note: Unit tests verify the fix at the controller level. Proper
end-to-end tests (simulating a malicious node injecting poisoned peer
info, addresses or peer-id) require extending the network test framework
to support byzantine fault injection at the discovery/peer-exchange
layer. We will consider doing this in a future PR.

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
…atus ticker

The preliminary study on CCHAIN-491 (see Jira ticket) shows that the
sending of status messages are _clustered_ even after 20 mins of testnet
startup time.

Therefore, we need to do something to change that situation.

This PR contains the least invasive solution I could think of.
Let us consider the the _ticker_ that controls the sending of the
"Broadcasting status" messages.
Rather than applying some (random?) jitter every time the ticker sleeps
between two consecutive messages, we simply apply a **one-time**
adaptation to the ticker's period, which we then keep constant for the
rest of the process' lifetime.

The adjustment is random per-node: it's a uniform random distribution in
the interval [-ADJ, +ADJ].

TODO: Finish the measurements allowing to fine-tune the amount of the
adjustment ADJ.

Closes: #XXX

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests

---------

Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
This PR prevents address spoofing by using actual connection addresses
instead of self-reported addresses from the identify protocol.

Problem

Previously, `is_persistent_peer_by_address` used addresses from
`identify::Info.listen_addrs`, which are self-reported by the remote
peer. A malicious node could claim to be listening on a persistent
peer's address.

Solution

Use the actual remote address from the established connection, which
cannot be spoofed:

- outbound connections: we initiate the dial to a known address, so the
remote address is what we dial
- inbound connections: the remote address is the actual source of the
connection (may include ephemeral port even with port reuse)

IMPORTANT
The proper long-term solution is to configure persistent peers by
PeerId rather than address. PeerId is cryptographically tied to the
peer's identity and cannot be spoofed without stealing the private key.

NOTE
For discovered peers, address spoofing at source is already handled -
we dial with `DialOpts::peer_id(peer_id)` which verifies the peer at the
address has the expected PeerId. Relay attacks (third party advertising
false addresses for a peer) are addressed by the peer records PR which
requires signed address records.

Persistent peer status

Persistent peer status is granted when the connection's remote address
matches a configured persistent peer address:
- Outbound: We dial the configured address, so match is guaranteed for
legitimate peers
- Inbound with port reuse: Legitimate peer's source port matches their
listen port, so match succeeds
- Inbound without port reuse: Ephemeral source port does not match
config, peer gets (re)classified as persistent on our subsequent
outbound dial to the configured listen address

Metric Duplicate Fix

The changes above exposed a latent bug where inbound connections
(ephemeral port) produce `is_persistent=false` while outbound
connections produce `is_persistent=true` for the same validator peer.
Both `PeerType` values have different `Hash` but produce the same
Prometheus label "validator", causing duplicate metric entries.
In `main` the classification happens immediately for peers that dial
first (inbound first).

Fixed by implementing `Hash` and `PartialEq` based on
`primary_type_str()` to match `EncodeLabelValue` behavior.
…on and retry logic

Prevent address spoofing vulnerabilities in bootstrap peer detection.

- `update_bootstrap_node_peer_id` uses self-reported
`identify::Info.listen_addrs` for inbound connections, allowing an
attacker to claim they are listening on a bootstrap node's address and
get their PeerId registered as that bootstrap node.

- `handle_failed_connection` uses `dial_data.listen_addrs()` (which
could come from discovery peer exchange) to determine if a failed dial
was to a bootstrap node, potentially granting unlimited retries to
malicious peers.

- In `update_bootstrap_node_peer_id`, for inbound connections we now use
the connection remote address from `self.connections` (TCP layer)
instead of self-reported identify addresses.

- for Outbound, use dial_data addresses (trusted since we initiated the
connection)
- for Inbound, use the remote address (trusted since it comes from the
TCP layer)

- Use explicit `is_bootstrap` flag for retry logic. Added `is_bootstrap`
field to `DialData` that is set at creation time when dialing configured
bootstrap nodes. `handle_failed_connection` now checks this flag instead
of comparing addresses.

---

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
@romac romac requested a review from ancazamfir as a code owner January 20, 2026 17:38
ams9198
ams9198 previously approved these changes Jan 20, 2026
cason and others added 3 commits January 22, 2026 08:55
Test around 500k values for the validator set's total power, and
verifies whether the 1/3+ and 2/3+ thresholds are computed properly on a
range of numbers around (total power/3). We verify the correctness of
the threshold computations using different arithmetics, in particular
for 2/3+, see details in code comments.

For reviewers: the first commit allows for longer (~90s) pseudo-random
tests.
romac and others added 2 commits January 22, 2026 09:07
…equiring signed proposal

Summary

When a node syncs a value from another peer, it receives a commit
certificate and the proposed value. Previously, the consensus layer
required a signed proposal to complete the decision flow, which meant we
had to sign a proposal just to satisfy this requirement. This was
wasteful (need to generate signatures for every single sync height) and
unnecessary since the commit certificate already proves the value was
decided by the network.

This PR introduces a new `DecideOnSync` input path that allows
sync-originated values to trigger consensus decisions without requiring
a signed proposal.

Changes

The sync decision path now works as follows:

1. The commit certificate is verified and stored when the sync response
is received, same as before
2. When the proposed value arrives with sync origin, an unsigned
proposal is created internally
3. The driver processes `DecideOnSync` which routes the proposal through
the state machine using `ProposalAndPrecommitValue`
4. The `decide()` function detects sync decisions by checking for a
pre-existing commit certificate and skips the signed proposal lookups

The presence of a commit certificate before `decide()` is called serves
as the indicator that this is a sync decision, since in normal consensus
the certificate is built from precommits inside `decide()` itself.

---------

Co-authored-by: Daniel <daniel.cason@circle.com>
Co-authored-by: Romain Ruetschi <romain.ruetschi@circle.com>
@romac romac enabled auto-merge January 22, 2026 14:45
Move the inline code in the `Input::ValueResponse` match case to a
dedicated function.
…n peer exchange (circlefin#105)

Closes: #XXX

## Summary
The discovery peer exchange protocol is vulnerable to peer ID spoofing.
Use signed peer records instead.

## Problem
A malicious node could advertise fake `(peer_id, addresses)` tuples,
causing honest nodes to:

- Associate wrong addresses with legitimate peer IDs
- Waste resources dialing unreachable addresses
- Mark legitimate peers as "done" before reaching them

## Solution

Use libp2p's signed peer records for peer exchange in discovery. Each
peer signs their addresses with their private key, cryptographically
binding peer ID to addresses.

Note: This PR does not address spoofing for peer exchange. A peer can
still sign addresses they don't control. Dial attempts to fake addresses
fail naturally (`WrongPeerId` or timeout). Combined with peer scoring,
repeat offenders could be penalized. This will be addressed in a
separate PR.

## Notes on upgrade
This in principle is a breaking change as the wire format is
incompatible with the previous unsigned peer exchange.
The recommended procedure:

1. Disable discovery on all nodes (`discovery.enabled = false`) - This
is already the case in testnet, discovery was never enabled
2. Rolling upgrade all nodes to the new version
3. Enable discovery only after all nodes are upgraded

---

### PR author checklist

#### For all contributors

- [ ] Reference a GitHub issue
- [ ] Ensure the PR title follows the [conventional
commits][conv-commits] spec
- [ ] Add a release note in [`RELEASE_NOTES.md`](/RELEASE_NOTES.md) if
the change warrants it
- [ ] Add an entry in [`BREAKING_CHANGES.md`](/BREAKING_CHANGES.md) if
the change warrants it

#### For external contributors

- [ ] Maintainers of Malachite are [allowed to push changes to the
branch][gh-edit-branch]

[conv-commits]: https://www.conventionalcommits.org/en/v1.0.0/#summary
[gh-edit-branch]:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
@circlefin circlefin deleted a comment from ancazamfir Jan 22, 2026
Copy link
Contributor

@ancazamfir ancazamfir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥳

@romac romac merged commit 2b4a3c6 into circlefin:main Jan 22, 2026
15 of 16 checks passed
@romac romac deleted the public-upstream branch January 22, 2026 17:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Upstream Arc-related changes

5 participants