Harden shred reconstruction: dedup, bounded recovery queue, and robust deshred/FEC handling#65

Draft
esemeniuc wants to merge 5 commits into master from eric/deshred-fix

@esemeniuc (Collaborator) commented Feb 8, 2026

This PR hardens the shred forward/reconstruct path to improve correctness under duplicate or conflicting traffic and to reduce backlog and memory risk. It also adds ~2k lines of tests that check behavior against the shred spec.

Changes

  • Reworked deshred into explicit ingest -> recover -> decode -> evict phases, with FEC state keyed by (fec_set_index, version, leader_signature) so conflicting variants are isolated.
  • Added ingress guards: dominant-version filtering, median-slot anchor with configurable slot window (--reconstruct-slot-lookback, --reconstruct-slot-future), and canonical payload-prefix parsing that ignores trailing packet bytes.
  • Improved conflict handling for data/coding duplicates, recovery retry gating based on FEC state changes, and unknown-start decode flow (emit best-effort once, commit later when real boundary arrives).
  • Added reconstruction backpressure/observability: larger bounded queue, batch draining, drop-on-full accounting, queue high-watermark metrics, and richer deshred/FEC metrics (including invalid Merkle root and entry sanity errors).
  • Updated forwarding dedup to hash canonical shred bytes (including ignoring retransmitter signatures for resigned Merkle shreds), skip forwarding discarded packets, and fix partial-send success/failure accounting.
  • Operational polish: heartbeat now advertises the actual bound UDP port, public IP parsing is safer, gRPC entry streaming handles lagged receivers, and examples/deshred.rs now decodes with fixint + trailing-byte-tolerant bincode options.
  • Expanded regression/spec tests across forwarder and deshred for version filtering, slot-poisoning resilience, trailing-byte behavior, duplicate replacement, unknown-start handling, and recovery correctness.
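To illustrate the FEC-state keying from the first bullet, here is a minimal sketch of how conflicting variants stay isolated when the map key includes the version and leader signature. `FecKey`, `FecState`, `FecTable`, and the field widths are hypothetical names and assumptions for illustration, not the types used in this PR.

```rust
use std::collections::HashMap;

/// Hypothetical key type. The field names follow the PR description; the
/// widths (u32, u16, 64-byte signature) are assumptions for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct FecKey {
    pub fec_set_index: u32,
    pub version: u16,
    pub leader_signature: [u8; 64],
}

/// Hypothetical per-variant state: only counts the shreds seen so far.
#[derive(Debug, Default)]
pub struct FecState {
    pub data_shreds: usize,
    pub coding_shreds: usize,
}

/// Table of FEC sets. Two shreds that share an index but differ in version
/// or leader signature land in separate entries and never mix.
#[derive(Debug, Default)]
pub struct FecTable {
    sets: HashMap<FecKey, FecState>,
}

impl FecTable {
    /// Records one shred under its variant key and returns the total number
    /// of shreds now held for that variant.
    pub fn ingest(&mut self, key: FecKey, is_coding: bool) -> usize {
        let state = self.sets.entry(key).or_default();
        if is_coding {
            state.coding_shreds += 1;
        } else {
            state.data_shreds += 1;
        }
        state.data_shreds + state.coding_shreds
    }

    /// Number of distinct variants currently tracked.
    pub fn variant_count(&self) -> usize {
        self.sets.len()
    }
}
```

With this shape, a shred carrying a different leader signature for the same `fec_set_index` starts its own entry instead of corrupting the recovery state of the variant already in progress.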
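The median-slot anchor and slot window from the ingress-guard bullet can be sketched roughly as below. This assumes the anchor is the median of recently observed slots and that `lookback` and `future` carry the values of `--reconstruct-slot-lookback` and `--reconstruct-slot-future`; the function names and the choice of the upper middle value for even counts are assumptions, not the PR's code.

```rust
/// Returns the median of the observed slots, or None when the slice is empty.
/// For an even count this picks the upper of the two middle values
/// (an assumption made for this sketch).
pub fn median_slot(observed: &[u64]) -> Option<u64> {
    if observed.is_empty() {
        return None;
    }
    let mut sorted = observed.to_vec();
    sorted.sort_unstable();
    Some(sorted[sorted.len() / 2])
}

/// Accepts a slot only if it lies in [anchor - lookback, anchor + future].
/// Saturating arithmetic keeps the bounds valid near 0 and u64::MAX.
pub fn slot_in_window(slot: u64, anchor: u64, lookback: u64, future: u64) -> bool {
    let low = anchor.saturating_sub(lookback);
    let high = anchor.saturating_add(future);
    (low..=high).contains(&slot)
}
```

Using a median rather than a maximum means a single shred that claims a far-future slot does not move the anchor, which is the slot-poisoning case the tests in the last bullet refer to.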
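The backpressure bullet (bounded queue, batch draining, drop-on-full accounting, high-watermark metric) could look something like the following single-threaded sketch. The real queue in this PR presumably sits between threads; `BoundedQueue` and its fields are hypothetical and only show the accounting.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded queue that drops new items when full and tracks
/// how many were dropped and the deepest backlog seen.
pub struct BoundedQueue<T> {
    items: VecDeque<T>,
    capacity: usize,
    pub dropped: u64,
    pub high_watermark: usize,
}

impl<T> BoundedQueue<T> {
    pub fn new(capacity: usize) -> Self {
        Self {
            items: VecDeque::with_capacity(capacity),
            capacity,
            dropped: 0,
            high_watermark: 0,
        }
    }

    /// Enqueues `item`, or drops it and counts the drop when the queue is
    /// full. Returns true if the item was accepted.
    pub fn push(&mut self, item: T) -> bool {
        if self.items.len() >= self.capacity {
            self.dropped += 1;
            return false;
        }
        self.items.push_back(item);
        self.high_watermark = self.high_watermark.max(self.items.len());
        true
    }

    /// Removes and returns up to `max` items from the front, oldest first.
    pub fn drain_batch(&mut self, max: usize) -> Vec<T> {
        let n = max.min(self.items.len());
        self.items.drain(..n).collect()
    }
}
```

Dropping on full keeps memory bounded when recovery falls behind, and the two counters make that condition visible in metrics instead of silent.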

@esemeniuc force-pushed the eric/deshred-fix branch 2 times, most recently from 62722a6 to f581126, on February 21, 2026 19:11
@esemeniuc changed the title from "Eric/deshred fix" to "Harden shred reconstruction: dedup, bounded recovery queue, and robust deshred/FEC handling" on Feb 22, 2026