Description
Problem
At large scale (100k–1M peers), targeted pulls (blocks, RPC-like requests) must not fall back to flooding the overlay.
Today DirectBlock/RemoteBlocks can end up using AnyWhere() when blocks.get(cid, { remote: true }) has no bounded remote.from candidate list. That is fine for small tests but does not scale: each such fetch can cost O(N) requests in the number of peers.
At the same time, we already have a scalable control-plane pattern in FanoutTree (bootstrap trackers + ANNOUNCE/QUERY/REPLY + candidate scoring/TTL).
Goal
Reuse the same tracker/control-plane substrate (bootstraps, trackers, scoring, TTL, capacity hints) to provide provider discovery for targeted fetches:
- blocks (content-addressed fetch)
- RPC-like “ask a small set of peers for X”
…without using the FanoutTree data-plane (tree broadcast) for lookups.
Proposal (high-level)
Implement a small ProviderDirectory (or similar) that:
- lets providers/replicators announce availability for a namespace (e.g. program/topic + shard/range)
- lets clients query for K candidates and returns a bounded list suitable for remote.from
- prefers already-connected peers first (parent/children/mesh) and only then consults trackers
This should integrate with:
- DirectBlock/RemoteBlocks (populate remote.from before requesting)
- higher-level components that already compute candidate sets (e.g. SharedLog getCover, document store query planning)
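To make the proposal concrete, here is a minimal in-memory sketch of what such a directory could look like. Everything in it is an assumption for discussion: the `ProviderRecord` shape, the `announce` signature, the `connected` parameter, and the tie-breaking rules are hypothetical and are not existing Peerbit APIs. The tracker exchange (ANNOUNCE/QUERY/REPLY over the network) is not modelled; the sketch only shows namespace-keyed announcements with a TTL and a bounded, connected-first query.

```typescript
// Hypothetical sketch: none of these names are existing Peerbit APIs.
type PeerId = string;

interface ProviderRecord {
  peer: PeerId;
  expiresAt: number; // absolute time (ms) after which the announcement is stale
  score: number; // higher is better, e.g. derived from capacity hints
}

class ProviderDirectory {
  // namespace -> (peer -> latest announcement)
  private records = new Map<string, Map<PeerId, ProviderRecord>>();

  // The clock is injectable so that TTL expiry can be tested deterministically.
  constructor(private now: () => number = () => Date.now()) {}

  // Providers announce per namespace (not per CID), with a TTL.
  announce(namespace: string, peer: PeerId, ttlMs: number, score = 0): void {
    let byPeer = this.records.get(namespace);
    if (!byPeer) {
      byPeer = new Map<PeerId, ProviderRecord>();
      this.records.set(namespace, byPeer);
    }
    byPeer.set(peer, { peer, expiresAt: this.now() + ttlMs, score });
  }

  // Returns at most k candidates: already-connected peers first, then by score.
  queryProviders(
    namespace: string,
    k: number,
    connected: ReadonlySet<PeerId> = new Set<PeerId>(),
  ): PeerId[] {
    const byPeer = this.records.get(namespace);
    if (!byPeer || k <= 0) return [];
    const t = this.now();
    const live: ProviderRecord[] = [];
    byPeer.forEach((rec, peer) => {
      if (rec.expiresAt <= t) byPeer.delete(peer); // drop expired announcements
      else live.push(rec);
    });
    live.sort((a, b) => {
      const ca = connected.has(a.peer) ? 1 : 0;
      const cb = connected.has(b.peer) ? 1 : 0;
      // Connected first, then higher score, then peer id for a stable order.
      return cb - ca || b.score - a.score || a.peer.localeCompare(b.peer);
    });
    return live.slice(0, k).map((r) => r.peer);
  }
}
```

The returned array is what a caller would pass as remote.from. Keying on a namespace rather than a CID keeps the number of announcements proportional to programs/shards instead of blocks.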
Work
- Define a minimal key space for discovery (namespace + optional shard/range; avoid per-CID announcements)
- Implement tracker messages/handlers (can live next to FanoutTree tracker protocol or as a small sibling service)
- Add a queryProviders(...) -> string[] API that returns bounded candidates (K)
- Wire RemoteBlocks so remote: true does not flood by default:
  - if remote.from provided → use it
  - else try ProviderDirectory → use returned candidates
  - only flood when explicitly enabled (debug/sim)
- Add sim/bench coverage: assert that a fetch at N=5k uses O(K) requests, not O(N)
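The fallback order for remote: true described above could be sketched roughly as follows. This is an illustration under assumptions, not the RemoteBlocks implementation: `resolveFetchTargets`, `ResolveOptions`, `FetchTargets`, and the `allowFlood` flag are hypothetical names, and `queryProviders` is modelled as synchronous to match the `-> string[]` shape in the list, although a tracker-backed version would likely be asynchronous.

```typescript
// Hypothetical sketch of the target-selection order; not existing Peerbit code.
type PeerId = string;

type FetchTargets =
  | { kind: "direct"; peers: PeerId[] } // send only to these peers
  | { kind: "flood" } // AnyWhere()-style delivery, opt-in only
  | { kind: "none" }; // nothing to ask; caller should fail fast

interface ResolveOptions {
  from?: PeerId[]; // caller-supplied remote.from, if any
  namespace: string; // discovery key (program/topic + optional shard)
  k: number; // upper bound on candidates
  allowFlood?: boolean; // explicit opt-in for debug/sim; defaults to off
  queryProviders: (namespace: string, k: number) => PeerId[];
}

function resolveFetchTargets(opts: ResolveOptions): FetchTargets {
  // 1. An explicit remote.from always wins.
  if (opts.from && opts.from.length > 0) {
    return { kind: "direct", peers: opts.from };
  }
  // 2. Otherwise ask the provider directory for a bounded candidate list.
  const candidates = opts.queryProviders(opts.namespace, opts.k);
  if (candidates.length > 0) {
    return { kind: "direct", peers: candidates.slice(0, opts.k) };
  }
  // 3. Flood only when explicitly enabled; never by default.
  return opts.allowFlood ? { kind: "flood" } : { kind: "none" };
}
```

Returning an explicit "none" result instead of silently flooding makes the no-provider case visible to the caller, which can then retry discovery or surface an error.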
Acceptance
- A new sim/benchmark demonstrates bounded discovery traffic at scale.
- blocks.get(..., { remote: true }) does not send AnyWhere() floods in production/default configuration.
- Clear docs: when to use FanoutTree broadcast vs provider-discovered targeted pull.
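As a rough idea of the shape the bounded-traffic assertion could take, here is a toy in-memory model. It is not Peerbit's simulator: `simulateFetch`, `FetchStats`, and the values K = 8 and 32 holders are made up for illustration, and the candidate list simply stands in for whatever queryProviders(...) would return.

```typescript
// Toy model for illustration only; not Peerbit's sim/bench harness.
interface FetchStats {
  requests: number; // number of peers that received the request
  found: boolean; // whether at least one recipient holds the block
}

function simulateFetch(
  allPeers: string[],
  holders: ReadonlySet<string>,
  targets?: string[],
): FetchStats {
  // No targets models an AnyWhere()-style flood: every peer gets the request.
  const recipients = targets ?? allPeers;
  return {
    requests: recipients.length,
    found: recipients.some((p) => holders.has(p)),
  };
}

const N = 5000; // network size from the sim/bench work item
const K = 8; // assumed candidate bound
const peers = Array.from({ length: N }, (_, i) => `peer-${i}`);
const holders = new Set<string>(peers.slice(0, 32)); // peers that hold the block
const candidates = peers.slice(0, K); // stands in for queryProviders(...)

const targeted = simulateFetch(peers, holders, candidates);
const flooded = simulateFetch(peers, holders);

// The acceptance check: targeted traffic is bounded by K, independent of N.
if (targeted.requests > K) throw new Error("targeted fetch exceeded K requests");
if (!targeted.found) throw new Error("targeted fetch found no provider");
```

A real benchmark would count messages observed on the wire rather than recipients computed up front, but the assertion would have the same form: request count bounded by K regardless of N.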
Related
- Scalable fanout pubsub over @peerbit/stream (tree + pull-repair + incentives) #577 scalable fanout pubsub
- FanoutTree bootstrap tracker hardening (limits, scoring, rate limiting) #581 tracker hardening