Scalable fanout pubsub over @peerbit/stream (tree + pull-repair + incentives) #577

@marcus-pousette

Goal

Ship a pubsub fanout solution that can support very large audiences (target: 1 publisher → 1,000,000 subscribers) with bounded per-node upload, low latency, and measurable reliability — without requiring global membership knowledge.

Tracking PR: #582

Motivation

Subscriber discovery in the current pubsub can “explode” because its control plane scales poorly: subscription gossip is amplified as the number of subscribers grows. At large scale, the system must:

  • avoid addressing messages with to=[all subscribers]
  • avoid global ACKs
  • avoid any per-message work that grows with total subscribers
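To make the scaling argument concrete, here is a rough back-of-the-envelope sketch. It assumes an idealized full k-ary tree with the publisher at the root; `treeDepth` and `relayUploadBytesPerSec` are hypothetical helper names, and the fanout of 8 and 1 KB message size are illustrative values, not project decisions.

```typescript
// Smallest depth at which a full k-ary tree (publisher at depth 0)
// has room for `subscribers` non-root nodes.
function treeDepth(subscribers: number, fanout: number): number {
  if (fanout < 2) throw new Error("fanout must be >= 2");
  let depth = 0;
  let capacity = 0; // non-root nodes reachable so far
  let levelWidth = 1; // number of nodes at the current depth
  while (capacity < subscribers) {
    levelWidth *= fanout;
    capacity += levelWidth;
    depth += 1;
  }
  return depth;
}

// Steady-state upload of one relay: every message is forwarded once per child,
// independent of the total number of subscribers.
function relayUploadBytesPerSec(
  fanout: number,
  msgPerSec: number,
  msgBytes: number,
): number {
  return fanout * msgPerSec * msgBytes;
}

const depth = treeDepth(1000000, 8); // 7 hops from publisher to the deepest leaf
const upload = relayUploadBytesPerSec(8, 30, 1024); // 245760 bytes/s per relay
```

Under these assumptions, per-node work depends only on the fanout and message rate, while the audience size affects only the tree depth (and therefore latency).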

Status (living tracker)

Implemented (WIP, not merged)

  • Local sims to stress real @peerbit/stream data plane:
    • packages/transport/pubsub/benchmark/pubsub-topic-sim.ts
    • packages/transport/pubsub/benchmark/pubsub-tree-sim.ts
  • Experimental production building block: FanoutTree + FanoutChannel
    • Tree push + pull repair window
    • Stable message ids for stream-level dedup (FOUT + seq + channelKeyPrefix)
  • Join/bootstrapping via bootstrap servers (tracker)
    • relays announce capacity to bootstrap nodes (TRACKER_ANNOUNCE)
    • joiners query bootstrap nodes for candidate parents (TRACKER_QUERY/TRACKER_REPLY), then dial + normal JOIN_REQ
    • Peerbit.bootstrap() also configures peer.services.fanout.setBootstraps(...)
    • Fix: avoid advertising invalid /p2p/<peerId> multiaddrs for simulated peers
  • End-to-end FanoutTree sim/bench (fanout-tree-sim) with --timeoutMs + --assert* gating (see #578, “FanoutTree sim/bench (end-to-end) with timeouts + CI thresholds”)
  • CI-friendly sims + nightly scale runs
    • packages/transport/pubsub/test/fanout-tree-sim.spec.ts
    • .github/workflows/nightly-sims.yml
  • Real-protocol interactive sandbox + article (browser)
    • apps/peerbit-org/src/ui/FanoutProtocolSandbox.tsx
    • docs/blog/2026-02-02-interactive-fanout-visualizer.md
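As an illustration of the stable message ids listed above (FOUT + seq + channelKeyPrefix), the following sketch shows how such an id could be built and used for dedup. The byte layout is an assumption made for this example (4-byte ASCII tag, 4-byte big-endian sequence number, 8-byte key prefix); `fanoutMessageId` and `SeenIds` are hypothetical names and not the FanoutTree API.

```typescript
const TAG = [0x46, 0x4f, 0x55, 0x54]; // ASCII "FOUT"
const PREFIX_BYTES = 8; // assumed prefix width, not taken from the protocol spec

// Deterministic id: the same (seq, channelKey) always yields the same bytes,
// so a message re-sent during repair is recognized as a duplicate.
function fanoutMessageId(seq: number, channelKey: Uint8Array): Uint8Array {
  if (channelKey.length < PREFIX_BYTES) throw new Error("channel key too short");
  const id = new Uint8Array(TAG.length + 4 + PREFIX_BYTES);
  id.set(TAG, 0);
  new DataView(id.buffer).setUint32(TAG.length, seq, false); // big-endian seq
  id.set(channelKey.subarray(0, PREFIX_BYTES), TAG.length + 4);
  return id;
}

const toHex = (bytes: Uint8Array): string =>
  Array.from(bytes, (b) => (b < 16 ? "0" : "") + b.toString(16)).join("");

class SeenIds {
  private seen = new Set<string>();

  /** Returns true the first time an id is observed, false for duplicates. */
  firstSeen(id: Uint8Array): boolean {
    const key = toHex(id);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}
```

A real implementation would also need to bound the size of the seen-set, for example by dropping ids that fall outside the repair window.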

Next (proposed order)

Requirements

Functional

  • 1 publisher can publish at ~30 msg/s (configurable), and subscribers receive messages with bounded latency.
  • Delivery works without global knowledge of subscribers.
  • Nodes can configure upload limits and will not exceed them (best-effort, first in simulation and later in production).
  • Nodes can express relay preferences (e.g. “only relay if compensated / bid-based selection”).
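One common way to enforce a configurable upload limit is a token bucket. The sketch below is a generic illustration of that technique, not the mechanism used in @peerbit/stream; `UploadBudget` and its parameters are hypothetical. Time is passed in explicitly so the limiter stays deterministic inside a simulation.

```typescript
class UploadBudget {
  private readonly bytesPerSec: number;
  private readonly burstBytes: number;
  private tokens: number; // bytes that may be sent right now
  private lastMs: number;

  constructor(bytesPerSec: number, burstBytes: number, nowMs: number) {
    this.bytesPerSec = bytesPerSec;
    this.burstBytes = burstBytes;
    this.tokens = burstBytes;
    this.lastMs = nowMs;
  }

  /** Spends budget and returns true if `bytes` may be sent at `nowMs`. */
  trySend(bytes: number, nowMs: number): boolean {
    // Refill proportionally to elapsed time, capped at the burst size.
    const elapsedSec = Math.max(0, nowMs - this.lastMs) / 1000;
    this.tokens = Math.min(
      this.burstBytes,
      this.tokens + elapsedSec * this.bytesPerSec,
    );
    this.lastMs = Math.max(this.lastMs, nowMs);
    if (bytes > this.tokens) return false; // caller must queue or drop
    this.tokens -= bytes;
    return true;
  }
}
```

When `trySend` returns false, the caller decides between queueing (backpressure) and dropping; which policy applies per workload is left open here.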

Reliability

  • Define explicit delivery goals per workload, e.g.:
    • “live”: > 99% delivered within deadline under mild churn/loss
    • “reliable”: > 99.9% delivered eventually with bounded overhead
  • Repair must be bounded and local (neighbors/parent only).
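A minimal sketch of what "bounded and local" repair could look like on the receiver side: only sequence numbers inside a fixed window behind the newest message are eligible, and the number of requests per round is capped. `missingInWindow`, the window size, and the request cap are assumptions for illustration; the resulting list would be sent to the parent or a neighbor only.

```typescript
// Returns the sequence numbers to pull, newest first, limited to the
// repair window and to at most `maxRequests` entries.
function missingInWindow(
  received: Set<number>,
  highestSeen: number,
  windowSize: number,
  maxRequests: number,
): number[] {
  const missing: number[] = [];
  const oldest = Math.max(0, highestSeen - windowSize + 1);
  for (
    let seq = highestSeen;
    seq >= oldest && missing.length < maxRequests;
    seq--
  ) {
    if (!received.has(seq)) missing.push(seq);
  }
  return missing;
}

const received = new Set([0, 1, 2, 5, 6, 9]);
const toPull = missingInWindow(received, 9, 6, 10); // [8, 7, 4]; seq 3 is outside the window
```

Because both the window and the request cap are constants, the repair cost per node does not grow with the number of subscribers.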

Economics / incentives

  • Simulate “relay earnings” based on forwarded bytes.
  • Define a future-proof interface for bids/quotes (even if settlement is out-of-scope initially).
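As a rough sketch of how simulated earnings could be derived from forwarded bytes, the example below prices traffic per GiB using a quote announced by each relay. `RelayQuote`, `ForwardRecord`, and `relayEarnings` are hypothetical shapes invented for this example; the actual bid/quote interface is still to be defined, and settlement is not modeled.

```typescript
const GIB = 1024 * 1024 * 1024;

// Hypothetical quote a relay might announce (unit of price is arbitrary).
interface RelayQuote {
  relayId: string;
  pricePerGiB: number;
}

// One accounting entry produced by the simulator per forwarded payload.
interface ForwardRecord {
  relayId: string;
  bytes: number;
}

function relayEarnings(
  records: ForwardRecord[],
  quotes: RelayQuote[],
): Map<string, number> {
  const price = new Map<string, number>();
  for (const q of quotes) price.set(q.relayId, q.pricePerGiB);

  const earnings = new Map<string, number>();
  for (const r of records) {
    const p = price.get(r.relayId) ?? 0; // relays without a quote earn nothing
    const soFar = earnings.get(r.relayId) ?? 0;
    earnings.set(r.relayId, soFar + (r.bytes / GIB) * p);
  }
  return earnings;
}
```

The resulting map can feed the "earnings" distribution metric of the simulation harness.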

Engineering

  • Provide a deterministic, local simulation harness to test 1k–10k nodes on one machine:
    • measure delivery ratio, p50/p95/p99 latency, bandwidth overhead, queue/backpressure, and “earnings” distribution.
  • Add CI-friendly “small sim” tests that assert invariant thresholds.
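Two building blocks such a harness would likely need are a seeded random source (for determinism) and a latency summary. The sketch below uses the well-known mulberry32 generator and nearest-rank percentiles; both choices, along with the names `summarize` and `percentile`, are assumptions for illustration rather than what the existing sims do.

```typescript
// Seeded PRNG (mulberry32): the same seed always yields the same sequence in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Nearest-rank percentile; returns NaN for an empty sample.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((x, y) => x - y);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// `latenciesMs` holds one entry per delivered message; `expected` counts all
// (message, subscriber) pairs that should have been delivered.
function summarize(latenciesMs: number[], expected: number, deadlineMs: number) {
  const onTime = latenciesMs.filter((l) => l <= deadlineMs).length;
  return {
    deliveredRatio: expected === 0 ? 1 : latenciesMs.length / expected,
    onTimeRatio: expected === 0 ? 1 : onTime / expected,
    p50: percentile(latenciesMs, 50),
    p95: percentile(latenciesMs, 95),
    p99: percentile(latenciesMs, 99),
  };
}
```

A CI "small sim" test can then assert thresholds on the object returned by `summarize`, with the seed fixed so that failures are reproducible.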

Proposed approach (high-level)

Adopt a Plumtree-inspired architecture:

  • Tree push as the steady-state data plane (economical bandwidth).
  • Local pull repair (and/or gossip summaries) as the reliability layer.
  • Capacity-aware admission control: each relay accepts children within an upload budget and may prefer higher bids.
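The capacity-aware admission step could be sketched as follows: a relay derives how many children its upload budget supports and, when more joiners are pending than slots are free, prefers the highest bids. `JoinRequest`, `maxChildrenFor`, and `admitChildren` are hypothetical names; the real JOIN_REQ handling may differ, and eviction of existing children is deliberately not modeled.

```typescript
interface JoinRequest {
  peerId: string;
  bid: number; // higher is preferred; unit left unspecified
}

// Children a relay can serve if each child receives every message once.
function maxChildrenFor(
  uploadBytesPerSec: number,
  msgPerSec: number,
  msgBytes: number,
): number {
  return Math.floor(uploadBytesPerSec / (msgPerSec * msgBytes));
}

function admitChildren(
  pending: JoinRequest[],
  currentChildren: number,
  maxChildren: number,
): { accepted: string[]; rejected: string[] } {
  const freeSlots = Math.max(0, maxChildren - currentChildren);
  // Highest bid first; ties broken by peerId so the outcome is deterministic.
  const ranked = pending.slice().sort((a, b) => {
    if (b.bid !== a.bid) return b.bid - a.bid;
    return a.peerId < b.peerId ? -1 : a.peerId > b.peerId ? 1 : 0;
  });
  return {
    accepted: ranked.slice(0, freeSlots).map((r) => r.peerId),
    rejected: ranked.slice(freeSlots).map((r) => r.peerId),
  };
}
```

Rejected joiners would then try another candidate parent, for example the next one returned by the tracker query.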

Acceptance criteria (simulation)

For a configurable workload (e.g. 2k nodes, 30 msg/s, 10 s duration, 1 KB messages):

  • Connected subscribers ≥ 99% (or defined threshold).
  • Delivered ≥ 99% (or defined threshold).
  • Overhead factor ≤ X (defined per reliability mode).
  • No node exceeds upload cap by more than Y% (or best-effort with explicit backpressure/drop policy).
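These criteria can be expressed as a single gate function that a sim run calls at the end. The sketch below is one possible shape; `SimResult`, `Thresholds`, and `checkAcceptance` are hypothetical, the overhead factor is assumed to mean total bytes sent divided by the ideal payload bytes, and the values for X and Y are left to the caller because they are not yet defined.

```typescript
interface SimResult {
  subscribers: number;
  connected: number;
  expectedDeliveries: number;
  delivered: number;
  totalBytesSent: number;
  idealBytes: number; // payload size x expected deliveries
  peakUploadBytesPerSec: number[]; // one entry per node
  uploadCapBytesPerSec: number;
}

interface Thresholds {
  minConnectedRatio: number;
  minDeliveredRatio: number;
  maxOverheadFactor: number; // "X"
  maxUploadOverage: number; // "Y" as a fraction, e.g. 0.05 for 5%
}

// Returns a list of human-readable failures; an empty list means the run passes.
function checkAcceptance(r: SimResult, t: Thresholds): string[] {
  const failures: string[] = [];
  const connectedRatio = r.connected / r.subscribers;
  const deliveredRatio = r.delivered / r.expectedDeliveries;
  const overhead = r.totalBytesSent / r.idealBytes;
  if (connectedRatio < t.minConnectedRatio) {
    failures.push(`connected ${connectedRatio} < ${t.minConnectedRatio}`);
  }
  if (deliveredRatio < t.minDeliveredRatio) {
    failures.push(`delivered ${deliveredRatio} < ${t.minDeliveredRatio}`);
  }
  if (overhead > t.maxOverheadFactor) {
    failures.push(`overhead ${overhead} > ${t.maxOverheadFactor}`);
  }
  const limit = r.uploadCapBytesPerSec * (1 + t.maxUploadOverage);
  const offenders = r.peakUploadBytesPerSec.filter((u) => u > limit).length;
  if (offenders > 0) failures.push(`${offenders} node(s) exceed the upload cap`);
  return failures;
}
```

Returning all failures rather than stopping at the first one makes CI output more useful when several thresholds regress at once.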

Repo notes

A living spec is tracked in:

  • docs/scalable-fanout.md
  • docs/fanout-tree-protocol.md
