Context
From adversarial review of v0.4.0b1 (W11).
Problem
Records ingested via the backfill path (ingestion/backfill.py) do not emit change events to the ChangeStream. Any client subscribed to subscribeChanges will miss records that arrive through backfill.
Considerations
Arguments for publishing backfill events:
- Subscribers get a complete view of all data changes regardless of ingestion path
- Simplifies client logic — no need for separate backfill awareness
Arguments against:
- Backfill is historical data, not real-time changes — semantically different
- Backfill can produce thousands of events in rapid succession, overwhelming subscriber queues and triggering backpressure disconnects
- Clients that care about historical completeness should use query endpoints, not the live stream
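The queue-overflow concern above can be illustrated with a small sketch. It assumes each subscriber is served from a bounded in-memory queue and is dropped as soon as that queue is full. The issue does not state the real ChangeStream queue size or disconnect policy, so `deliver`, the limit of 100, and the 5000-event burst are hypothetical.

```python
import queue


def deliver(subscriber_queue, events):
    """Push events into a bounded queue without blocking.

    Returns (delivered_count, disconnected). A full queue is treated as a
    backpressure disconnect; this policy is an assumption for illustration,
    not the documented ChangeStream behavior.
    """
    delivered = 0
    for event in events:
        try:
            subscriber_queue.put_nowait(event)
        except queue.Full:
            return delivered, True
        delivered += 1
    return delivered, False


# Hypothetical numbers: a 100-slot subscriber queue and a 5000-event backfill
# burst that arrives faster than the client drains it.
subscriber_queue = queue.Queue(maxsize=100)
delivered, disconnected = deliver(subscriber_queue, range(5000))
```

Under these assumptions the subscriber is cut off after the first 100 events, and any live changes that arrive afterwards are lost along with the rest of the backfill.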
Decision needed
- Should backfill publish to the change stream?
- If yes, should events be tagged with a source: "backfill" field so clients can filter?
- If no, should this be documented explicitly in the subscribeChanges endpoint docs?
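If the tagging option is chosen, client-side filtering could look roughly like the sketch below. `ChangeEvent`, its fields, the "live" default, and `filter_events` are hypothetical names used for illustration only; the actual event schema of subscribeChanges is not described in this issue.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class ChangeEvent:
    # Hypothetical event shape; only the "backfill" tag comes from the proposal.
    record_id: str
    source: str = "live"


def filter_events(
    events: Iterable[ChangeEvent], include_backfill: bool = False
) -> Iterator[ChangeEvent]:
    """Yield events, skipping backfill-tagged ones unless the client opts in."""
    for event in events:
        if include_backfill or event.source != "backfill":
            yield event


events = [
    ChangeEvent("r1"),
    ChangeEvent("r2", source="backfill"),
    ChangeEvent("r3"),
]
live_only = [e.record_id for e in filter_events(events)]
everything = [e.record_id for e in filter_events(events, include_backfill=True)]
```

Defaulting to live-only keeps existing subscribers unaffected, while clients that want a complete view opt in explicitly. Filtering on the server side instead would also avoid sending the backfill burst over the wire at all.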