Description
In range partition mode, range size needs to be tracked in real time in order to trigger range splits promptly.
To achieve this, two range size variables are maintained, one in memory and one in storage:
1. Maintain two related variables:
1.1. the in-memory range size (real-time value for split decisions)
1.2. the persisted range size (durable value for rebuilding in-memory stats on restart or ccmap initialization).
2. Maintaining range sizes in memory:
Keep in-memory range sizes accurate across five key paths: PostWriteCc, ccmap initialization, range split, log replay, and Create SK.
2.1. PostWriteCc
1) After each write, compute the size delta for the key’s range and add it to that range’s in-memory size.
2) If the range's in-memory size has not been initialized yet, first perform the load range size operation to initialize it.
2.2. Ccmap Initialization (including restart)
When a ccmap executes its first write request, lazily initialize range_sizes_ from the persisted range sizes.
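The update path in 2.1 and the lazy initialization in 2.2 can be sketched together as follows. This is a minimal Python model under stated assumptions, not the actual implementation: RangeSizeTracker, post_write, and load_persisted are hypothetical names, sizes are assumed to be in bytes, and the load is shown as a synchronous call even though the real load range size operation may be asynchronous.

```python
from typing import Callable, Dict


class RangeSizeTracker:
    """Hypothetical per-ccmap tracker of in-memory range sizes (bytes)."""

    def __init__(self, load_persisted: Callable[[int], int]) -> None:
        # load_persisted(range_id) stands in for the "load range size" operation.
        self._load_persisted = load_persisted
        self._sizes: Dict[int, int] = {}  # range_id -> in-memory size

    def size_of(self, range_id: int) -> int:
        # Lazily initialize from the persisted value on first access.
        if range_id not in self._sizes:
            self._sizes[range_id] = self._load_persisted(range_id)
        return self._sizes[range_id]

    def post_write(self, range_id: int, old_len: int, new_len: int) -> int:
        """Apply the size delta of one write and return the updated range size."""
        delta = new_len - old_len  # negative for deletes or shrinking updates
        self._sizes[range_id] = max(0, self.size_of(range_id) + delta)
        return self._sizes[range_id]
```

A caller could compare the value returned by post_write against a split threshold to decide whether to schedule a split; where that threshold lives is outside the scope of this sketch.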
2.3. Range Split
After a range split, initialize each new range’s size from the summed size of its slices while preserving the old range’s history.
1) During the "double write" phase of a range split, post-write operations record changes in range size in the delta size. The delta for a key that belongs to a new range is not added to the old range's delta size.
2) During the post-commit phase of the range split, the size of every subrange is reset to its persisted range size plus its delta size.
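The split accounting above can be modeled roughly as follows. This sketch assumes that each subrange keeps its own delta size, that a key's delta is recorded under the subrange that will own it, and that a key equal to a split key belongs to the new range. SplitSizeAccounting and its methods are hypothetical names for illustration only.

```python
from bisect import bisect_right
from typing import Dict, List


class SplitSizeAccounting:
    """Hypothetical bookkeeping for one in-flight split of an old range."""

    def __init__(self, old_range_id: int, split_keys: List[str],
                 new_range_ids: List[int]) -> None:
        if len(split_keys) != len(new_range_ids):
            raise ValueError("one split key is required per new range")
        self._split_keys = split_keys  # sorted start keys of the new ranges
        self._range_ids = [old_range_id] + new_range_ids
        self.delta: Dict[int, int] = {rid: 0 for rid in self._range_ids}

    def owner_of(self, key: str) -> int:
        # Keys below the first split key stay in the old range.
        return self._range_ids[bisect_right(self._split_keys, key)]

    def on_double_write(self, key: str, size_delta: int) -> None:
        # Record the delta only under the subrange that will own the key,
        # so a new range's key never inflates the old range's delta.
        self.delta[self.owner_of(key)] += size_delta

    def post_commit(self, persisted: Dict[int, int]) -> Dict[int, int]:
        # Reset every subrange: persisted size + delta accumulated during the split.
        return {rid: max(0, persisted[rid] + self.delta[rid])
                for rid in self._range_ids}
```

Here the persisted argument of post_commit stands for the per-subrange sizes obtained by summing slice sizes.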
2.4. Log Replay
This includes two aspects:
1) Data log replay
Use persisted range sizes as the baseline, apply conservative updates for replayed keys with unknown size, and route by range id instead of key hash.
2) Range split log replay
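For data log replay, one possible reading of "conservative updates" is sketched below. It is an assumption rather than something the description confirms: when the previous size of a replayed key is unknown, an upsert is counted as a full insert (over-estimating the range size), and a delete leaves the size unchanged. Records are keyed by range id, so sizes accumulate per range and not per key hash. The record layout and the replay_sizes name are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

# (range_id, new_size, old_size)
# new_size is None for a delete; old_size is None when it is unknown at replay.
LogRecord = Tuple[int, Optional[int], Optional[int]]


def replay_sizes(persisted: Dict[int, int],
                 records: Iterable[LogRecord]) -> Dict[int, int]:
    """Rebuild in-memory range sizes from a persisted baseline plus a data log."""
    sizes = dict(persisted)  # the persisted range sizes are the baseline
    for range_id, new_size, old_size in records:
        if old_size is not None:
            delta = (new_size or 0) - old_size  # exact delta is available
        elif new_size is not None:
            delta = new_size  # unknown old size: count as insert (over-estimate)
        else:
            delta = 0  # delete of unknown size: leave the size unchanged
        sizes[range_id] = max(0, sizes.get(range_id, 0) + delta)
    return sizes
```

Over-estimating errs toward splitting slightly early, not late; the values can be corrected the next time persisted sizes are recomputed from slices.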
2.5. Create SK (UploadBatchCc)
With the target range known, send UploadBatchCc only to the target shard, and update that range’s size in the receiving ccmap as the SKs are written.
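A rough model of this routing, assuming the caller has already resolved the target shard from the target range: the Shard class, send_upload_batch, and the size formula (key length plus payload length) are all hypothetical and only illustrate that a single shard receives the batch and updates a single range's size.

```python
from typing import Dict, List, Tuple


class Shard:
    """Hypothetical shard holding in-memory range sizes for its ccmap."""

    def __init__(self) -> None:
        self.range_sizes: Dict[int, int] = {}

    def upload_batch(self, range_id: int,
                     entries: List[Tuple[bytes, bytes]]) -> int:
        # Size added by the batch: key bytes plus payload bytes of each SK entry.
        added = sum(len(key) + len(payload) for key, payload in entries)
        self.range_sizes[range_id] = self.range_sizes.get(range_id, 0) + added
        return added


def send_upload_batch(shards: List[Shard], target_shard: int, range_id: int,
                      entries: List[Tuple[bytes, bytes]]) -> int:
    # Only the target shard receives the batch; other shards are not contacted.
    return shards[target_shard].upload_batch(range_id, entries)
```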
3. Maintaining range sizes in storage:
3.1. On the write side, aggregate slice sizes per range during checkpoint or range-slice updates and persist them;
3.2. On the read side, load these persisted values when initializing ccmap or after restart to rebuild in-memory range sizes.
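The write and read sides can be illustrated with a small sketch in which a plain dict stands in for the storage layer. The key format "range_size/<id>" and the function names are assumptions made for the example; the description does not specify how sizes are laid out in storage.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_range_sizes(slices: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Sum slice sizes per range; each item is (range_id, slice_size)."""
    totals: Dict[int, int] = defaultdict(int)
    for range_id, slice_size in slices:
        totals[range_id] += slice_size
    return dict(totals)


def persist_range_sizes(store: Dict[str, str], sizes: Dict[int, int]) -> None:
    # Write side: called during checkpoint or after a range-slice update.
    for range_id, size in sizes.items():
        store[f"range_size/{range_id}"] = str(size)


def load_range_size(store: Dict[str, str], range_id: int) -> int:
    # Read side: used when a ccmap initializes or after restart.
    # A range with no persisted entry is treated as empty in this sketch.
    return int(store.get(f"range_size/{range_id}", "0"))
```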