Fix slot-range parsing during migration and add rebalance planning logic #92

Open
ysqyang wants to merge 1 commit into valkey-io:main from ysqyang:cluster-state-classification

@ysqyang ysqyang commented Feb 24, 2026

Summary

  • Fix parseSlotsRanges to skip [slot->-nodeid] / [slot-<-nodeid] entries that Valkey appends to CLUSTER NODES output during active slot migrations, which previously caused parse errors.
  • Add BuildRebalanceMove — a deterministic, incremental slot rebalance planner that computes a single slot migration from an overloaded primary to an underloaded (or empty) one. This is the planning layer for scale-out support; execution is wired in a follow-up PR.
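To illustrate the first bullet, here is a minimal sketch of the skip logic, not the code in this PR. The names `SlotRange` and `parseSlotTokens` are hypothetical, and the input is assumed to be the whitespace-separated slot fields that follow the fixed columns of a CLUSTER NODES line. Bracketed tokens such as `[5462->-abc123]` (migrating) and `[5463-<-def456]` (importing) are skipped instead of being parsed as ranges.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// SlotRange is a hypothetical inclusive range of hash slots.
type SlotRange struct {
	Start, End int
}

// parseSlotTokens is an illustrative stand-in for parseSlotsRanges; the
// real signature in the PR may differ. It accepts tokens such as "0-5460"
// or "5461" and skips bracketed migration markers.
func parseSlotTokens(tokens []string) ([]SlotRange, error) {
	ranges := make([]SlotRange, 0, len(tokens))
	for _, tok := range tokens {
		// Migrating ("[slot->-nodeid]") and importing ("[slot-<-nodeid]")
		// entries start with '[' and do not describe owned slots.
		if strings.HasPrefix(tok, "[") {
			continue
		}
		startStr, endStr, isRange := strings.Cut(tok, "-")
		start, err := strconv.Atoi(startStr)
		if err != nil {
			return nil, fmt.Errorf("invalid slot entry %q: %w", tok, err)
		}
		end := start // a bare number is a single-slot range
		if isRange {
			if end, err = strconv.Atoi(endStr); err != nil {
				return nil, fmt.Errorf("invalid slot entry %q: %w", tok, err)
			}
		}
		ranges = append(ranges, SlotRange{Start: start, End: end})
	}
	return ranges, nil
}

func main() {
	fields := strings.Fields("0-5460 5461 [5462->-abc123] [5463-<-def456]")
	ranges, err := parseSlotTokens(fields)
	fmt.Println(ranges, err) // prints: [{0 5460} {5461 5461}] <nil>
}
```

With migration-only input the sketch returns an empty slice and no error, which matches the "migration-only input" case listed in the test plan.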

Test plan

  • Unit test for parseSlotsRanges: normal ranges, migrating/importing entries, empty input, migration-only input
  • Unit tests for BuildRebalanceMove: scale-out (2→3 shards), balanced cluster (no-op), mismatched shard count, extra empty shards, zero max-slots, slot extraction from ranges

@ysqyang ysqyang force-pushed the cluster-state-classification branch from 34325d0 to 78cb392 Compare February 24, 2026 22:13
@ysqyang ysqyang changed the title Refine node classification in GetClusterState for scale-out Add slot-parsing fix and rebalance planning logic Feb 24, 2026
@ysqyang ysqyang force-pushed the cluster-state-classification branch 4 times, most recently from ede2f24 to e728115 Compare February 25, 2026 07:40
@ysqyang ysqyang changed the title Add slot-parsing fix and rebalance planning logic Fix slot-range parsing during migration and add rebalance planning logic Feb 25, 2026
@ysqyang ysqyang marked this pull request as ready for review February 25, 2026 19:06
@ysqyang ysqyang force-pushed the cluster-state-classification branch from e728115 to 7c7182f Compare February 25, 2026 22:02
Two independent but related changes for scale-out support:

1. Skip migrating/importing entries (e.g. "[5461->-abc123]") in
   parseSlotsRanges to prevent parse errors when CLUSTER NODES output
   contains in-progress slot migrations. Adds unit tests.

2. Introduce BuildRebalanceMove which computes a single, deterministic
   slot move to incrementally rebalance a cluster after scale-out.
   Calculates per-primary slot targets (16384 / shards), identifies the
   most-loaded source and least-loaded destination, and returns a
   bounded SlotMove. Returns one move per call so each reconcile loop
   stays fast and restartable. Includes unit tests.

Signed-off-by: yang.qiu <yang.qiu@reddit.com>
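The planning step in item 2 of the commit message can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's BuildRebalanceMove: the function `planMove`, the `SlotMove` fields, and the use of per-node slot counts as input are all hypothetical, and the real planner presumably returns concrete slot ranges rather than a count. The sketch computes the per-primary target as 16384 / shards, picks the most-loaded source and least-loaded destination with ties broken by sorted node ID for determinism, and caps the move at a caller-supplied maximum.

```go
package main

import (
	"fmt"
	"sort"
)

const totalSlots = 16384

// SlotMove is a hypothetical description of one bounded migration step.
type SlotMove struct {
	Source      string
	Destination string
	Count       int
}

// planMove returns at most one move per call. The boolean is false when
// the cluster is already balanced or the inputs allow no move.
func planMove(slotCounts map[string]int, maxSlots int) (SlotMove, bool) {
	if len(slotCounts) == 0 || maxSlots <= 0 {
		return SlotMove{}, false
	}
	// Sort node IDs so map iteration order cannot change the result.
	ids := make([]string, 0, len(slotCounts))
	for id := range slotCounts {
		ids = append(ids, id)
	}
	sort.Strings(ids)

	target := totalSlots / len(ids) // integer division; remainder is tolerated
	src, dst := ids[0], ids[0]
	for _, id := range ids[1:] {
		if slotCounts[id] > slotCounts[src] {
			src = id
		}
		if slotCounts[id] < slotCounts[dst] {
			dst = id
		}
	}

	// Move no more than the source's surplus, the destination's deficit,
	// or the per-call bound, whichever is smallest.
	n := slotCounts[src] - target
	if deficit := target - slotCounts[dst]; deficit < n {
		n = deficit
	}
	if maxSlots < n {
		n = maxSlots
	}
	if n <= 0 {
		return SlotMove{}, false
	}
	return SlotMove{Source: src, Destination: dst, Count: n}, true
}

func main() {
	// Scale-out from 2 to 3 shards: node-c has just joined and owns no slots.
	counts := map[string]int{"node-a": 8192, "node-b": 8192, "node-c": 0}
	move, ok := planMove(counts, 100)
	fmt.Printf("%+v %v\n", move, ok) // prints: {Source:node-a Destination:node-c Count:100} true
}
```

Because each call yields a single bounded move, a reconcile loop can apply it, re-read cluster state, and call the planner again, so an interrupted rebalance resumes from whatever state the cluster is in.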
@ysqyang ysqyang force-pushed the cluster-state-classification branch from 7c7182f to 82a27d2 Compare February 25, 2026 23:13