@jaydeepkumar1984 jaydeepkumar1984 commented Mar 25, 2025

Bootstrap a new node with the following configuration:

auto_repair:
  repair_check_interval: "1m"
  enabled: true
  repair_type_overrides:
    bootstrap:
      enabled: true
      initial_scheduler_delay: 1m

The bootstrapping node does the following:

  • Just before streaming, it runs one round of repair on the token ranges for the node being replaced
  • It then resumes the regular bootstrap flow
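
The two steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the PR's Java code: the function name `bootstrap_with_pre_repair`, the `repair` and `stream` callbacks, and the tuple-based token ranges are all hypothetical stand-ins, and the sketch assumes the repair round completes for every range before any streaming starts.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical representation of a token range; illustrative only.
TokenRange = Tuple[int, int]


def bootstrap_with_pre_repair(
    ranges: Iterable[TokenRange],
    repair: Callable[[TokenRange], None],
    stream: Callable[[TokenRange], None],
    repair_enabled: bool = True,
) -> List[Tuple[str, TokenRange]]:
    """Run one repair round over the ranges, then stream them.

    Returns the ordered list of steps taken so the sequence can be inspected.
    """
    steps: List[Tuple[str, TokenRange]] = []
    ranges = list(ranges)

    # Step 1 (assumed to be gated by the bootstrap override's `enabled` flag):
    # one round of repair on the ranges of the node being replaced.
    if repair_enabled:
        for token_range in ranges:
            repair(token_range)
            steps.append(("repair", token_range))

    # Step 2: the regular bootstrap flow, reduced here to streaming each range.
    for token_range in ranges:
        stream(token_range)
        steps.append(("stream", token_range))

    return steps
```

With `repair_enabled=False` the sketch reduces to the streaming loop alone, which is how it models a plain bootstrap.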

This is POC code only. See https://issues.apache.org/jira/browse/CASSANDRA-20281

jaydeepkumar1984 and others added 30 commits March 3, 2025 15:43
tolbertam and others added 25 commits March 9, 2025 11:09
Co-authored-by: Francisco Guerrero <frank.guerrero@gmail.com>
…geSplitter.java

Co-authored-by: Francisco Guerrero <frank.guerrero@gmail.com>
…geSplitter.java

Co-authored-by: Francisco Guerrero <frank.guerrero@gmail.com>
…0419

Avoid micros splitting of token ranges for FixedSplitTokenRangeSplitter
Auto Repair Documentation improvements, bytes_per_assignment to 50GiB
Move auto-repair retry policy config to repair-type level
Previously, AutoRepairUtils.myTurnToRunRepair would return MY_TURN
if there was capacity to select a new node for repair and the current
node had the oldest lastRepairFinishTime.

This change adjusts the method to use a newly added method,
getMostEligibleHostToRepair, which identifies the node with the
oldest lastRepairFinishTime none of whose replicas are actively
repairing.

This should prevent replicas from actively repairing concurrently,
which would increase load on replica sets and also create anticompaction
conflicts for incremental repair.

This behavior is configurable via the allow_parallel_replica_repair and
allow_parallel_replica_repair_across_schedules options.

patch by Andy Tolbert, reviewed by Jaydeepkumar Chovatia and Francisco Guerrero for CASSANDRA-20180
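
The selection rule described in this commit message can be illustrated with a small sketch. It is written in Python for brevity and is not the actual Java implementation: `HostState`, `most_eligible_host`, `my_turn`, and the `capacity` parameter are hypothetical names, the boolean result stands in for MY_TURN, and allow_parallel_replica_repair_across_schedules is not modelled.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class HostState:
    """Hypothetical per-node view of repair history."""
    host_id: str
    last_repair_finish_time: int  # e.g. epoch millis; older means more overdue
    replicas: FrozenSet[str]      # ids of nodes sharing token ranges with this one


def most_eligible_host(
    hosts: Iterable[HostState],
    repairing: Iterable[str],
    allow_parallel_replica_repair: bool = False,
) -> Optional[str]:
    """Return the id of the node with the oldest finish time whose replicas are idle."""
    repairing = set(repairing)
    candidates = []
    for host in hosts:
        if host.host_id in repairing:
            continue  # already taking its turn
        if not allow_parallel_replica_repair and host.replicas & repairing:
            continue  # a replica is actively repairing; skip to avoid overlap
        candidates.append(host)
    if not candidates:
        return None
    # Oldest finish time wins; host_id breaks ties deterministically (an assumption).
    return min(candidates, key=lambda h: (h.last_repair_finish_time, h.host_id)).host_id


def my_turn(
    my_id: str,
    hosts: Iterable[HostState],
    repairing: Iterable[str],
    capacity: int,
    allow_parallel_replica_repair: bool = False,
) -> bool:
    """True when there is spare capacity and this node is the most eligible one."""
    repairing = set(repairing)
    if len(repairing) >= capacity:
        return False
    return most_eligible_host(hosts, repairing, allow_parallel_replica_repair) == my_id
```

In this sketch a node with the oldest lastRepairFinishTime is passed over while one of its replicas is repairing, and the next-oldest node with idle replicas is chosen instead.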
Avoid returning MY_TURN when replicas are actively taking their turn
This reverts commit 74e63d2.
@jaydeepkumar1984 jaydeepkumar1984 force-pushed the trunk_cep_37_CASSANDRA_20281_2 branch from e03c902 to fc5635b Compare March 25, 2025 19:02
@jaydeepkumar1984 jaydeepkumar1984 force-pushed the trunk_cep_37_CASSANDRA_20281_2 branch from fc5635b to 7f4b241 Compare March 25, 2025 20:36
kzalys commented Apr 17, 2025

  1. Token split
  2. Observability/progress (generally applies to other repair types as well)
