From 5461d4635feb17dc6496e80f7246dddf35975bcc Mon Sep 17 00:00:00 2001 From: Alexander Laye Date: Mon, 8 Dec 2025 16:43:15 -0500 Subject: [PATCH 1/3] add baseline doc --- docs/designs/hub-controller-design.md | 60 +++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) create mode 100644 docs/designs/hub-controller-design.md diff --git a/docs/designs/hub-controller-design.md b/docs/designs/hub-controller-design.md new file mode 100644 index 00000000..6ac57bc7 --- /dev/null +++ b/docs/designs/hub-controller-design.md @@ -0,0 +1,60 @@ +# Multi-Cluster Hub Controller Design + +## Problem + +* There is no simple way for the promotion token from a demoted cluster to +transfer to the newly promoted cluster +* There needs to be a central location where Azure DNS can be managed + +## Implementation + +This will be a separate k8s operator running in the KubeFleet hub, +It will try to remain as minimal as possible. + +### Promotion token management + +The Controller will be able to query endpoints on the member clusters +with the promotion token, and then create a configMap and CRP to +send that token to the new primary cluster. It will have access to the +documentdb crp so it will be able to see which member is primary. + +It will clean up the token and crp when the promotion is complete. +It can determine this through another documentdb operator endpoint. + +### DNS Management + +If requested in the documentdb object, the controller should also +provision and manage an Azure DNS zone for the documentdb cluster. +This will create an SRV that points to the primary for seamless +client-side failover, as well as individual DNS entries for each +cluster individually. 
+ +This will need the following information +* Azure Resource group +* Azure Subscription +* DNS Zone name (optional, could be generated on the fly) +* Parent DNS Zone (optional) + * Parent DNS Zone RG and Subscription + +## Other possible additions + +### Streamlined Operator and Cluster deployment + +This new conrtoller could theoretically handle the installation and +distribution of the cert manager and the operator to save the user from +having to deploy a large and cumbersome CRP. It could also monitor +the DocumentDB CRD and automatically create a CRP for that matching +the provided clusterReplication field. + +## Security considerations + +This operator will have no more access than the fleet manager already +does, and the member cluster operator endpoints will be limited to the +least amount of information provided possible and only grant access +to the fleet controller. + +## Alternatives + +Currently, we perform this promotion token transfer using a nginx pod +and a multi-cluster service when using KubeFleet. The DNS zone creation +and management is handled by the creation and failover scripts. 
From 061c4eea09eab6380161f46ac0d84dcb466fc4ee Mon Sep 17 00:00:00 2001 From: Alexander Laye Date: Tue, 9 Dec 2025 11:31:55 -0500 Subject: [PATCH 2/3] Add unmanaged failover section Signed-off-by: Alexander Laye --- docs/designs/hub-controller-design.md | 60 +++++++++++++++++++++------ 1 file changed, 48 insertions(+), 12 deletions(-) diff --git a/docs/designs/hub-controller-design.md b/docs/designs/hub-controller-design.md index 6ac57bc7..b74ff4c3 100644 --- a/docs/designs/hub-controller-design.md +++ b/docs/designs/hub-controller-design.md @@ -2,9 +2,10 @@ ## Problem -* There is no simple way for the promotion token from a demoted cluster to +* There is no simple way for the promotion token from a demoted cluster to transfer to the newly promoted cluster * There needs to be a central location where Azure DNS can be managed +* We need some way to initiate failover without manual intervention ## Implementation @@ -15,42 +16,73 @@ It will try to remain as minimal as possible. The Controller will be able to query endpoints on the member clusters with the promotion token, and then create a configMap and CRP to -send that token to the new primary cluster. It will have access to the -documentdb crp so it will be able to see which member is primary. +send that token to the new primary cluster. It will have access to the +documentdb crp so it will be able to see which member is primary. -It will clean up the token and crp when the promotion is complete. +It will clean up the token and crp when the promotion is complete. It can determine this through another documentdb operator endpoint. ### DNS Management If requested in the documentdb object, the controller should also provision and manage an Azure DNS zone for the documentdb cluster. -This will create an SRV that points to the primary for seamless -client-side failover, as well as individual DNS entries for each -cluster individually. 
+This will create an SRV DNS entry that points to the primary for +seamless client-side failover, as well as individual DNS entries +for each cluster. This will need the following information -* Azure Resource group + +* Azure Resource group * Azure Subscription * DNS Zone name (optional, could be generated on the fly) +* Azure credentials * Parent DNS Zone (optional) - * Parent DNS Zone RG and Subscription + * Parent DNS Zone RG and Subscription + +### Automatic failover + +The operator will have a health check endpoint that the controller can +periodically query to determine liveness for failover. There will be a +setting for how long a primary cluster will be marked down before failover is +initiated. + +The health check endpoint should provide the controller with a LSN for the +database so that it can have an up to date list + +When that time limit is hit, the operator should use the LSNs that it knows +to pick a promotion candidate and alter the DocumentDB object so the operators +know to run the promotion process. ## Other possible additions ### Streamlined Operator and Cluster deployment -This new conrtoller could theoretically handle the installation and +This new controller could theoretically handle the installation and distribution of the cert manager and the operator to save the user from -having to deploy a large and cumbersome CRP. It could also monitor +having to deploy a large and cumbersome CRP. It could also monitor the DocumentDB CRD and automatically create a CRP for that matching the provided clusterReplication field. +### Pluggable DNS management + +The DNS management could be abstracted to allow for other clouds' +DNS management systems. The current implementation will create an +API that will be extensible. + +## Updates + +Updates of the operators will be coordinated through KubeFleet's +ClusterStagedUpdateStrategy. This will allow the operators to safely +update with optional rollbacks. 
The controller itself should be able to +be updated independently of the operators. Steps will be taken to ensure +backwards compatibility through the use of things like feature flags and +deprecating but maintaining old APIs. + ## Security considerations This operator will have no more access than the fleet manager already does, and the member cluster operator endpoints will be limited to the -least amount of information provided possible and only grant access +least amount of information provided possible and only grant access to the fleet controller. ## Alternatives @@ -58,3 +90,7 @@ to the fleet controller. Currently, we perform this promotion token transfer using a nginx pod and a multi-cluster service when using KubeFleet. The DNS zone creation and management is handled by the creation and failover scripts. + +## References + +* [KubeFleet Staged Update](https://kubefleet.dev/docs/how-tos/staged-update/) From e9e439fca126f55db92596f6b13852f538bba109 Mon Sep 17 00:00:00 2001 From: Alexander Laye Date: Mon, 15 Dec 2025 11:00:02 -0500 Subject: [PATCH 3/3] remove auto-failover language --- docs/designs/hub-controller-design.md | 33 ++++++++++++--------------- 1 file changed, 15 insertions(+), 18 deletions(-) diff --git a/docs/designs/hub-controller-design.md b/docs/designs/hub-controller-design.md index b74ff4c3..02fdc252 100644 --- a/docs/designs/hub-controller-design.md +++ b/docs/designs/hub-controller-design.md @@ -5,7 +5,7 @@ * There is no simple way for the promotion token from a demoted cluster to transfer to the newly promoted cluster * There needs to be a central location where Azure DNS can be managed -* We need some way to initiate failover without manual intervention +* We need some way to manage the failover of many DocumentDB instances at once ## Implementation @@ -14,13 +14,13 @@ It will try to remain as minimal as possible. 
### Promotion token management -The Controller will be able to query endpoints on the member clusters -with the promotion token, and then create a configMap and CRP to -send that token to the new primary cluster. It will have access to the -documentdb crp so it will be able to see which member is primary. +The Controller will be able to query the Kube API on the member clusters to +get the promotion token from the Cluster CRD. Then it will create a configMap +and CRP to send that token to the new primary cluster. It will use the +documentdb crd to determine which member is primary. It will clean up the token and crp when the promotion is complete. -It can determine this through another documentdb operator endpoint. +It can determine this through the Cluster CRD status. ### DNS Management @@ -39,19 +39,16 @@ This will need the following information * Parent DNS Zone (optional) * Parent DNS Zone RG and Subscription -### Automatic failover +### Regional Failover -The operator will have a health check endpoint that the controller can -periodically query to determine liveness for failover. There will be a -setting for how long a primary cluster will be marked down before failover is -initiated. - -The health check endpoint should provide the controller with a LSN for the -database so that it can have an up to date list - -When that time limit is hit, the operator should use the LSNs that it knows -to pick a promotion candidate and alter the DocumentDB object so the operators -know to run the promotion process. +The user should be able to initiate a regional failover, wherein all clusters in +a region change their primary. The controller should know the LSNs on each +instance, and pick the highest for each cluster to become the new primary. To +initiate this failover, the user should create a CRD that marks a particular +member cluster as not primary-ready. The controller will watch this resource, +and use that information to update each DocumentDB instance. 
The CRP will +automatically push those changes, and the Operators will perform the actual +promotions and demotions. ## Other possible additions
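
Note on the Regional Failover hunk in PATCH 3/3: the selection rule it describes (for each DocumentDB instance, skip member clusters marked not primary-ready and promote the replica with the highest LSN) could be sketched roughly as below. This is only an illustration under stated assumptions, not the controller's actual code: `ReplicaState`, `pickPrimary`, the member names, and the representation of an LSN as a plain `uint64` are all hypothetical and do not come from the design doc.

```go
package main

import "fmt"

// ReplicaState is a hypothetical hub-side view of one replica of a
// DocumentDB instance. The design doc does not define this type.
type ReplicaState struct {
	Member string // member cluster name (assumed identifier)
	LSN    uint64 // last known log sequence number (assumed to be comparable as an integer)
}

// pickPrimary returns the member with the highest LSN among replicas whose
// member cluster is not marked as not primary-ready. On a tie, the replica
// that appears first in the slice wins. The boolean is false when no
// replica is eligible.
func pickPrimary(replicas []ReplicaState, notPrimaryReady map[string]bool) (string, bool) {
	var best ReplicaState
	found := false
	for _, r := range replicas {
		if notPrimaryReady[r.Member] {
			continue // the user marked this member cluster as not primary-ready
		}
		if !found || r.LSN > best.LSN {
			best, found = r, true
		}
	}
	return best.Member, found
}

func main() {
	replicas := []ReplicaState{
		{Member: "eastus", LSN: 120},
		{Member: "westus", LSN: 118},
		{Member: "centralus", LSN: 110},
	}
	// Simulate a regional failover away from "eastus".
	notReady := map[string]bool{"eastus": true}

	if member, ok := pickPrimary(replicas, notReady); ok {
		fmt.Println("new primary:", member)
	} else {
		fmt.Println("no eligible replica")
	}
}
```

In this sketch the controller would run `pickPrimary` once per DocumentDB instance and write the result into that instance's spec, leaving the CRP to propagate the change and the operators to perform the promotion, as the hunk describes.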