-
Notifications
You must be signed in to change notification settings - Fork 15
MGMT-22237: Implement the required changes for IP Configuration flow #730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MGMT-22237: Implement the required changes for IP Configuration flow #730
Conversation
|
@danmanor: This pull request references MGMT-22237 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughThis pull request extends Kubernetes Pod support in the etcd encoding layer and implements IP address migration for Pod status fields during cluster IP reconfiguration, including minion resource cleanup and expanded YAML file scanning for configuration updates. Changes
Sequence DiagramsequenceDiagram
participant Orchestrator as IP Rename<br/>Orchestrator
participant EtcdRename as etcd_rename<br/>Module
participant Etcd as In-Memory<br/>Etcd
Orchestrator->>EtcdRename: fix_pods_status(etcd_client,<br/>original_ip, new_ip)
Note over EtcdRename: List all Pod resources
EtcdRename->>Etcd: Read Pods
Note over EtcdRename: For each Pod (in parallel):<br/>- Check status IPs match original_ip<br/>- Update podIP, hostIP, podIPs, hostIPs
EtcdRename->>Etcd: Write updated Pod resources
Orchestrator->>EtcdRename: delete_minions_if_exist(etcd_client)
Note over EtcdRename: List minion keys
EtcdRename->>Etcd: Query for minions
alt Minions exist
EtcdRename->>Etcd: Delete minion entries
end
EtcdRename-->>Orchestrator: Complete
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: danmanor The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@danmanor: This pull request references MGMT-22237 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
f3c7183 to
8518d0a
Compare
|
@danmanor: This pull request references MGMT-22237 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/ocp_postprocess/ip_rename/filesystem_rename.rs (1)
11-17: Extending filesystem IP fixes to quorum-restore pod manifests is reasonableIncluding
**/restore-etcd-pod/quorum-restore-pod.yamlin the glob chain keeps the behavior consistent with the other etcd pod artifacts and ensures restore-related pod YAMLs get their IPs updated as well.src/ocp_postprocess/ip_rename.rs (1)
148-160: Sequencing pod status updates and minion cleanup after etcd member update makes senseCalling
fix_pods_statusafterfix_etcd_memberensures Pod status fields reflect the new IPs, anddelete_minions_if_existthen drops the old node objects so they can be regenerated. In the dual‑stack path this runs once per IP pair and remains idempotent (second and subsequent minion deletions see no keys), so the flow is consistent for both single‑ and dual‑stack SNO IP changes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/etcd_encoding.rs(4 hunks)src/ocp_postprocess/ip_rename.rs(1 hunks)src/ocp_postprocess/ip_rename/etcd_rename.rs(1 hunks)src/ocp_postprocess/ip_rename/filesystem_rename.rs(1 hunks)telco5g-konflux(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
src/ocp_postprocess/ip_rename.rs (1)
src/ocp_postprocess/ip_rename/etcd_rename.rs (3)
fix_pods_status(600-699)original_ip(569-569)delete_minions_if_exist(701-706)
src/ocp_postprocess/ip_rename/filesystem_rename.rs (2)
src/ocp_postprocess/cluster_domain_rename/filesystem_rename.rs (1)
file_utils(230-235)src/file_utils.rs (1)
globvec(26-42)
src/ocp_postprocess/ip_rename/etcd_rename.rs (1)
src/k8s_etcd.rs (3)
key(303-303)get_etcd_json(316-327)put_etcd_yaml(329-334)
🪛 GitHub Actions: Rust
src/ocp_postprocess/ip_rename/etcd_rename.rs
[error] 667-700: Rust formatting check failed. Run 'cargo fmt' to format code as required. Differences detected in the block around joining etcd client keys and subsequent processing.
[error] 1-1: Command failed with exit code 2.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Red Hat Konflux / recert-4-21-on-pull-request
🔇 Additional comments (3)
telco5g-konflux (1)
1-1: ---Verify the submodule commit contains the expected changes.
The submodule pointer has been updated to a new commit (
dd42e3c0b63a6f9231c0d117b9c915c0ace1e71b). Based on the PR objectives and AI summary, this commit should include Pod support in the etcd encoding, pod status IP address updates, minion cleanup logic, and filesystem YAML pattern expansions for the IP Configuration flow.Verify manually that the referenced commit aligns with these expectations and contains no unintended changes.
src/etcd_encoding.rs (1)
8-9: Pod support in etcd (de)serialization looks consistent with existing patternsImporting
Pod, definingPodWithMeta, and wiring the"Pod"arms intodecode/encodemirror how other core types (ConfigMap/Node/Secret) are handled, so Pod resources should now round‑trip correctly throughUnknownwithout affecting unknown kinds behavior.Also applies to: 61-61, 82-97, 111-128
src/ocp_postprocess/ip_rename/etcd_rename.rs (1)
600-706: Code verified; no outstanding issues remainThe
fix_pods_statusanddelete_minions_if_existfunctions are correctly implemented:
- Status IP mutation targets only known Pod fields (
podIP,hostIP,podIPs,hostIPs) with exact string matching, safely handling both string and object array entries.- Concurrent mutation via
join_allis consistent with the codebase pattern, writing back only on actual changes.delete_minions_if_existmirrors existing prefix-based deletion and is idempotent.Code formatting is clean with no trailing whitespace or indentation issues detected. All logic and structure are sound.
|
@danmanor: This pull request references MGMT-22237 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/ocp_postprocess/ip_rename/etcd_rename.rs (1)
695-700: Consider optimizing for dual-stack scenarios.The function correctly deletes all minion keys to allow node regeneration. However, in dual-stack mode, this function is called once per IP pair. The second call becomes a no-op since the first call already deleted the keys.
To optimize, consider calling
delete_minions_if_existonce after all IP pairs are processed infix_etcd_resources_dual_stack(in ip_rename.rs), rather than within eachfix_etcd_resources_for_ip_paircall. The current implementation is correct and safe, just not optimally efficient.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/etcd_encoding.rs(4 hunks)src/ocp_postprocess/ip_rename.rs(1 hunks)src/ocp_postprocess/ip_rename/etcd_rename.rs(1 hunks)src/ocp_postprocess/ip_rename/filesystem_rename.rs(1 hunks)telco5g-konflux(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- telco5g-konflux
- src/ocp_postprocess/ip_rename/filesystem_rename.rs
🧰 Additional context used
🧬 Code graph analysis (2)
src/ocp_postprocess/ip_rename/etcd_rename.rs (1)
src/k8s_etcd.rs (3)
key(303-303)get_etcd_json(316-327)put_etcd_yaml(329-334)
src/ocp_postprocess/ip_rename.rs (1)
src/ocp_postprocess/ip_rename/etcd_rename.rs (3)
fix_pods_status(600-693)original_ip(569-569)delete_minions_if_exist(695-700)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Red Hat Konflux / recert-4-21-on-pull-request
- GitHub Check: build
🔇 Additional comments (4)
src/etcd_encoding.rs (1)
8-8: LGTM! Pod support correctly added.The Pod type support follows the established pattern for other Kubernetes resources (ConfigMap, Node, Secret, etc.) and is properly integrated into both decode and encode functions.
Also applies to: 61-61, 93-93, 116-117
src/ocp_postprocess/ip_rename/etcd_rename.rs (2)
602-668: Well-structured IP replacement logic.The helper correctly handles all four Pod status IP fields (podIP, hostIP, podIPs, hostIPs). The defensive handling of both string and object formats in the array fields is good practice, though the Kubernetes API spec defines podIPs/hostIPs as arrays of objects with an "ip" field.
670-693: LGTM! Efficient parallel pod processing.The function correctly iterates over all pods, applies IP updates only when needed, and processes pods concurrently using
join_all. The conditional write-back optimization avoids unnecessary etcd writes.src/ocp_postprocess/ip_rename.rs (1)
152-158: LGTM! Logical placement in the IP rename flow.The two new steps are correctly positioned after all resource fixes and the etcd member update. This ensures pod statuses are updated and the old node is cleaned up for regeneration as the final steps of the IP configuration flow.
8518d0a to
78e3fff
Compare
… IP configuration is a new flow which will be introduced soon in LCA. The flow essentialy performs IP change in SNO clusters. To perform the change correctly, we need to change the IP in the pods statuses as well and delete the old node for regenration. IBU doesn't require this logic because it changes the hostname and in that case recert deletes the pods and node anyway.
78e3fff to
826b003
Compare
|
@danmanor: This pull request references MGMT-22237 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/ocp_postprocess/ip_rename/etcd_rename.rs (1)
695-700: LGTM! Minion deletion logic is correct.The function properly lists and deletes all minion keys. The naming (
delete_minions_if_exist) correctly signals that it's safe to call even if no minions are present.As with the pod status update, the past review comment "Also minions" questioning whether they're deleted elsewhere is addressed by the PR's context: the IP configuration flow requires deleting the old node for regeneration, unlike the IBU flow which handles it differently.
Note: The verification request in the first file's review will confirm these operations don't conflict with other code paths.
🧹 Nitpick comments (1)
src/ocp_postprocess/ip_rename.rs (1)
152-158: Consider consolidating pod and minion operations for dual-stack.In dual-stack configurations,
fix_etcd_resources_for_ip_pairis invoked once per IP (lines 197-202). This meansfix_pods_statuswill scan all pods twice (once for each IP replacement), anddelete_minions_if_existwill be called twice (though the second call is a no-op).For better performance in dual-stack scenarios, consider refactoring to:
- Collect all IP replacement pairs first
- Scan pods once and apply all IP replacements in a single pass
- Delete minions once after all IP updates
Example approach:
// In fix_etcd_resources_dual_stack, after the main loop: fix_pods_status_dual_stack(etcd_client, original_ips, new_ips).await?; delete_minions_if_exist(etcd_client).await?;However, the current implementation is functionally correct and the performance impact is likely minimal for SNO clusters.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/etcd_encoding.rs(4 hunks)src/ocp_postprocess/ip_rename.rs(1 hunks)src/ocp_postprocess/ip_rename/etcd_rename.rs(1 hunks)src/ocp_postprocess/ip_rename/filesystem_rename.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/ocp_postprocess/ip_rename/filesystem_rename.rs
🧰 Additional context used
🧬 Code graph analysis (2)
src/ocp_postprocess/ip_rename.rs (1)
src/ocp_postprocess/ip_rename/etcd_rename.rs (3)
fix_pods_status(600-693)original_ip(569-569)delete_minions_if_exist(695-700)
src/ocp_postprocess/ip_rename/etcd_rename.rs (1)
src/k8s_etcd.rs (3)
key(303-303)get_etcd_json(316-327)put_etcd_yaml(329-334)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Red Hat Konflux / recert-4-21-on-pull-request
- GitHub Check: build
🔇 Additional comments (3)
src/etcd_encoding.rs (1)
8-8: LGTM!Pod support is correctly added to the encoding/decoding layer following the established pattern for other Kubernetes resource types. This enables the IP rename flow to properly handle Pod resources stored in etcd.
Also applies to: 61-61, 93-93, 117-117
src/ocp_postprocess/ip_rename/etcd_rename.rs (1)
600-693: LGTM! Pod status IP replacement logic is sound.The implementation correctly:
- Handles multiple Pod status IP field formats (podIP, hostIP, podIPs as strings or objects, hostIPs)
- Only mutates fields matching the original IP (conditional replacement)
- Gracefully handles missing fields with
if let Somepatterns- Processes pods in parallel via
join_allfor performance- Only writes back to etcd when changes are detected
Regarding the past review comment "Don't we delete all pods?" - the PR description clarifies that the IBU flow deletes pods (when hostname changes), but this IP configuration flow does not, hence these status updates are necessary.
src/ocp_postprocess/ip_rename.rs (1)
152-158: Review comment verification reveals inaccuracy in the stated concern.Based on code analysis:
IP rename flow (lines 152–158 in
ip_rename.rs) explicitly calls:
delete_minions_if_exist()— deletes all minion objectsfix_pods_status()— updates pod status IPs (no deletion)Cluster domain rename flow (line 221 in
cluster_domain_rename.rs) calls:
delete_resources()— deletes controller revisions, pod network connectivity checks, configmaps, and API request counts (filtered by specific namespaces/names)Key finding: These operations target entirely different resources:
- IP flow deletes minions (legacy Kubernetes resource type)
- Cluster domain flow deletes controller revisions, checkpoints, and lock objects (NOT minions, NOT pods)
delete_resourcesis not called in the IP rename flow (verified by grep)Conclusion: The review comment's premise ("IP configuration flow does not [delete resources]") contradicts the actual code. The IP flow does delete minions, and there is no conflict because each flow properly handles its own required cleanup on different resource types. The stated concern is not substantiated.
|
/lgtm |
|
/override ci/prow/e2e-aws-ovn-single-node-recert-parallel e2e-aws-ovn-single-node-recert-serial |
|
@danmanor: /override requires failed status contexts, check run or a prowjob name to operate on.
Only the following failed contexts/checkruns were expected:
If you are trying to override a checkrun that has a space in it, you must put a double quote on the context. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/override ci/prow/e2e-aws-ovn-single-node-recert-parallel |
|
@danmanor: Overrode contexts on behalf of danmanor: ci/prow/e2e-aws-ovn-single-node-recert-parallel DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/override ci/prow/e2e-aws-ovn-single-node-recert-serial |
|
@danmanor: Overrode contexts on behalf of danmanor: ci/prow/e2e-aws-ovn-single-node-recert-serial DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
6beeff0
into
rh-ecosystem-edge:main
|
/cherry-pick release-4.20 |
|
@danmanor: new pull request created: #789 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@danmanor: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
IP configuration is a new flow which will be introduced soon in LCA. The flow essentially performs IP change in SNO clusters. To perform the change correctly, we need to change the IP in the pods statuses as well and delete the old node for regeneration. IBU doesn't require this logic because it changes the hostname and in that case recert deletes the pods and node anyway.
This PR also adds the path to the quorum-restore-pod so its ips will be changed as well
Summary by CodeRabbit
New Features
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.