-
Notifications
You must be signed in to change notification settings - Fork 52
OCPCLOUD-3355: e2e testing automation: Implement Status Syncing for MAPI/CAPI migration #387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
📝 WalkthroughWalkthroughAdds extensive e2e status-conversion tests for MAPI↔CAPI, YAML-based AWS providerStatus parsing and verification helpers, skip-wait MachineSet creation and readiness helpers, pervasive GinkgoHelper() insertions, a UniqueName utility, and extends the e2e Makefile timeout. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Test
participant MAPI_API as MAPI API Server
participant CAPI_API as CAPI API Server
participant Controllers as Sync Controllers
participant AWS as AWS providerStatus
Test->>MAPI_API: create MAPI MachineSet / Machine (maybe skipWait)
Test->>CAPI_API: create CAPI MachineSet / Machine (parallel)
MAPI_API->>Controllers: notify desired state / status changes
CAPI_API->>Controllers: notify desired state / status changes
Controllers->>AWS: reconcile provider resources, obtain providerStatus
AWS-->>Controllers: return AWSMachineProviderStatus
Controllers->>MAPI_API: update Machine.Status.ProviderStatus.Raw
Controllers->>CAPI_API: update MachineSet/Machine.Status and Conditions
Test->>MAPI_API: read ProviderStatus raw
Test->>CAPI_API: read Status / Conditions
Test->>Test: verify via YAML unmarshal and Gomega matchers
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
e2e/machine_migration_helpers.go (1)
346-354: UsingExpectinside a transform function defeatsEventuallyretry semantics.When
getAWSProviderStatusFromMachineis used as a transform function withinEventually, anyExpectfailure will cause an immediate test failure instead of allowingEventuallyto retry. IfProviderStatus.Rawis nil or YAML unmarshalling fails on an early attempt (before status is populated), the test fails prematurely.Consider returning an error-indicating value that the matcher can handle, or restructure to allow retries.
🔎 Proposed fix
// getAWSProviderStatusFromMachine extracts and unmarshals the AWSMachineProviderStatus from a MAPI Machine. func getAWSProviderStatusFromMachine(mapiMachine *mapiv1beta1.Machine) *mapiv1beta1.AWSMachineProviderStatus { - Expect(mapiMachine.Status.ProviderStatus.Raw).ToNot(BeNil()) + if mapiMachine.Status.ProviderStatus == nil || mapiMachine.Status.ProviderStatus.Raw == nil { + return nil + } providerStatus := &mapiv1beta1.AWSMachineProviderStatus{} - Expect(yaml.Unmarshal(mapiMachine.Status.ProviderStatus.Raw, providerStatus)).To(Succeed()) + if err := yaml.Unmarshal(mapiMachine.Status.ProviderStatus.Raw, providerStatus); err != nil { + return nil + } return providerStatus }Then update the matcher usage to handle nil:
verifyMAPIMachineProviderStatus(mapiMachine, SatisfyAll( Not(BeNil()), HaveField("InstanceState", HaveValue(Equal(...))), ), )
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
e2e/machine_migration_helpers.goe2e/status_conversion_test.go
🔇 Additional comments (12)
e2e/machine_migration_helpers.go (1)
22-22: LGTM!Import addition is appropriate for the new YAML unmarshalling functionality.
e2e/status_conversion_test.go (11)
1-21: LGTM!Imports are well-organized and appropriate for the comprehensive status conversion testing.
23-32: LGTM!Proper platform and feature gate guards ensure tests only run in appropriate environments.
34-57: LGTM!Well-structured test setup with proper cleanup via
DeferCleanup.
66-84: Consider potential stale reference issue.Line 75 compares against
mapiMachineSet.Status.ReadyReplicaswhich is captured when theEventuallyon line 68-71 completes. However, sincekomega.Object(mapiMachineSet)updates the object in place, this should work correctly. Just noting this pattern for awareness.
138-150: Good documentation.The comment clearly explains why only the Paused condition is verified for paused MachineSets.
461-491: Good practice: Pending tests with bug references.Using
PItwith explicit bug URLs (OCPBUGS-63340) is the right approach for known issues that are being tracked.
557-568: Address comparison assumes ordering.The index-based address comparison (
cm.Status.Addresses[i]vsmapiMachine.Status.Addresses[i]) will fail if addresses are returned in different orders between MAPI and CAPI. If ordering isn't guaranteed by the conversion logic, consider usingContainElementsor sorting before comparison.This is acceptable if the conversion preserves order, but worth verifying.
647-667: LGTM!Good coverage of providerStatus conversion including instanceState verification.
765-791: Pending test appropriately marked with bug reference.The
PItwith OCPBUGS-70136 reference correctly marks this as pending until the underlying bug is fixed.
777-790: Note: This test uses the helper function with theExpectissue.When this pending test is enabled, be aware that
verifyMAPIMachineProviderStatususesgetAWSProviderStatusFromMachinewhich hasExpectassertions inside a transform function. Applying the fix suggested inmachine_migration_helpers.gowill be needed.
805-816: Good use ofPItfor pending tests with bug tracking.Multiple tests appropriately marked as pending with OCPBUGS-63183 reference for the same-name scenario.
Also applies to: 832-842, 844-869
|
/test e2e-aws-capi-techpreview |
|
@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
e2e/status_conversion_test.go (2)
557-570: Consider order-independent address comparison.The current comparison assumes addresses are in the same order in both MAPI and CAPI. If address ordering is not guaranteed to be preserved during conversion, this could cause flaky test failures.
🔎 Suggested order-independent comparison
Eventually(komega.Object(capiMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should( WithTransform(func(cm *clusterv1.Machine) bool { if len(cm.Status.Addresses) != len(mapiMachine.Status.Addresses) { return false } - for i, mapiAddr := range mapiMachine.Status.Addresses { - capiAddr := cm.Status.Addresses[i] - if string(capiAddr.Type) != string(mapiAddr.Type) || capiAddr.Address != mapiAddr.Address { - return false + for _, mapiAddr := range mapiMachine.Status.Addresses { + found := false + for _, capiAddr := range cm.Status.Addresses { + if string(capiAddr.Type) == string(mapiAddr.Type) && capiAddr.Address == mapiAddr.Address { + found = true + break + } + } + if !found { + return false } } return true }, BeTrue()),
736-749: Consider order-independent address comparison here as well.Same concern as the MAPI→CAPI address comparison - the index-based comparison assumes preserved ordering.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
e2e/machine_migration_helpers.goe2e/status_conversion_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- e2e/machine_migration_helpers.go
🧰 Additional context used
🧬 Code graph analysis (1)
e2e/status_conversion_test.go (5)
e2e/framework/util.go (1)
IsMachineAPIMigrationEnabled(41-76)pkg/conversion/mapi2capi/interface.go (2)
MachineSet(29-31)Machine(24-26)e2e/framework/framework.go (7)
WaitLong(25-25)RetryLong(19-19)WaitMedium(24-24)RetryMedium(18-18)WaitShort(23-23)RetryShort(17-17)CAPINamespace(14-14)e2e/framework/machine.go (2)
GetMachine(75-86)GetAWSMachine(61-72)e2e/framework/machineset.go (3)
WaitForMachineSet(137-173)ScaleCAPIMachineSet(224-235)GetNewestMachineFromMachineSet(204-221)
🔇 Additional comments (8)
e2e/status_conversion_test.go (8)
1-21: LGTM!Imports are well-organized and appropriate for the e2e test suite. The use of dot imports for Ginkgo/Gomega is a common pattern in Kubernetes e2e tests.
23-32: LGTM!Good use of skip conditions to ensure tests only run on AWS platforms with the MachineAPIMigration feature enabled. The
Ordereddecorator ensures deterministic test execution.
66-84: LGTM!The pattern of waiting for MAPI status to stabilize and then verifying CAPI synchronization is correct. The
komega.Object()matcher refreshes the object on each poll, ensuring fresh values are used.
205-230: LGTM!The error status conversion test correctly handles the case where either
ErrorReasonorErrorMessage(or both) may be set. The conditional assertions ensure proper matching without failing if only one field is populated.
461-490: LGTM on pending tests with bug tracking.Good practice to mark tests as pending (
PIt) with inline bug references (OCPBUGS-63340). This maintains test coverage visibility while avoiding false negatives until the underlying issues are resolved.
794-803: LGTM!The cross-authority scenario (CAPI Machine created first, then MAPI with ClusterAPI authority) is properly set up. Individual test cases appropriately wait for the required status fields to be populated.
34-246: Well-structured test suite for MAPI→CAPI MachineSet conversions.The test organization is clean with:
- Proper resource lifecycle management (BeforeAll setup, DeferCleanup teardown)
- Clear separation between normal operation and error condition tests
- Comprehensive coverage of status fields, conditions, and selector conversion
304-309: Access toDeprecated.V1Beta1should use defensive patterns or explicit nil checks.Direct access to
capiMachineSet.Status.Deprecated.V1Beta1.FullyLabeledReplicasat lines 306 and 397 may panic if the pointer is not initialized. UseHaveValue()matcher (as line 81 does forReadyReplicas) or add a nil guard in test setup to ensure the field is initialized before the test assertions.
|
/test unit |
|
@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
Makefile (1)
60-60: Timeout increase appears reasonable given the new test coverage.The 50% increase in e2e test timeout (120m → 180m) aligns with the comprehensive status conversion test suite being added. However, consider verifying that the full runtime approaches this limit, as unnecessarily long timeouts can delay failure feedback in CI.
Optional: If the new tests are the primary driver of increased runtime, consider whether any tests could be parallelized or optimized to reduce overall execution time.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (1)
Makefile
|
/test e2e-aws-capi-techpreview |
|
@sunzhaohua2 is it ready for review? |
yes |
| }) | ||
|
|
||
| It("should convert status.availableReplicas from MAPI to CAPI status and status.Deprecated.V1Beta1", func() { | ||
| By("Verifying CAPI MachineSet status.availableReplicas is synchronized") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious about this (thinking out loud) - we're waiting for the field to be populated, and once it is we are stating it's synchronised. This works because we start with no CAPI machine set, right, so population == sync?
Should we be looking at SynchronizedGeneration too? Given that the By( says we're waiting for synchronisation (and that's a field in status)? We likely only need to do it once, in the BeforeAll? and it may speed up the tests a little?
| }) | ||
|
|
||
| // bug https://issues.redhat.com/browse/OCPBUGS-63183 | ||
| It("should convert CAPI Machine phase to MAPI Machine phase", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a PIt() ? Or has the bug been resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:) I removed the comment, we hit the bug only when the capi machine exists
| .PHONY: e2e | ||
| e2e: ## Run e2e tests against active kubeconfig | ||
| ./hack/test.sh "./e2e/..." 120m | ||
| ./hack/test.sh "./e2e/..." 180m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:'(
Should we consider splitting the e2es / sharing early?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@theobarberbany I raised a pr here to add shard_count openshift/release#73182
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@theobarberbany I'd like to keep it as is for now to align with the job's time setting, I'll add the shard_count parameter to the job after the OTE pr is merged.
|
@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test unit |
|
/test e2e-aws-capi-techpreview |
|
@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test e2e-aws-capi-techpreview |
|
@sunzhaohua2: This pull request references OCPCLOUD-3355 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@sunzhaohua2: This pull request references OCPCLOUD-3355 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test e2e-aws-capi-techpreview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
e2e/machineset_migration_helpers.go (1)
98-103:⚠️ Potential issue | 🟡 MinorFail fast on unknown MachineAuthority values.
Without a default case, unexpected values silently skip readiness waits and can mask test issues.
🛠️ Suggested fix
@@ switch machineAuthority { case mapiv1beta1.MachineAuthorityMachineAPI: mapiframework.WaitForMachineSet(ctx, cl, machineSetName) case mapiv1beta1.MachineAuthorityClusterAPI: capiframework.WaitForMachineSet(cl, machineSetName, capiframework.CAPINamespace) +default: + Fail(fmt.Sprintf("unknown authoritativeAPI type: %v", machineAuthority)) } @@ switch machineAuthority { case mapiv1beta1.MachineAuthorityMachineAPI: mapiframework.WaitForMachineSet(ctx, cl, machineSetName) case mapiv1beta1.MachineAuthorityClusterAPI: capiframework.WaitForMachineSet(cl, machineSetName, capiframework.CAPINamespace) +default: + Fail(fmt.Sprintf("unknown authoritativeAPI type: %v", machineAuthority)) }Also applies to: 121-126
🤖 Fix all issues with AI agents
In `@e2e/status_conversion_test.go`:
- Around line 110-146: The assertions wrap non-pointer int32 replica fields with
HaveValue causing failures; update the checks against mapiMSAuthMAPI and
capiMSMirrorMAPI so that Status.ReadyReplicas, Status.AvailableReplicas and
Status.Deprecated.V1Beta1.{ReadyReplicas,AvailableReplicas} are matched directly
(e.g., BeNumerically(">", 0) or Equal(...)) instead of HaveValue(Equal(...));
keep the existing HaveValue usage for CAPI v1beta2 pointer fields (the
Status.ReadyReplicas checks that currently use HaveValue should remain).
|
@sunzhaohua2: This pull request references OCPCLOUD-3355 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@sunzhaohua2: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/test e2e-aws-capi-techpreview |
|
@damdo I updated this pr to create machineset/machine in parallel, checked log, Status Conversion Tests time was reduced about 24 minutes. In the new job 2018512646269046784 Status Conversion Tests took about 32m. In the previous job 2016986669105811456 Status Conversion Tests took about 56 m. I also raised a pr to optimize migration e2e to reduce running time #451 ci/prow/e2e-aws-capi-techpreview e2e duration reduced from ~4h10m to 3h15m, this also covered this pr's status testing. |
|
/assign @damdo @theobarberbany @nrb |
|
The changes for speeding up the test times look good to me. /lgtm |
|
Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes. Use command to trigger second stage. |
Summary by CodeRabbit