-
Notifications
You must be signed in to change notification settings - Fork 52
OCPCLOUD-3354: Optimize migration e2e to reduce running time #451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@sunzhaohua2: This pull request references OCPCLOUD-3354 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/hold |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
📝 WalkthroughWalkthroughCentralizes e2e test fixtures with shared BeforeAll/AfterAll, adds UniqueName utility, extends helpers for YAML providerStatus and Ginkgo integration, adds skip-wait MachineSet helpers, introduces a large status-conversion e2e suite, and increases e2e timeout from 120m to 180m. Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
/test e2e-aws-capi-techpreview |
|
@sunzhaohua2: This pull request references OCPCLOUD-3354 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
e2e/migration_common.go (1)
3-24:⚠️ Potential issue | 🟡 MinorUniqueName isn’t guaranteed unique under heavy parallelism.
UnixNano can collide when multiple goroutines call within the same nanosecond, which can create occasional “AlreadyExists” flakes. Consider adding a per-process counter (or random suffix) to make collisions practically impossible.💡 Suggested fix (monotonic counter suffix)
import ( "fmt" + "sync/atomic" "time" ) +var uniqueCounter atomic.Uint64 + // UniqueName generates a unique name with the given prefix using nanosecond timestamp. // This is useful for creating resources in parallel tests without naming conflicts. func UniqueName(prefix string) string { - return fmt.Sprintf("%s-%d", prefix, time.Now().UnixNano()) + return fmt.Sprintf("%s-%d-%d", prefix, time.Now().UnixNano(), uniqueCounter.Add(1)) }
🤖 Fix all issues with AI agents
In `@e2e/machine_migration_helpers.go`:
- Around line 356-363: The function getAWSProviderStatusFromMachine dereferences
mapiMachine.Status.ProviderStatus.Raw without checking for nil, causing a panic
when ProviderStatus is not yet populated; modify getAWSProviderStatusFromMachine
to first check if mapiMachine.Status.ProviderStatus is nil (and/or if .Raw is
nil/empty) and handle that case by returning nil or a clear test failure using
Expect with a helpful message, before attempting yaml.Unmarshal into
mapiv1beta1.AWSMachineProviderStatus.
In `@e2e/machineset_migration_capi_authoritative_test.go`:
- Around line 100-117: The test is fragile because it compares slices by fixed
index; replace index-based equality checks between mapiMachines and capiMachines
(variables mapiMachines, capiMachines, mapiMachinesDelete, capiMachinesDelete)
with membership/set-based assertions: extract names from the CAPI slices and
assert each MAPI machine's name is present using
Expect(...).To(ContainElement(...)) or compare as sets/maps, and for
firstMAPIMachine assign the MAPI machine found by matching a name from
capiMachines (not mapiMachines[0]) so the selected machine is stable
irrespective of slice ordering.
In `@e2e/machineset_migration_mapi_authoritative_test.go`:
- Around line 106-114: The test currently assumes ordering by comparing
capiMachines[0].Name to mapiMachines[0].Name (using
mapiframework.GetMachinesFromMachineSet and
capiframework.GetMachinesFromMachineSet), which can flake; instead, assert
membership by checking that every MAPI machine name appears in the CAPI machines
(or vice‑versa) using a set/contains check on names, and then pick a matching
MAPI machine (e.g., the one whose name exists in the CAPI name set) to assign to
firstMAPIMachine; update the Expect assertions to use membership-based checks
rather than index-based equality.
In `@e2e/status_conversion_test.go`:
- Around line 629-648: The test reads
mapiMachineAuthMAPI.Status.ProviderStatus.Raw without guarding for nil which can
panic; before calling yaml.Unmarshal into mapiProviderStatus, add a nil/empty
check or an Eventually/Expect that waits for
mapiMachineAuthMAPI.Status.ProviderStatus (and/or ProviderStatus.Raw) to be
non-nil/non-empty (referencing mapiMachineAuthMAPI.Status.ProviderStatus.Raw and
the yaml.Unmarshal call into mapiProviderStatus) and only then perform the
Unmarshal and subsequent assertions.
| It("should convert MAPI Machine providerStatus to CAPI AWSMachine status", func() { | ||
| By("Getting AWSMachine for the CAPI Machine") | ||
| awsMachine := capiframework.GetAWSMachine(cl, capiMachineMirrorMAPI.Name, capiframework.CAPINamespace) | ||
| Expect(awsMachine).NotTo(BeNil(), "AWSMachine should exist") | ||
|
|
||
| By("Verifying AWSMachine status.ready is set based on MAPI Machine providerStatus.instanceState") | ||
| Eventually(komega.Object(awsMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should( | ||
| HaveField("Status.Ready", BeTrue()), | ||
| "Should have AWSMachine status.ready be true when instance is running", | ||
| ) | ||
|
|
||
| By("Verifying AWSMachine status.instanceState matches MAPI Machine providerStatus.instanceState") | ||
| var mapiProviderStatus mapiv1beta1.AWSMachineProviderStatus | ||
| err := yaml.Unmarshal(mapiMachineAuthMAPI.Status.ProviderStatus.Raw, &mapiProviderStatus) | ||
| Expect(err).ToNot(HaveOccurred()) | ||
|
|
||
| Eventually(komega.Object(awsMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should( | ||
| HaveField("Status.InstanceState", HaveValue(Equal(awsv1.InstanceState(ptr.Deref(mapiProviderStatus.InstanceState, ""))))), | ||
| "Should have AWSMachine status.instanceState match MAPI providerStatus.instanceState", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a nil guard before reading ProviderStatus.Raw.
If ProviderStatus hasn’t been populated yet, dereferencing .Raw will panic and hard-fail the test.
🛡️ Suggested fix
- var mapiProviderStatus mapiv1beta1.AWSMachineProviderStatus
- err := yaml.Unmarshal(mapiMachineAuthMAPI.Status.ProviderStatus.Raw, &mapiProviderStatus)
+ Expect(mapiMachineAuthMAPI.Status.ProviderStatus).NotTo(BeNil())
+ var mapiProviderStatus mapiv1beta1.AWSMachineProviderStatus
+ err := yaml.Unmarshal(mapiMachineAuthMAPI.Status.ProviderStatus.Raw, &mapiProviderStatus)
Expect(err).ToNot(HaveOccurred())📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| It("should convert MAPI Machine providerStatus to CAPI AWSMachine status", func() { | |
| By("Getting AWSMachine for the CAPI Machine") | |
| awsMachine := capiframework.GetAWSMachine(cl, capiMachineMirrorMAPI.Name, capiframework.CAPINamespace) | |
| Expect(awsMachine).NotTo(BeNil(), "AWSMachine should exist") | |
| By("Verifying AWSMachine status.ready is set based on MAPI Machine providerStatus.instanceState") | |
| Eventually(komega.Object(awsMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should( | |
| HaveField("Status.Ready", BeTrue()), | |
| "Should have AWSMachine status.ready be true when instance is running", | |
| ) | |
| By("Verifying AWSMachine status.instanceState matches MAPI Machine providerStatus.instanceState") | |
| var mapiProviderStatus mapiv1beta1.AWSMachineProviderStatus | |
| err := yaml.Unmarshal(mapiMachineAuthMAPI.Status.ProviderStatus.Raw, &mapiProviderStatus) | |
| Expect(err).ToNot(HaveOccurred()) | |
| Eventually(komega.Object(awsMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should( | |
| HaveField("Status.InstanceState", HaveValue(Equal(awsv1.InstanceState(ptr.Deref(mapiProviderStatus.InstanceState, ""))))), | |
| "Should have AWSMachine status.instanceState match MAPI providerStatus.instanceState", | |
| ) | |
| It("should convert MAPI Machine providerStatus to CAPI AWSMachine status", func() { | |
| By("Getting AWSMachine for the CAPI Machine") | |
| awsMachine := capiframework.GetAWSMachine(cl, capiMachineMirrorMAPI.Name, capiframework.CAPINamespace) | |
| Expect(awsMachine).NotTo(BeNil(), "AWSMachine should exist") | |
| By("Verifying AWSMachine status.ready is set based on MAPI Machine providerStatus.instanceState") | |
| Eventually(komega.Object(awsMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should( | |
| HaveField("Status.Ready", BeTrue()), | |
| "Should have AWSMachine status.ready be true when instance is running", | |
| ) | |
| By("Verifying AWSMachine status.instanceState matches MAPI Machine providerStatus.instanceState") | |
| Expect(mapiMachineAuthMAPI.Status.ProviderStatus).NotTo(BeNil()) | |
| var mapiProviderStatus mapiv1beta1.AWSMachineProviderStatus | |
| err := yaml.Unmarshal(mapiMachineAuthMAPI.Status.ProviderStatus.Raw, &mapiProviderStatus) | |
| Expect(err).ToNot(HaveOccurred()) | |
| Eventually(komega.Object(awsMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should( | |
| HaveField("Status.InstanceState", HaveValue(Equal(awsv1.InstanceState(ptr.Deref(mapiProviderStatus.InstanceState, ""))))), | |
| "Should have AWSMachine status.instanceState match MAPI providerStatus.instanceState", | |
| ) |
🤖 Prompt for AI Agents
In `@e2e/status_conversion_test.go` around lines 629 - 648, The test reads
mapiMachineAuthMAPI.Status.ProviderStatus.Raw without guarding for nil which can
panic; before calling yaml.Unmarshal into mapiProviderStatus, add a nil/empty
check or an Eventually/Expect that waits for
mapiMachineAuthMAPI.Status.ProviderStatus (and/or ProviderStatus.Raw) to be
non-nil/non-empty (referencing mapiMachineAuthMAPI.Status.ProviderStatus.Raw and
the yaml.Unmarshal call into mapiProviderStatus) and only then perform the
Unmarshal and subsequent assertions.
4d785aa to
20ae306
Compare
|
/test e2e-aws-capi-techpreview |
|
@sunzhaohua2: This pull request references OCPCLOUD-3354 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
20ae306 to
bfdcece
Compare
|
/test e2e-aws-capi-techpreview |
|
@sunzhaohua2: This pull request references OCPCLOUD-3354 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
bfdcece to
58501da
Compare
|
/test e2e-aws-capi-techpreview |
|
@sunzhaohua2: This pull request references OCPCLOUD-3354 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
ci/prow/e2e-aws-capi-techpreview e2e duration changed from ~4h10m to 3h15m. |
|
@sunzhaohua2: This pull request references OCPCLOUD-3354 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
damdo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/assign @RadekManak @nrb
|
Depends on #387 |
|
@sunzhaohua2: This pull request references OCPCLOUD-3354 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
|
||
| // getAWSProviderStatusFromMachine extracts and unmarshals the AWSMachineProviderStatus from a MAPI Machine. | ||
| func getAWSProviderStatusFromMachine(mapiMachine *mapiv1beta1.Machine) *mapiv1beta1.AWSMachineProviderStatus { | ||
| Expect(mapiMachine.Status.ProviderStatus.Raw).ToNot(BeNil()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could call GinkgoHelper() here as well or any specific reason not using it here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah got it , Nolan already did , thanks Dam
| .PHONY: e2e | ||
| e2e: ## Run e2e tests against active kubeconfig | ||
| ./hack/test.sh "./e2e/..." 120m | ||
| ./hack/test.sh "./e2e/..." 180m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for CI we may want to increase this bit more ? May be we can conclude after all pre-submits are passed like 20% more time than they took ?
|
/lgtm |
|
Scheduling tests matching the |
|
@sunzhaohua2: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Stacked on top of: #387
--
Try to create all resources together (parallel creation, saving time), and in each Describe block deletes its own resources immediately after testing (saving resources and costs).
ci/prow/e2e-aws-capi-techpreview e2e duration changed from ~4h10m to 3h15m.
Summary by CodeRabbit
Tests
New Features
Chores