@RadekManak
Contributor

@RadekManak RadekManak commented Oct 3, 2025

Why:

Adds .status.synchronizedAPI field to Machine and MachineSet resources to enable reliable migration cancellation when migrations get stuck at status.authoritativeAPI: Migrating. Without this field, the system cannot determine which API was the migration source when users revert spec.authoritativeAPI, preventing proper rollback to the last known good state. Implements OCPCLOUD-2998.

What:

  • Adds .status.synchronizedAPI field (values: "" | MachineAPI | ClusterAPI) to track the last successfully synchronized API
  • Implements handleMigrationStatusInitialization() in migration controllers to bootstrap empty status fields with proper inference logic
  • Adds IsMigrationCancellationRequested() detection when spec.authoritativeAPI matches status.synchronizedAPI while status.authoritativeAPI == Migrating
  • Updates ApplyMigrationStatus() helpers to atomically set both authoritativeAPI and synchronizedAPI during state transitions
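
The cancellation check described in the list above can be sketched as a pure predicate. This is a minimal illustration, not the code from this PR: the local types, the constant names, and the function name cancellationRequested are hypothetical stand-ins for the vendored API types, and the string comparison assumes both fields share the MachineAPI/ClusterAPI spellings listed above.

```go
package main

import "fmt"

// MachineAuthority and SynchronizedAPI are local stand-ins (assumptions)
// for the API types vendored from openshift/api.
type MachineAuthority string
type SynchronizedAPI string

const (
	AuthorityMachineAPI MachineAuthority = "MachineAPI"
	AuthorityClusterAPI MachineAuthority = "ClusterAPI"
	AuthorityMigrating  MachineAuthority = "Migrating"

	SyncedMachineAPI SynchronizedAPI = "MachineAPI"
	SyncedClusterAPI SynchronizedAPI = "ClusterAPI"
)

// cancellationRequested reports whether the spec has been reverted to the
// last synchronized API while the status is still Migrating.
// An empty synchronizedAPI never counts as a cancellation request.
func cancellationRequested(spec, status MachineAuthority, synced SynchronizedAPI) bool {
	if status != AuthorityMigrating || synced == "" {
		return false
	}
	return string(spec) == string(synced)
}

func main() {
	// Stuck migration, spec reverted to the last good state.
	fmt.Println(cancellationRequested(AuthorityMachineAPI, AuthorityMigrating, SyncedMachineAPI)) // true
	// Migration still wanted: spec points at the target API.
	fmt.Println(cancellationRequested(AuthorityClusterAPI, AuthorityMigrating, SyncedMachineAPI)) // false
	// No migration in progress.
	fmt.Println(cancellationRequested(AuthorityMachineAPI, AuthorityMachineAPI, SyncedMachineAPI)) // false
}
```

Keeping the check free of client calls makes it easy to cover in table-driven unit tests.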

How can it be used:

Administrators can cancel stuck migrations by reverting spec.authoritativeAPI back to the previously synchronized state:

# Migration stuck in progress
status:
  authoritativeAPI: Migrating
  synchronizedAPI: MachineAPI  # Last good state

# Cancel by reverting spec
spec:
  authoritativeAPI: MachineAPI  # Matches synchronizedAPI

# System detects cancellation and rolls back
status:
  authoritativeAPI: MachineAPI
  synchronizedAPI: MachineAPI

The migration controller detects this pattern and transitions back to the synchronized state without requiring manual intervention.

How did you test it:

Unit tests cover status initialization scenarios (both fields empty, only one empty, mid-migration inference), migration cancellation detection logic, and rollback flows.

Adds e2e tests to verify field behavior during migrations.

Notes for the reviewer:

Requires companion PR openshift/machine-api-operator#1442 for API definition vendoring. This PR description was generated with AI assistance.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added API synchronization state tracking to monitor which authority (MachineAPI or ClusterAPI) a resource is synchronized with during migration operations.
    • Implemented migration cancellation logic to automatically roll back an in-progress migration when spec.authoritativeAPI is reverted to the last synchronized API, improving reliability.
  • Tests

    • Expanded end-to-end test coverage to verify API synchronization state across machine and machineset migration scenarios.
    • Enhanced unit tests to validate synchronization status during authority transitions.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label Oct 3, 2025
@openshift-ci
Contributor

openshift-ci bot commented Oct 3, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-merge-robot openshift-merge-robot added the needs-rebase label Oct 3, 2025
@coderabbitai

coderabbitai bot commented Oct 3, 2025

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This pull request introduces SynchronizedAPI field tracking and migration cancellation logic across the machine migration and sync controllers. Changes include propagating synchronization state through MAPI and CAPI resources, detecting cancellation scenarios to roll back migrations, adding helper methods for status updates, and extending E2E tests to assert the SynchronizedAPI value across migration workflows.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| E2E Machine Migration Tests: e2e/machine_migration_capi_authoritative_test.go, e2e/machine_migration_mapi_authoritative_test.go | Adds multiple verifyMachineSynchronizedAPI calls to validate synchronization state during machine migration round-trip scenarios across ClusterAPI and MachineAPI authorities. |
| E2E MachineSet Migration Tests: e2e/machineset_migration_capi_authoritative_test.go, e2e/machineset_migration_mapi_authoritative_test.go | Adds verifyMachineSetSynchronizedAPI assertions to verify MachineSet synchronization state during authoritative API transitions. |
| E2E Test Helpers: e2e/machine_migration_helpers.go, e2e/machineset_migration_helpers.go | Introduces verifyMachineSynchronizedAPI and verifyMachineSetSynchronizedAPI helpers; refactors provider spec verification with WithTransform in machineset helpers. |
| Machine Migration Controller Core Logic: pkg/controllers/machinemigration/machine_migration_controller.go | Adds migration cancellation detection, error sentinels (errInvalidSynchronizedAPI, errInfraObjectAssertion), new status patch wrappers (applyMigrationStatusWithPatch, applyMigrationStatusAndResetSyncStatusWithPatch), and updates isSynchronized to depend on Status.SynchronizedAPI; includes unpausing/pausing helpers for Cluster API resources. |
| Machine Migration Controller Tests: pkg/controllers/machinemigration/machine_migration_controller_test.go | Extends test builders with WithSynchronizedAPIStatus; propagates and verifies SynchronizedAPI alongside AuthoritativeAPI across numerous migration scenarios; reworks expectation blocks using cmp.Diff for comprehensive state comparison. |
| MachineSet Migration Controller Core Logic: pkg/controllers/machinesetmigration/machineset_migration_controller.go | Introduces migration cancellation path, replaces status patch methods with applyMigrationStatusWithPatch and applyMigrationStatusAndResetSyncStatusWithPatch, updates isSynchronized to use Status.SynchronizedAPI, and adds new authority-aware helper methods. |
| MachineSet Migration Controller Tests: pkg/controllers/machinesetmigration/machineset_migration_controller_test.go | Expands test descriptions to include SynchronizedAPI verification; adds mirror CAPI machine set scenarios with paused annotations; reworks reconciliation paths to validate SynchronizedAPI state transitions during migration flows. |
| Machine/MachineSet Sync Controllers: pkg/controllers/machinesync/machine_sync_controller.go, pkg/controllers/machinesetsync/machineset_sync_controller.go | Propagates SynchronizedAPI from existing MAPI objects to converted objects; updates applySynchronizedConditionWithPatch to pass SynchronizedAPI to status configuration. |
| Sync Controller Tests: pkg/controllers/machinesync/machine_sync_controller_test.go, pkg/controllers/machinesetsync/machineset_sync_controller_test.go | Adds test assertions verifying SynchronizedAPI is set to MachineAPISynchronized or ClusterAPISynchronized across varying condition statuses and authority configurations; introduces nested Contexts for AuthoritativeAPI scenarios. |
| Sync Common Utilities: pkg/controllers/synccommon/applyconfiguration.go, pkg/controllers/synccommon/migratestatus.go, pkg/controllers/synccommon/syncstatus.go | Adds WithSynchronizedAPI method to syncStatusApplyConfiguration; renames ApplyAuthoritativeAPI/ApplyAuthoritativeAPIAndResetSyncStatus functions; introduces IsMigrationCancellationRequested and AuthoritativeAPIToSynchronizedAPI mapping functions; adds MigrationDirection helper and SynchronizedAPI↔MachineAuthority bidirectional mapping. |
| Build and Dependencies: Makefile, go.mod, e2e/go.mod | Updates github.com/openshift/api and github.com/openshift/client-go dependencies; modifies Makefile bin/% target prerequisite handling from order-only to mixed form with FORCE as normal prerequisite. |
| Fuzzing Configuration: pkg/conversion/test/fuzz/fuzz.go | Marks SynchronizedAPI as intentionally empty during fuzzing in MAPIMachineFuzzerFuncs and MAPIMachineSetFuzzerFuncs with comments indicating no 1:1 CAPI mapping. |

Sequence Diagram

sequenceDiagram
    participant Reconciler as Migration<br/>Reconciler
    participant StatusAPI as Status<br/>Management
    participant OldAuth as Old Authority<br/>(MAPI/CAPI)
    participant NewAuth as New Authority<br/>(CAPI/MAPI)

    Reconciler->>Reconciler: Check if Spec.AuthoritativeAPI<br/>== Status.SynchronizedAPI while<br/>Status.AuthoritativeAPI == Migrating
    alt Authorities Aligned
        Reconciler->>StatusAPI: Detect Migration<br/>Cancellation
        Reconciler->>OldAuth: Unpause Resources
        Reconciler->>StatusAPI: Reset SynchronizedAPI via<br/>applyMigrationStatusAndResetSyncStatusWithPatch
        Reconciler->>Reconciler: Log Cancellation
    else Migration In Progress
        Reconciler->>Reconciler: Update Status.AuthoritativeAPI<br/>via applyMigrationStatusWithPatch
        Reconciler->>OldAuth: Request Pause
        Reconciler->>NewAuth: Check Unpaused State
        Reconciler->>Reconciler: Check isSynchronized<br/>via Status.SynchronizedAPI
        alt Synchronized
            Reconciler->>NewAuth: Unpause Resources
            Reconciler->>StatusAPI: Finalize Migration
        else Not Synchronized
            Reconciler->>Reconciler: Wait for Sync
        end
    end

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes


🐰 Migrations now dance with synchronized grace,
Cancellations caught mid-pace!
APIs track their state with care,
From MAPI to CAPI through the air.
Status fields now fully awake,
Changes propagate for migration's sake! 🌟

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title 'OCPCLOUD-2998: implement synchronizedAPI' clearly and specifically describes the main change: implementing a new synchronizedAPI feature, with a reference to the tracking ticket. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 84.21% which is sufficient. The required threshold is 80.00%. |

No actionable comments were generated in the recent review. 🎉

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-merge-robot openshift-merge-robot removed the needs-rebase label Oct 6, 2025
@damdo damdo changed the title from "Draft: implement synchronizedAPI" to "OCPCLOUD-2998: Draft: implement synchronizedAPI" Oct 6, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label Oct 6, 2025
@openshift-ci-robot

openshift-ci-robot commented Oct 6, 2025

@RadekManak: This pull request references OCPCLOUD-2998 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot

openshift-ci-robot commented Dec 4, 2025

@RadekManak: This pull request references OCPCLOUD-2998 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-robot openshift-merge-robot added the needs-rebase label Dec 4, 2025
@openshift-merge-robot openshift-merge-robot removed the needs-rebase label Dec 5, 2025
@openshift-ci-robot

openshift-ci-robot commented Dec 5, 2025

@RadekManak: This pull request references OCPCLOUD-2998 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@RadekManak RadekManak changed the title from "OCPCLOUD-2998: Draft: implement synchronizedAPI" to "OCPCLOUD-2998: implement synchronizedAPI" Dec 5, 2025
@RadekManak RadekManak marked this pull request as ready for review December 5, 2025 15:49
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label Dec 5, 2025
@openshift-ci openshift-ci bot requested review from damdo and mdbooth December 5, 2025 15:49
@openshift-ci-robot

openshift-ci-robot commented Dec 5, 2025

@RadekManak: This pull request references OCPCLOUD-2998 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
e2e/machine_migration_mapi_authoritative_test.go (1)

207-287: Excellent rollback test coverage!

The new "Machine Migration Rollback Tests" comprehensively test the cancellation workflow:

  1. Initial state verification
  2. Rollback from Migrating state back to MachineAPI
  3. Successful migration after a previous rollback
  4. Cleanup verification

One structural note: This Describe block is nested inside the "Machine Migration Round Trip Tests" Describe (line 136). While Ginkgo supports this, it may be cleaner to place this as a sibling Describe block rather than a child, for better test organization. However, this is a minor style preference and doesn't affect test execution.

Consider moving this Describe block to be a sibling of "Machine Migration Round Trip Tests" rather than nested inside it:

-	var _ = Describe("Machine Migration Round Trip Tests", Ordered, func() {
-		// ... existing round trip tests ...
-
-		var _ = Describe("Machine Migration Rollback Tests", Ordered, func() {
+	var _ = Describe("Machine Migration Round Trip Tests", Ordered, func() {
+		// ... existing round trip tests ...
+	})
+
+	var _ = Describe("Machine Migration Rollback Tests", Ordered, func() {
pkg/controllers/synccommon/migratestatus.go (1)

129-142: Consider handling unknown authority values explicitly.

The function correctly maps MachineAPI and ClusterAPI to their synchronized counterparts and returns empty string for Migrating. However, the default case silently returns empty string for any unknown values.

Consider whether logging a warning for unknown values would aid debugging, or if the empty string fallback is intentional for forward compatibility.

 	case mapiv1beta1.MachineAuthorityMigrating:
 		return ""
 	default:
+		// Unknown authority values return empty string.
+		// This provides forward compatibility if new authority values are added.
 		return ""
 	}
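
One way to make unknown values explicit, sketched under assumptions: the mapping returns a second boolean so callers can log or reject unrecognised authorities instead of treating them like Migrating. The types, constants, and the function name authorityToSynchronizedAPI are local stand-ins for illustration, not the signatures used in synccommon.

```go
package main

import "fmt"

// Local stand-ins (assumptions) for the vendored API types.
type MachineAuthority string
type SynchronizedAPI string

const (
	MachineAPI MachineAuthority = "MachineAPI"
	ClusterAPI MachineAuthority = "ClusterAPI"
	Migrating  MachineAuthority = "Migrating"

	MachineAPISynchronized SynchronizedAPI = "MachineAPI"
	ClusterAPISynchronized SynchronizedAPI = "ClusterAPI"
)

// authorityToSynchronizedAPI maps an authority to its synchronized
// counterpart. The boolean is false only for values this code does not know,
// which lets the caller distinguish "Migrating" from "unrecognised".
func authorityToSynchronizedAPI(a MachineAuthority) (SynchronizedAPI, bool) {
	switch a {
	case MachineAPI:
		return MachineAPISynchronized, true
	case ClusterAPI:
		return ClusterAPISynchronized, true
	case Migrating:
		// Known value with no synchronized counterpart.
		return "", true
	default:
		// Unknown value: same empty result, but flagged for the caller.
		return "", false
	}
}

func main() {
	for _, a := range []MachineAuthority{MachineAPI, ClusterAPI, Migrating, "Bogus"} {
		s, known := authorityToSynchronizedAPI(a)
		fmt.Printf("%q -> %q (known=%t)\n", a, s, known)
	}
}
```

The empty-string fallback is preserved, so callers that ignore the boolean see the same behaviour as before.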
pkg/controllers/machinemigration/machine_migration_controller.go (1)

230-277: Consider extracting shared initialization logic to reduce duplication.

The handleMigrationStatusInitialization function is identical to the one in MachineSetMigrationReconciler. While the duplication is understandable given the different resource types and apply configurations, consider whether a shared helper could be created in synccommon package using generics, similar to the existing ApplyMigrationStatus pattern.

This is a minor refactor suggestion for future maintainability - not blocking.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 20a3c13 and 7488790.

⛔ Files ignored due to path filters (83)
  • e2e/go.sum is excluded by !**/*.sum
  • go.sum is excluded by !**/*.sum
  • vendor/github.com/openshift/api/config/v1/register.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_infrastructure.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_insights.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_node.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_scheduling.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_00_cluster-version-operator_01_clusterversions-Default.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_images-Default.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_images-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_images-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_images.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_infrastructures-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_infrastructures-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_infrastructures-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_insightsdatagathers-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_insightsdatagathers-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_insightsdatagathers-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_schedulers-Hypershift.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_schedulers-SelfManagedHA-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_schedulers-SelfManagedHA-Default.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_schedulers-SelfManagedHA-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_schedulers-SelfManagedHA-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1alpha1/types_cluster_monitoring.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1alpha1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/console/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/console/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/features.md is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/features/features.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/types_awsprovider.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/types_machine.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/types_machineset.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machines-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machines-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machines-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machinesets-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machinesets-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machinesets-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/openapi/generated_openapi/zz_generated.openapi.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/types_ingress.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/awsplatformstatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/azureplatformstatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/baremetalplatformstatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/custom.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/gatherconfig.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/gathererconfig.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/gatherers.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/gcpplatformstatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/gcpserviceendpoint.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/insightsdatagather.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/insightsdatagatherspec.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/nutanixplatformstatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/openstackplatformstatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/ovirtplatformstatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/persistentvolumeclaimreference.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/persistentvolumeconfig.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/storage.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/config/v1/vsphereplatformstatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/applyconfigurations/internal/internal.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/clientset/versioned/typed/config/v1/config_client.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/clientset/versioned/typed/config/v1/generated_expansion.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/clientset/versioned/typed/config/v1/insightsdatagather.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/informers/externalversions/config/v1/insightsdatagather.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/informers/externalversions/config/v1/interface.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/informers/externalversions/generic.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/listers/config/v1/expansion_generated.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/config/listers/config/v1/insightsdatagather.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/machine/applyconfigurations/machine/v1beta1/machinesetstatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/machine/applyconfigurations/machine/v1beta1/machinestatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/operator/applyconfigurations/internal/internal.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/operator/applyconfigurations/operator/v1/ingresscontrollerspec.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/operator/applyconfigurations/operator/v1/ingresscontrollertuningoptions.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/cluster-api-actuator-pkg/testutils/resourcebuilder/machine/v1beta1/machine.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/cluster-api-actuator-pkg/testutils/resourcebuilder/machine/v1beta1/machineset.go is excluded by !**/vendor/**, !vendor/**
  • vendor/modules.txt is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (19)
  • e2e/go.mod (1 hunks)
  • e2e/machine_migration_capi_authoritative_test.go (3 hunks)
  • e2e/machine_migration_helpers.go (2 hunks)
  • e2e/machine_migration_mapi_authoritative_test.go (4 hunks)
  • e2e/machineset_migration_capi_authoritative_test.go (2 hunks)
  • e2e/machineset_migration_helpers.go (1 hunks)
  • e2e/machineset_migration_mapi_authoritative_test.go (3 hunks)
  • go.mod (2 hunks)
  • pkg/controllers/machinemigration/machine_migration_controller.go (4 hunks)
  • pkg/controllers/machinemigration/machine_migration_controller_test.go (16 hunks)
  • pkg/controllers/machinesetmigration/machineset_migration_controller.go (4 hunks)
  • pkg/controllers/machinesetmigration/machineset_migration_controller_test.go (16 hunks)
  • pkg/controllers/machinesetsync/machineset_sync_controller.go (1 hunks)
  • pkg/controllers/machinesync/machine_sync_controller.go (1 hunks)
  • pkg/controllers/synccommon/applyconfiguration.go (1 hunks)
  • pkg/controllers/synccommon/migratestatus.go (2 hunks)
  • pkg/controllers/synccommon/migratestatus_test.go (1 hunks)
  • pkg/controllers/synccommon/suite_test.go (1 hunks)
  • pkg/conversion/test/fuzz/fuzz.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
pkg/controllers/synccommon/migratestatus_test.go (1)
pkg/controllers/synccommon/migratestatus.go (1)
  • IsMigrationCancellationRequested (121-127)
pkg/controllers/machinesetmigration/machineset_migration_controller_test.go (2)
e2e/migration_common.go (1)
  • SynchronizedCondition (10-10)
pkg/controllers/common_consts.go (1)
  • SynchronizedCondition (41-41)
e2e/machine_migration_mapi_authoritative_test.go (3)
pkg/conversion/mapi2capi/interface.go (1)
  • Machine (24-26)
e2e/framework/machine.go (2)
  • GetMachine (75-86)
  • DeleteMachines (89-119)
e2e/framework/framework.go (1)
  • CAPINamespace (14-14)
pkg/controllers/machinemigration/machine_migration_controller.go (1)
pkg/controllers/synccommon/migratestatus.go (4)
  • IsMigrationCancellationRequested (121-127)
  • AuthoritativeAPIToSynchronizedAPI (131-142)
  • ApplyMigrationStatus (63-79)
  • ApplyMigrationStatusAndResetSyncStatus (42-60)
e2e/machine_migration_helpers.go (1)
e2e/framework/framework.go (2)
  • WaitMedium (24-24)
  • RetryMedium (18-18)
pkg/controllers/machinesetmigration/machineset_migration_controller.go (2)
pkg/controllers/synccommon/migratestatus.go (4)
  • IsMigrationCancellationRequested (121-127)
  • ApplyMigrationStatus (63-79)
  • AuthoritativeAPIToSynchronizedAPI (131-142)
  • ApplyMigrationStatusAndResetSyncStatus (42-60)
pkg/conversion/mapi2capi/interface.go (1)
  • MachineSet (29-31)
e2e/machineset_migration_helpers.go (2)
pkg/conversion/mapi2capi/interface.go (1)
  • MachineSet (29-31)
e2e/framework/framework.go (2)
  • WaitMedium (24-24)
  • RetryMedium (18-18)
🔇 Additional comments (58)
pkg/conversion/test/fuzz/fuzz.go (2)

736-736: LGTM! Correct handling of MAPI-only field in roundtrip testing.

The change properly clears the SynchronizedAPI field during fuzzing to ensure roundtrip conversion tests pass, since this field has no CAPI equivalent and would be lost during MAPI→CAPI→MAPI conversion. The implementation follows the established pattern for other MAPI-only fields like AuthoritativeAPI and SynchronizedGeneration.
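To illustrate why the fuzzer has to clear a MAPI-only field, here is a minimal sketch of a lossy round trip. The types `mapiStatus` and `capiStatus` and the helper names are simplified stand-ins invented for this example, not the real API types or the actual fuzzer hook.

```go
package main

import "fmt"

// mapiStatus and capiStatus are hypothetical, trimmed-down stand-ins.
type mapiStatus struct {
	Phase           string
	SynchronizedAPI string // MAPI-only: assumed to have no CAPI counterpart
}

type capiStatus struct {
	Phase string
}

// toCAPI drops SynchronizedAPI because the target type cannot hold it.
func toCAPI(s mapiStatus) capiStatus { return capiStatus{Phase: s.Phase} }

// toMAPI cannot restore the dropped field.
func toMAPI(s capiStatus) mapiStatus { return mapiStatus{Phase: s.Phase} }

// roundTrips reports whether MAPI -> CAPI -> MAPI returns the input unchanged.
func roundTrips(in mapiStatus) bool { return toMAPI(toCAPI(in)) == in }

// clearMAPIOnlyFields mimics what a fuzzer hook does after random generation.
func clearMAPIOnlyFields(s *mapiStatus) { s.SynchronizedAPI = "" }

func main() {
	fuzzed := mapiStatus{Phase: "Running", SynchronizedAPI: "ClusterAPI"}
	fmt.Println("before clearing:", roundTrips(fuzzed))

	clearMAPIOnlyFields(&fuzzed)
	fmt.Println("after clearing: ", roundTrips(fuzzed))
}
```

With the field populated the comparison fails; once it is cleared, the round trip is lossless, which is what the roundtrip test needs.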


783-783: LGTM! Consistent handling across MachineSet fuzzing.

The change correctly clears the SynchronizedAPI field for MachineSet status, mirroring the implementation for Machine status (line 736). This ensures consistent behavior across both resource types during roundtrip conversion testing.

e2e/go.mod (1)

19-19: LGTM! Dependency version aligned with root module.

The openshift/api version is correctly aligned with the root go.mod (line 26), ensuring consistency across the e2e test module and main module.

pkg/controllers/synccommon/applyconfiguration.go (1)

44-44: LGTM! Interface extension follows established pattern.

The new WithSynchronizedAPI method follows the same design pattern as the existing interface methods (WithConditions, WithSynchronizedGeneration, WithAuthoritativeAPI), maintaining consistency in the fluent API design.
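The fluent pattern referred to here can be sketched as follows. The struct `machineStatusApplyConfig` is a hypothetical, trimmed-down apply configuration written for this example; only the `With...` method names come from the review above.

```go
package main

import "fmt"

// machineStatusApplyConfig is a hypothetical stand-in. Fields are pointers so
// that "not set" can be told apart from a zero value.
type machineStatusApplyConfig struct {
	AuthoritativeAPI       *string
	SynchronizedAPI        *string
	SynchronizedGeneration *int64
}

// Each setter stores the value and returns the receiver so calls can be chained.
func (c *machineStatusApplyConfig) WithAuthoritativeAPI(v string) *machineStatusApplyConfig {
	c.AuthoritativeAPI = &v
	return c
}

func (c *machineStatusApplyConfig) WithSynchronizedAPI(v string) *machineStatusApplyConfig {
	c.SynchronizedAPI = &v
	return c
}

func (c *machineStatusApplyConfig) WithSynchronizedGeneration(v int64) *machineStatusApplyConfig {
	c.SynchronizedGeneration = &v
	return c
}

func main() {
	status := (&machineStatusApplyConfig{}).
		WithAuthoritativeAPI("Migrating").
		WithSynchronizedAPI("MachineAPI")

	// SynchronizedGeneration was never set, so it stays nil.
	fmt.Println(*status.AuthoritativeAPI, *status.SynchronizedAPI, status.SynchronizedGeneration == nil)
}
```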

pkg/controllers/machinesync/machine_sync_controller.go (1)

1547-1547: LGTM! Proper status field preservation.

The SynchronizedAPI field is correctly preserved from the existing machine status, following the same pattern as AuthoritativeAPI (line 1545) and SynchronizedGeneration (line 1546). This ensures the synchronization state is maintained during status updates.
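A rough sketch of the preservation step, assuming a simplified `machineStatus` type and a hypothetical helper name `preserveMigrationFields` (the real controller works on the full Machine status):

```go
package main

import "fmt"

// machineStatus is a hypothetical, simplified status type.
type machineStatus struct {
	Phase                  string
	AuthoritativeAPI       string
	SynchronizedGeneration int64
	SynchronizedAPI        string
}

// preserveMigrationFields copies the migration-related fields from the existing
// object onto a freshly converted status, so the conversion does not wipe them.
func preserveMigrationFields(existing machineStatus, converted *machineStatus) {
	converted.AuthoritativeAPI = existing.AuthoritativeAPI
	converted.SynchronizedGeneration = existing.SynchronizedGeneration
	converted.SynchronizedAPI = existing.SynchronizedAPI
}

func main() {
	existing := machineStatus{
		Phase:                  "Running",
		AuthoritativeAPI:       "ClusterAPI",
		SynchronizedGeneration: 4,
		SynchronizedAPI:        "ClusterAPI",
	}
	// The converted status carries fresh data but no migration fields.
	converted := machineStatus{Phase: "Provisioned"}

	preserveMigrationFields(existing, &converted)
	fmt.Printf("%+v\n", converted)
}
```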

pkg/controllers/synccommon/migratestatus_test.go (1)

26-74: LGTM! Comprehensive test coverage for migration cancellation logic.

The test table covers all the key scenarios for detecting migration cancellation:

  • Both directions of cancellation (ClusterAPI → MachineAPI and vice versa)
  • In-progress migrations that should NOT trigger cancellation
  • Pre-migration states that should NOT trigger cancellation

The test structure follows Ginkgo best practices with descriptive entry names and clear expectations.

pkg/controllers/synccommon/suite_test.go (1)

26-29: LGTM! Standard test suite setup.

The test suite follows the standard Ginkgo/Gomega pattern for test registration, correctly setting up the fail handler and naming the suite "SyncCommon Suite".

e2e/machineset_migration_helpers.go (1)

221-228: LGTM! Helper function follows established patterns.

The new verifyMachineSetSynchronizedAPI helper is well-designed:

  • Mirrors the structure of verifyMachineSetAuthoritative (lines 95-101)
  • Uses appropriate timeouts (WaitMedium, RetryMedium) consistent with similar assertions in the file
  • Provides clear assertion messages for test failures
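The helper itself relies on Gomega's Eventually. As a library-free illustration of the same poll-until-true idea, a minimal sketch could look like this; the function `eventually` and its parameters are written for this example and are not the Gomega API.

```go
package main

import (
	"fmt"
	"time"
)

// eventually polls check every interval until it returns true or the timeout
// elapses. It reports whether the condition was met in time.
func eventually(timeout, interval time.Duration, check func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if check() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulate a status field that only reaches the expected value on the
	// third observation.
	attempts := 0
	ok := eventually(time.Second, 10*time.Millisecond, func() bool {
		attempts++
		return attempts >= 3
	})
	fmt.Println(ok, attempts)
}
```

In the e2e helpers, the timeout and interval roles are played by WaitMedium and RetryMedium.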
e2e/machineset_migration_capi_authoritative_test.go (2)

169-169: LGTM! Appropriate synchronization verification after authority switch.

The addition of verifyMachineSetSynchronizedAPI correctly validates that after switching the MachineSet authority to MachineAPI (line 165), the synchronization status reflects MachineAPISynchronized. This aligns with the PR's objective of tracking the last successfully synchronized API.


208-208: LGTM! Complete test coverage for bidirectional migration.

The verification at line 208 properly validates the synchronization state after switching back to ClusterAPI authority (line 204), ensuring ClusterAPISynchronized is set. Together with the verification at line 169, this provides complete coverage for both migration directions.

go.mod (1)

5-11: Ensure temporary replace directives are tracked for removal.

The TODO comment and temporary replace directives in go.mod (lines 5-11) should be reverted once the companion PRs are merged. Verify that their removal is tracked by a linked GitHub issue or PR dependency rather than only the inline TODO, so they don't persist longer than necessary. Consider adding specific issue numbers or PR links to the TODO if they are not already there.

e2e/machine_migration_helpers.go (2)

163-171: LGTM! Well-structured helper function.

The verifyMachineMigrating helper correctly uses SatisfyAll to verify both the Migrating state and the expected SynchronizedAPI value in a single assertion, following the established patterns in this file.


337-345: LGTM! Consistent implementation.

The verifyMachineSynchronizedAPI helper follows the same pattern as verifyMachineAuthoritative and other verification helpers in the file, using Eventually with komega.Object and appropriate timeouts.

e2e/machine_migration_capi_authoritative_test.go (3)

201-201: LGTM! Appropriate verification placement.

Adding verifyMachineSynchronizedAPI after verifyMachineSynchronizedGeneration provides complete coverage of the synchronization state, ensuring both generation and API are correctly tracked.


213-213: LGTM!

Correctly verifies MachineAPISynchronized after switching authority to MachineAPI.


225-225: LGTM!

Correctly verifies ClusterAPISynchronized after switching back to ClusterAPI.

pkg/controllers/machinesetsync/machineset_sync_controller.go (1)

1101-1104: LGTM! Correct status field preservation.

Preserving SynchronizedAPI alongside SynchronizedGeneration and AuthoritativeAPI ensures consistency during CAPI-to-MAPI synchronization, with these fields being managed separately via applySynchronizedConditionWithPatch.

e2e/machineset_migration_mapi_authoritative_test.go (3)

169-170: LGTM! Comprehensive verification added.

Adding both verifyMachineSetAuthoritative and verifyMachineSetSynchronizedAPI ensures complete state verification after the authority switch.


209-210: LGTM!

Correctly verifies MachineAPI authority and MachineAPISynchronized after switching back.


338-339: LGTM!

Consistent verification pattern for the update context after switching to ClusterAPI.

pkg/controllers/machinemigration/machine_migration_controller_test.go (5)

20-20: LGTM! Good addition for debugging.

Adding cmp package enables clearer diff output in test failure messages, improving debugging experience.


220-226: Good improvement to test assertions.

Using cmp.Diff in the failure message provides detailed comparison output when the test fails, making it easier to identify unexpected changes to the machine object.


777-903: Comprehensive migration cancellation test coverage.

The three cancellation contexts cover the essential scenarios:

  1. Cancelling back to MachineAPI
  2. Cancelling back to ClusterAPI
  3. Cancelling back to ClusterAPI with paused CAPI resources (verifying unpause behavior)

The tests correctly simulate stuck migration states and verify proper status transitions.


905-937: Good edge case coverage for status initialization.

This test verifies the handleMigrationStatusInitialization logic where SynchronizedAPI is empty but AuthoritativeAPI is set, ensuring backward compatibility and proper bootstrapping.


939-977: Important test for SynchronizedAPI preservation.

This test verifies that when transitioning from a stable state (ClusterAPI) to Migrating, the SynchronizedAPI is preserved to enable rollback detection. This is critical for the cancellation mechanism to work correctly.

e2e/machine_migration_mapi_authoritative_test.go (3)

166-166: LGTM!

Correctly verifies MachineAPISynchronized after initial synchronization.


178-178: LGTM!

Correctly verifies ClusterAPISynchronized after switching to ClusterAPI.


190-190: LGTM!

Correctly verifies MachineAPISynchronized after switching back to MachineAPI.

pkg/controllers/machinesetmigration/machineset_migration_controller_test.go (17)

200-211: LGTM - Test setup properly initializes SynchronizedAPI.

The test correctly sets both AuthoritativeAPI and SynchronizedAPI to MachineAPI in the status, matching the expected synchronized state for this scenario.


247-249: LGTM - SynchronizedAPI initialization added to migration request test.

Correctly sets SynchronizedAPI to MachineAPISynchronized when starting a migration from MachineAPI to ClusterAPI.


288-290: LGTM - Proper SynchronizedAPI tracking during MachineAPI→ClusterAPI migration.

Test correctly maintains MachineAPISynchronized as the synchronized state while in Migrating status.


323-330: LGTM - ClusterAPI→MachineAPI migration uses correct synchronized state.

Test properly uses WithSynchronizedAPIStatus(mapiv1beta1.ClusterAPISynchronized) and sets it on the status, reflecting that ClusterAPI was the last synchronized source.


374-391: LGTM - Synchronized state properly tracked during pausing phase.

Test correctly sets SynchronizedAPI to MachineAPISynchronized when testing the pausing flow during MachineAPI→ClusterAPI migration.


430-438: LGTM - ClusterAPI→MachineAPI pause flow correctly sets synchronized state.

Test properly uses ClusterAPISynchronized as the synchronized state when migrating from ClusterAPI to MachineAPI.


496-506: LGTM - Synchronized state properly set for generation mismatch test.

Test correctly uses MachineAPISynchronized when testing the scenario where synchronizedGeneration doesn't match.


537-547: LGTM - ClusterAPI generation mismatch test properly configured.

Test correctly sets ClusterAPISynchronized for the ClusterAPI→MachineAPI generation mismatch scenario.


579-597: LGTM - Complete migration prerequisites test properly configured.

Test correctly sets MachineAPISynchronized for the MachineAPI→ClusterAPI completion scenario with all prerequisites satisfied.


630-634: Verify assertion uses correct SynchronizedAPI value after migration completion.

The test asserts SynchronizedAPI equals ClusterAPISynchronized after migration completes from MachineAPI to ClusterAPI. This is correct since the new authority (ClusterAPI) should also be the new synchronized state.


660-678: LGTM - ClusterAPI→MachineAPI completion test properly configured.

Test correctly sets ClusterAPISynchronized as the starting synchronized state for migration from ClusterAPI to MachineAPI.


702-707: LGTM - Correct assertion for ClusterAPI→MachineAPI migration completion.

Test properly asserts MachineAPISynchronized as the final synchronized state after completing migration to MachineAPI.


711-742: LGTM - Migration cancellation back to MachineAPI test is well-structured.

The test correctly simulates a stuck migration (status Migrating, synchronized to MachineAPI) and verifies that when spec.AuthoritativeAPI matches the synchronized state, the controller detects cancellation and transitions back.


745-776: LGTM - Migration cancellation back to ClusterAPI test is comprehensive.

Test properly verifies rollback to ClusterAPI when spec.AuthoritativeAPI equals ClusterAPI and SynchronizedAPI is ClusterAPISynchronized.


778-820: LGTM - Migration cancellation with paused CAPI resource is well-tested.

Test verifies that when cancelling back to ClusterAPI, the paused CAPI resource gets unpaused. This is important for ensuring the rollback target becomes operational.
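The unpause step can be sketched as removing the Cluster API paused annotation from the rollback target. The helper name `ensureUnpaused` is hypothetical, and the real controller patches the object through the API server rather than mutating a map.

```go
package main

import "fmt"

// pausedAnnotation is the annotation key Cluster API uses to pause reconciliation.
const pausedAnnotation = "cluster.x-k8s.io/paused"

// ensureUnpaused removes the paused annotation if present and reports whether
// anything changed, so the caller knows if a patch is needed.
func ensureUnpaused(annotations map[string]string) bool {
	if _, paused := annotations[pausedAnnotation]; !paused {
		return false
	}
	delete(annotations, pausedAnnotation)
	return true
}

func main() {
	annotations := map[string]string{
		pausedAnnotation: "",
		"example.com/owner": "team-a", // unrelated annotation, must survive
	}
	fmt.Println("changed:", ensureUnpaused(annotations))
	fmt.Println("changed again:", ensureUnpaused(annotations))
	fmt.Println(annotations)
}
```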


823-853: LGTM - Empty SynchronizedAPI initialization test covers important edge case.

Test verifies that when AuthoritativeAPI is set but SynchronizedAPI is empty, the reconciler initializes SynchronizedAPI from AuthoritativeAPI. This handles upgrade scenarios from older versions.


855-891: LGTM - Transition to Migrating preserves SynchronizedAPI correctly.

Test verifies that when transitioning from a stable state (ClusterAPI) to Migrating, the SynchronizedAPI is preserved as ClusterAPISynchronized. This is essential for enabling future cancellation/rollback.

pkg/controllers/synccommon/migratestatus.go (4)

33-60: LGTM - ApplyMigrationStatusAndResetSyncStatus correctly extended to include SynchronizedAPI.

The function now atomically sets both AuthoritativeAPI and SynchronizedAPI while resetting sync status. The call to statusAC.WithSynchronizedAPI(synchronizedAPI) at line 57 ensures the synchronized state is properly included in the patch.


62-79: LGTM - ApplyMigrationStatus correctly extended to include SynchronizedAPI.

The function now properly sets both AuthoritativeAPI and SynchronizedAPI in a single patch operation, ensuring atomicity of the status update.


100-108: LGTM - Comments clarify field ownership management.

The updated comments correctly explain the field ownership semantics and the validation rule requiring synchronizedGeneration reset when changing authoritativeAPI.


116-127: LGTM - IsMigrationCancellationRequested correctly detects rollback intent.

The function properly detects when a user wants to cancel a migration by checking:

  1. Status is Migrating
  2. Spec's AuthoritativeAPI (converted to SynchronizedAPI) matches the current SynchronizedAPI

This correctly identifies when spec.authoritativeAPI has been reverted to the last synchronized state.
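The two checks above can be sketched as a small predicate. The types and constants below are local stand-ins for the openshift/api definitions, the function names are chosen for this example, and the guard against an unmappable spec value is an assumption rather than something confirmed by the source.

```go
package main

import "fmt"

// Local stand-ins for the API types; not the real openshift/api definitions.
type AuthoritativeAPI string
type SynchronizedAPI string

const (
	MachineAPI AuthoritativeAPI = "MachineAPI"
	ClusterAPI AuthoritativeAPI = "ClusterAPI"
	Migrating  AuthoritativeAPI = "Migrating"

	MachineAPISynchronized SynchronizedAPI = "MachineAPI"
	ClusterAPISynchronized SynchronizedAPI = "ClusterAPI"
)

// toSynchronizedAPI maps a stable authority to its synchronized counterpart.
// Anything else (including Migrating) maps to the empty value.
func toSynchronizedAPI(api AuthoritativeAPI) SynchronizedAPI {
	switch api {
	case MachineAPI:
		return MachineAPISynchronized
	case ClusterAPI:
		return ClusterAPISynchronized
	default:
		return ""
	}
}

// isCancellationRequested is true only while the status is Migrating and the
// spec has been reverted to the last synchronized API.
func isCancellationRequested(spec, status AuthoritativeAPI, synchronized SynchronizedAPI) bool {
	if status != Migrating {
		return false
	}
	want := toSynchronizedAPI(spec)
	return want != "" && want == synchronized // assumption: never match on empty
}

func main() {
	cases := []struct {
		name         string
		spec, status AuthoritativeAPI
		synchronized SynchronizedAPI
	}{
		{"cancel back to MachineAPI", MachineAPI, Migrating, MachineAPISynchronized},
		{"forward migration in progress", ClusterAPI, Migrating, MachineAPISynchronized},
		{"migration not yet acknowledged", ClusterAPI, MachineAPI, MachineAPISynchronized},
		{"empty SynchronizedAPI", MachineAPI, Migrating, ""},
	}
	for _, c := range cases {
		fmt.Printf("%-32s %v\n", c.name, isCancellationRequested(c.spec, c.status, c.synchronized))
	}
}
```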

pkg/controllers/machinesetmigration/machineset_migration_controller.go (5)

130-158: LGTM - Migration status initialization and cancellation handling are well-implemented.

The code correctly:

  1. Delegates to handleMigrationStatusInitialization for empty status fields
  2. Detects migration cancellation using IsMigrationCancellationRequested
  3. Ensures the rollback target is unpaused before applying the status patch
  4. Provides clear logging for cancellation scenarios

178-182: LGTM - Transition to Migrating correctly captures the source as SynchronizedAPI.

Before entering the Migrating state, the current AuthoritativeAPI is converted to SynchronizedAPI and both are patched atomically. This ensures the source of truth is preserved for potential rollback.


215-222: LGTM - Migration completion correctly updates both AuthoritativeAPI and SynchronizedAPI.

The new synchronized state is derived from the target authority, and both fields are updated atomically while resetting the sync status. The enhanced logging provides useful debugging information.
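Taken together, the transitions described for this controller form a small state machine. The sketch below models it on a simplified `status` struct with plain strings; the function names are hypothetical, and resetting SynchronizedGeneration to zero on completion is an assumption about what "resetting the sync status" involves.

```go
package main

import "fmt"

// status is a simplified stand-in for the migration-related status fields.
type status struct {
	AuthoritativeAPI       string
	SynchronizedAPI        string
	SynchronizedGeneration int64
}

// beginMigration records the current authority as the rollback source, then
// moves to Migrating. In the controller both fields go out in one patch.
func beginMigration(s status) status {
	s.SynchronizedAPI = s.AuthoritativeAPI
	s.AuthoritativeAPI = "Migrating"
	return s
}

// completeMigration makes the target both authoritative and synchronized.
func completeMigration(s status, target string) status {
	s.AuthoritativeAPI = target
	s.SynchronizedAPI = target
	s.SynchronizedGeneration = 0 // assumption: sync status starts over
	return s
}

// cancelMigration rolls the authority back to the recorded source.
func cancelMigration(s status) status {
	s.AuthoritativeAPI = s.SynchronizedAPI
	return s
}

func main() {
	s := status{AuthoritativeAPI: "MachineAPI", SynchronizedAPI: "MachineAPI", SynchronizedGeneration: 3}

	migrating := beginMigration(s)
	fmt.Printf("migrating: %+v\n", migrating)
	fmt.Printf("completed: %+v\n", completeMigration(migrating, "ClusterAPI"))
	fmt.Printf("cancelled: %+v\n", cancelMigration(migrating))
}
```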


228-275: Consider potential edge case in SynchronizedAPI inference logic.

The handleMigrationStatusInitialization function handles its three scenarios well. One edge case deserves attention: in the third case (lines 251-271), when SynchronizedAPI is empty but AuthoritativeAPI is set, the logic assumes a forward migration and infers the source as the opposite of spec. That inference could be wrong if the controller restarts mid-cancellation. Given that this is a transitional state for upgrades from older versions, the assumption is reasonable.

Consider adding a comment explaining this assumption:

 	if mapiMachineSet.Status.SynchronizedAPI == "" {
 		// We are in a migration (Status.AuthoritativeAPI is Migrating) but we don't have SynchronizedAPI.
 		// Assuming this is a standard forward migration (not a cancellation), the Spec tells us the Target.
 		// Therefore, the Source (SynchronizedAPI) must be the opposite of the Spec.
+		// Note: This heuristic is for upgrade compatibility from older versions without SynchronizedAPI.
+		// In edge cases (e.g., controller restart mid-cancellation), this may not be perfectly accurate,
+		// but the worst case is a failed cancellation that requires re-initiating migration.
 		targetAPI := mapiMachineSet.Spec.AuthoritativeAPI
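
To make the heuristic and its limitation concrete, here is a minimal sketch of the inference using plain strings. The function names are hypothetical, and the case where both status fields are empty is deliberately left out.

```go
package main

import "fmt"

// oppositeOf returns the other stable API, or "" for anything unexpected.
func oppositeOf(api string) string {
	switch api {
	case "MachineAPI":
		return "ClusterAPI"
	case "ClusterAPI":
		return "MachineAPI"
	default:
		return ""
	}
}

// inferSynchronizedAPI fills in a missing SynchronizedAPI value.
//   - already set: keep it
//   - status is Migrating: assume a forward migration, so the source is the
//     opposite of the spec (which names the target)
//   - status is stable: the source is simply the current authority
func inferSynchronizedAPI(specAPI, statusAPI, synchronizedAPI string) string {
	if synchronizedAPI != "" {
		return synchronizedAPI
	}
	if statusAPI == "Migrating" {
		return oppositeOf(specAPI)
	}
	return statusAPI
}

func main() {
	// Upgrade from an older version while stable.
	fmt.Println(inferSynchronizedAPI("ClusterAPI", "ClusterAPI", ""))

	// Upgrade while a forward MachineAPI -> ClusterAPI migration is running.
	fmt.Println(inferSynchronizedAPI("ClusterAPI", "Migrating", ""))

	// Limitation: if the spec was already reverted to MachineAPI before the
	// field existed, the heuristic infers ClusterAPI although the real source
	// was MachineAPI.
	fmt.Println(inferSynchronizedAPI("MachineAPI", "Migrating", ""))
}
```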

423-431: LGTM - Wrapper functions improve code readability.

The applyMigrationStatusWithPatch and applyMigrationStatusAndResetSyncStatusWithPatch wrappers encapsulate the generic function calls, making the reconciler code cleaner and more maintainable.
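The shape of such a wrapper can be sketched as follows. Everything here is a simplified assumption: the constraint `statusSetter`, the type `machineSetStatusAC`, and the signatures are invented for the example, and the sketch only builds a status value where the real helpers also patch it through the API server.

```go
package main

import "fmt"

// statusSetter is a hypothetical constraint for apply configurations whose
// setters return the concrete type T.
type statusSetter[T any] interface {
	WithAuthoritativeAPI(string) T
	WithSynchronizedAPI(string) T
}

// machineSetStatusAC is a stand-in for a MachineSet status apply configuration.
type machineSetStatusAC struct {
	authoritativeAPI string
	synchronizedAPI  string
}

func (s *machineSetStatusAC) WithAuthoritativeAPI(v string) *machineSetStatusAC {
	s.authoritativeAPI = v
	return s
}

func (s *machineSetStatusAC) WithSynchronizedAPI(v string) *machineSetStatusAC {
	s.synchronizedAPI = v
	return s
}

// applyMigrationStatus is the shared generic helper: it works for any status
// type that satisfies the constraint.
func applyMigrationStatus[T statusSetter[T]](newStatus func() T, authoritative, synchronized string) T {
	return newStatus().WithAuthoritativeAPI(authoritative).WithSynchronizedAPI(synchronized)
}

type reconciler struct{}

// applyMigrationStatusWithPatch pins the type parameter once, so call sites in
// the reconciler stay short.
func (r *reconciler) applyMigrationStatusWithPatch(authoritative, synchronized string) *machineSetStatusAC {
	return applyMigrationStatus(func() *machineSetStatusAC { return &machineSetStatusAC{} }, authoritative, synchronized)
}

func main() {
	r := &reconciler{}
	st := r.applyMigrationStatusWithPatch("Migrating", "ClusterAPI")
	fmt.Println(st.authoritativeAPI, st.synchronizedAPI)
}
```

The call site no longer repeats the explicit type argument, which is the readability gain the comment above refers to.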

pkg/controllers/machinemigration/machine_migration_controller.go (4)

130-157: LGTM - Migration status initialization and cancellation handling consistent with MachineSet controller.

The implementation correctly mirrors the MachineSet controller logic:

  1. Delegates to handleMigrationStatusInitialization
  2. Detects cancellation with IsMigrationCancellationRequested
  3. Unpauses rollback target before patching
  4. Provides clear logging

177-181: LGTM - Transition to Migrating correctly captures synchronizedAPI.

Identical pattern to MachineSet controller - captures current authority as synchronized state before transitioning to Migrating.


217-224: LGTM - Migration completion correctly updates status fields.

The implementation properly derives newSynchronizedAPI from the target authority and updates both fields atomically with enhanced logging.


462-470: LGTM - Wrapper functions properly encapsulate status patching.

The wrapper functions correctly use MachineStatusApplyConfiguration type parameter and call the shared synccommon functions.

@openshift-ci-robot

openshift-ci-robot commented Jan 7, 2026

@RadekManak: This pull request references OCPCLOUD-2998 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Summary by CodeRabbit

Release Notes

  • New Features

  • Added support for cancelling in-progress machine and machineset migrations, reverting to the previous authoritative API.

  • Enhanced API synchronization tracking to explicitly verify which API a resource is synchronized with throughout migration workflows.

  • Refactor

  • Improved status management for better handling of authoritative and synchronized API fields during migrations.

  • Tests

  • Expanded test coverage for migration cancellation scenarios and API synchronization verification.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @e2e/machine_migration_mapi_authoritative_test.go:
- Around line 208-215: There’s a duplicated/overlapping test block: the outer
Context("Machine Migration Rollback Tests", Ordered, func() { with vars
machineRollbackName, newMapiMachine, newCapiMachine is immediately followed by a
Describe("Machine Migration Rollback Tests", Ordered, func() { and a repeated
set of the same variable declarations; remove the structural duplication by
keeping only one block (either the Context or the Describe) and the single set
of declarations (machineRollbackName, newMapiMachine, newCapiMachine), or if
both are intended, properly close the Context before starting the Describe;
ensure only one declaration of those variables remains to fix the
compilation/runtime error.

In @pkg/controllers/machinemigration/machine_migration_controller_test.go:
- Around line 852-863: Replace the incorrect clusterv1beta1.PausedAnnotation
usages with clusterv1.PausedAnnotation in the machine_migration_controller_test
to match the v1beta2 CAPI Machine type; find occurrences of
clusterv1beta1.PausedAnnotation (used when building capiMachine and capaMachine
via capiMachineBuilder/capaMachineBuilder) and swap them to
clusterv1.PausedAnnotation so the test annotations match the controller and
other tests that use clusterv1.PausedAnnotation.
🧹 Nitpick comments (2)
pkg/controllers/synccommon/migratestatus_test.go (1)

26-74: Solid unit test coverage for IsMigrationCancellationRequested.

The test table covers the key scenarios:

  • ✓ Cancellation requests (spec reverted to match synchronized state while migrating)
  • ✓ Normal migrations in progress (spec differs from synchronized state)
  • ✓ Migration requests not yet acknowledged

Consider adding edge case entries for empty SynchronizedAPI or when spec.AuthoritativeAPI is Migrating (though schema validation prevents this).

pkg/controllers/machinesetmigration/machineset_migration_controller.go (1)

149-153: Inconsistent method usage for status patching.

Line 151 directly calls synccommon.ApplyMigrationStatus[...] instead of using the wrapper method r.applyMigrationStatusWithPatch. Other status patches in this file (e.g., lines 181, 236, 245, 266) use the wrapper methods.

For consistency and maintainability, consider using the wrapper:

♻️ Suggested refactor
-		if err := synccommon.ApplyMigrationStatus[*machinev1applyconfigs.MachineSetStatusApplyConfiguration](ctx, r.Client, controllerName, machinev1applyconfigs.MachineSet, mapiMachineSet, mapiMachineSet.Spec.AuthoritativeAPI, mapiMachineSet.Status.SynchronizedAPI); err != nil {
+		if err := r.applyMigrationStatusWithPatch(ctx, mapiMachineSet, mapiMachineSet.Spec.AuthoritativeAPI, mapiMachineSet.Status.SynchronizedAPI); err != nil {
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 7488790 and 1466b09.

⛔ Files ignored due to path filters (17)
  • go.work is excluded by !**/*.work
  • go.work.sum is excluded by !**/*.sum
  • vendor/github.com/openshift/api/machine/v1beta1/types_machine.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/types_machineset.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machines-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machines-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machines-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machinesets-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machinesets-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.crd-manifests/0000_10_machine-api_01_machinesets-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/openapi/generated_openapi/zz_generated.openapi.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/machine/applyconfigurations/machine/v1beta1/machinesetstatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/client-go/machine/applyconfigurations/machine/v1beta1/machinestatus.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/cluster-api-actuator-pkg/testutils/resourcebuilder/machine/v1beta1/machine.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/cluster-api-actuator-pkg/testutils/resourcebuilder/machine/v1beta1/machineset.go is excluded by !**/vendor/**, !vendor/**
  • vendor/modules.txt is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (17)
  • e2e/machine_migration_capi_authoritative_test.go
  • e2e/machine_migration_helpers.go
  • e2e/machine_migration_mapi_authoritative_test.go
  • e2e/machineset_migration_capi_authoritative_test.go
  • e2e/machineset_migration_helpers.go
  • e2e/machineset_migration_mapi_authoritative_test.go
  • pkg/controllers/machinemigration/machine_migration_controller.go
  • pkg/controllers/machinemigration/machine_migration_controller_test.go
  • pkg/controllers/machinesetmigration/machineset_migration_controller.go
  • pkg/controllers/machinesetmigration/machineset_migration_controller_test.go
  • pkg/controllers/machinesetsync/machineset_sync_controller.go
  • pkg/controllers/machinesync/machine_sync_controller.go
  • pkg/controllers/synccommon/applyconfiguration.go
  • pkg/controllers/synccommon/migratestatus.go
  • pkg/controllers/synccommon/migratestatus_test.go
  • pkg/controllers/synccommon/suite_test.go
  • pkg/conversion/test/fuzz/fuzz.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • pkg/conversion/test/fuzz/fuzz.go
  • e2e/machine_migration_capi_authoritative_test.go
  • e2e/machine_migration_helpers.go
  • pkg/controllers/machinesetsync/machineset_sync_controller.go
  • pkg/controllers/machinesync/machine_sync_controller.go
🧰 Additional context used
🧬 Code graph analysis (6)
pkg/controllers/synccommon/migratestatus.go (2)
pkg/controllers/common_consts.go (1)
  • ReasonAuthoritativeAPIChanged (48-48)
pkg/util/ssa.go (1)
  • ApplyConfigPatch (42-46)
pkg/controllers/synccommon/migratestatus_test.go (1)
pkg/controllers/synccommon/migratestatus.go (1)
  • IsMigrationCancellationRequested (121-127)
e2e/machineset_migration_helpers.go (2)
pkg/conversion/mapi2capi/interface.go (1)
  • MachineSet (29-31)
e2e/framework/framework.go (2)
  • WaitMedium (24-24)
  • RetryMedium (18-18)
pkg/controllers/machinesetmigration/machineset_migration_controller.go (2)
pkg/controllers/synccommon/migratestatus.go (4)
  • IsMigrationCancellationRequested (121-127)
  • ApplyMigrationStatus (63-79)
  • AuthoritativeAPIToSynchronizedAPI (131-142)
  • ApplyMigrationStatusAndResetSyncStatus (42-60)
pkg/conversion/mapi2capi/interface.go (1)
  • MachineSet (29-31)
pkg/controllers/machinesetmigration/machineset_migration_controller_test.go (2)
e2e/migration_common.go (1)
  • SynchronizedCondition (10-10)
pkg/controllers/common_consts.go (1)
  • SynchronizedCondition (41-41)
e2e/machine_migration_mapi_authoritative_test.go (2)
e2e/framework/machine.go (2)
  • GetMachine (75-86)
  • DeleteMachines (105-135)
e2e/framework/framework.go (1)
  • CAPINamespace (14-14)
🔇 Additional comments (22)
pkg/controllers/synccommon/suite_test.go (1)

1-29: LGTM!

Standard Ginkgo test suite scaffolding that enables the new IsMigrationCancellationRequested tests in migratestatus_test.go.

pkg/controllers/synccommon/applyconfiguration.go (1)

40-45: LGTM!

The interface extension follows the existing pattern and enables the migration status helpers to set SynchronizedAPI alongside AuthoritativeAPI during status patches.

pkg/controllers/machinemigration/machine_migration_controller_test.go (2)

772-897: Comprehensive migration cancellation test coverage added.

The new test contexts thoroughly cover:

  • Cancellation back to MachineAPI from stuck migration
  • Cancellation back to ClusterAPI from stuck migration
  • Cancellation with paused CAPI resources (verifying unpause behavior)

The assertions correctly verify both AuthoritativeAPI and SynchronizedAPI status fields after cancellation.


899-999: Status initialization edge cases well covered.

The tests properly cover:

  • Empty SynchronizedAPI with set AuthoritativeAPI (lines 899-931)
  • Empty AuthoritativeAPI with set SynchronizedAPI (lines 933-959)
  • Stable-to-Migrating transition preserving SynchronizedAPI (lines 961-999)

This aligns with the initialization logic in handleMigrationStatusInitialization.

pkg/controllers/synccommon/migratestatus.go (3)

116-127: Clean cancellation detection logic.

The IsMigrationCancellationRequested function correctly identifies the cancellation scenario: when status.AuthoritativeAPI is Migrating and spec.AuthoritativeAPI matches the status.SynchronizedAPI (via the conversion helper). This enables administrators to abort stuck migrations by reverting spec to the last known good state.


129-142: Mapping function handles all authority values correctly.

The AuthoritativeAPIToSynchronizedAPI helper appropriately:

  • Maps MachineAPI → MachineAPISynchronized
  • Maps ClusterAPI → ClusterAPISynchronized
  • Returns empty string for Migrating and unknown values (safe default)
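
The two helpers described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the types are local stand-ins for the openshift/api definitions, the lower-case function names and the plain-value signature are assumptions, and the guard against an empty mapping result is an addition made for the sketch.

```go
package main

import "fmt"

// Local stand-ins (assumptions) for the types in
// github.com/openshift/api/machine/v1beta1; the real definitions may differ.
type MachineAuthority string
type SynchronizedAPI string

const (
	MachineAuthorityMachineAPI MachineAuthority = "MachineAPI"
	MachineAuthorityClusterAPI MachineAuthority = "ClusterAPI"
	MachineAuthorityMigrating  MachineAuthority = "Migrating"

	MachineAPISynchronized SynchronizedAPI = "MachineAPI"
	ClusterAPISynchronized SynchronizedAPI = "ClusterAPI"
)

// authoritativeAPIToSynchronizedAPI maps a stable authority to its
// synchronized counterpart. Migrating and unknown values map to "".
func authoritativeAPIToSynchronizedAPI(a MachineAuthority) SynchronizedAPI {
	switch a {
	case MachineAuthorityMachineAPI:
		return MachineAPISynchronized
	case MachineAuthorityClusterAPI:
		return ClusterAPISynchronized
	default:
		return ""
	}
}

// isMigrationCancellationRequested reports whether the status is Migrating
// and the spec has been reverted to the last synchronized API.
func isMigrationCancellationRequested(spec, status MachineAuthority, synced SynchronizedAPI) bool {
	if status != MachineAuthorityMigrating {
		return false
	}
	want := authoritativeAPIToSynchronizedAPI(spec)
	// The empty-value guard is an assumption of this sketch: an unset
	// synchronizedAPI should never count as a match.
	return want != "" && want == synced
}

func main() {
	// Spec reverted to the synchronized API while Migrating: cancellation.
	fmt.Println(isMigrationCancellationRequested(
		MachineAuthorityMachineAPI, MachineAuthorityMigrating, MachineAPISynchronized))
	// Spec still points at the migration target: no cancellation.
	fmt.Println(isMigrationCancellationRequested(
		MachineAuthorityClusterAPI, MachineAuthorityMigrating, MachineAPISynchronized))
}
```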

42-60: Updated status patching functions correctly propagate SynchronizedAPI.

The renamed ApplyMigrationStatusAndResetSyncStatus function now:

  1. Accepts the synchronizedAPI parameter
  2. Calls WithSynchronizedAPI on the status apply configuration before patching

This ensures atomic updates of both AuthoritativeAPI and SynchronizedAPI during migration state transitions.

pkg/controllers/machinemigration/machine_migration_controller.go (4)

138-159: Migration cancellation flow is well-structured.

The cancellation logic:

  1. Detects cancellation via IsMigrationCancellationRequested
  2. Ensures the rollback target is unpaused
  3. Applies status patch to transition back to synchronized state
  4. Logs success with both field values

This provides a clean escape hatch for stuck migrations.


464-472: Clean wrapper methods for status patching.

The new applyMigrationStatusWithPatch and applyMigrationStatusAndResetSyncStatusWithPatch methods properly encapsulate the generic function calls with controller-specific types, improving readability throughout the reconciler.


243-253: Implicit coupling based on string value assumptions.

Line 248 casts SynchronizedAPI to MachineAuthority:

mapiv1beta1.MachineAuthority(mapiMachine.Status.SynchronizedAPI)

This relies on MachineAPISynchronized and MachineAuthorityMachineAPI having identical underlying string values (and similarly for the ClusterAPI variants). While the forward conversion function AuthoritativeAPIToSynchronizedAPI documents this mapping explicitly, the reverse conversion at line 248 uses a direct type cast without validation or a helper function. This creates implicit coupling to external API package constants that could silently break if their values diverge.

Consider adding a reverse conversion helper (e.g., SynchronizedAPIToMachineAuthority) to make this dependency explicit and maintainable.
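
One possible shape for such a helper is sketched below. The name synchronizedAPIToMachineAuthority, the error return, and the local stand-in types are all assumptions for illustration; the real constants live in the openshift/api package.

```go
package main

import "fmt"

// Local stand-ins (assumptions) for the openshift/api types.
type MachineAuthority string
type SynchronizedAPI string

const (
	MachineAuthorityMachineAPI MachineAuthority = "MachineAPI"
	MachineAuthorityClusterAPI MachineAuthority = "ClusterAPI"

	MachineAPISynchronized SynchronizedAPI = "MachineAPI"
	ClusterAPISynchronized SynchronizedAPI = "ClusterAPI"
)

// synchronizedAPIToMachineAuthority is a hypothetical reverse of
// AuthoritativeAPIToSynchronizedAPI. It spells out the mapping instead of
// relying on the two types sharing the same underlying string values.
func synchronizedAPIToMachineAuthority(s SynchronizedAPI) (MachineAuthority, error) {
	switch s {
	case MachineAPISynchronized:
		return MachineAuthorityMachineAPI, nil
	case ClusterAPISynchronized:
		return MachineAuthorityClusterAPI, nil
	default:
		// Empty or unknown values are rejected rather than cast through.
		return "", fmt.Errorf("cannot convert synchronizedAPI %q to a machine authority", s)
	}
}

func main() {
	authority, err := synchronizedAPIToMachineAuthority(ClusterAPISynchronized)
	fmt.Println(authority, err)

	_, err = synchronizedAPIToMachineAuthority("")
	fmt.Println(err)
}
```

Returning an error for the empty value forces the caller to handle the uninitialized case explicitly, which a direct type cast would hide.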


255-275: This concern is not valid—the scenario described cannot occur.

The code handles an initialization edge case where SynchronizedAPI is empty. However, whenever status.AuthoritativeAPI transitions to Migrating, the controller explicitly sets SynchronizedAPI at the same time (lines 186-189). All test cases confirm this invariant: when status.AuthoritativeAPI is Migrating, status.SynchronizedAPI is always populated, never empty.

Therefore, the third code block (which handles empty SynchronizedAPI) cannot execute when status.AuthoritativeAPI is already Migrating. The inference logic is only used for recovery from inconsistent states and correctly reflects the actual last synchronized state.

pkg/controllers/machinesetmigration/machineset_migration_controller.go (2)

229-276: handleMigrationStatusInitialization mirrors Machine controller logic.

The initialization logic is identical to the Machine controller implementation, correctly handling:

  1. Both fields empty → initialize from spec
  2. Only AuthoritativeAPI empty → derive from SynchronizedAPI
  3. Only SynchronizedAPI empty → infer from migration direction

This duplication is acceptable given the different receiver types, though a shared implementation via generics could be considered in the future.
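
The three cases listed above can be illustrated with a small pure function. This is a sketch under stated assumptions, not the controller's code: initializeStatus and toSynchronizedAPI are hypothetical names, the types are local stand-ins, and the rule used when only SynchronizedAPI is empty (copy the authority when stable, otherwise take the opposite of the spec target) is reconstructed from the review comments.

```go
package main

import "fmt"

// Local stand-ins (assumptions) for the openshift/api types.
type MachineAuthority string
type SynchronizedAPI string

const (
	MachineAuthorityMachineAPI MachineAuthority = "MachineAPI"
	MachineAuthorityClusterAPI MachineAuthority = "ClusterAPI"

	MachineAPISynchronized SynchronizedAPI = "MachineAPI"
	ClusterAPISynchronized SynchronizedAPI = "ClusterAPI"
)

// toSynchronizedAPI maps a stable authority to its synchronized value.
func toSynchronizedAPI(a MachineAuthority) SynchronizedAPI {
	switch a {
	case MachineAuthorityMachineAPI:
		return MachineAPISynchronized
	case MachineAuthorityClusterAPI:
		return ClusterAPISynchronized
	default:
		return ""
	}
}

// initializeStatus returns the status values to apply and whether anything
// needs to change.
func initializeStatus(spec, statusAuth MachineAuthority, statusSync SynchronizedAPI) (MachineAuthority, SynchronizedAPI, bool) {
	switch {
	case statusAuth == "" && statusSync == "":
		// Case 1: both fields empty, initialize from spec.
		return spec, toSynchronizedAPI(spec), true
	case statusAuth == "":
		// Case 2: derive the authority from the synchronized API. The cast
		// relies on both types sharing the same string values.
		return MachineAuthority(statusSync), statusSync, true
	case statusSync == "":
		// Case 3: only SynchronizedAPI is empty.
		if statusAuth == spec {
			// Stable state: the synchronized API is the current authority.
			return statusAuth, toSynchronizedAPI(statusAuth), true
		}
		// Migration requested: the source is the opposite of the target.
		if spec == MachineAuthorityMachineAPI {
			return statusAuth, ClusterAPISynchronized, true
		}
		return statusAuth, MachineAPISynchronized, true
	default:
		return statusAuth, statusSync, false
	}
}

func main() {
	fmt.Println(initializeStatus(MachineAuthorityMachineAPI, "", ""))
	fmt.Println(initializeStatus(MachineAuthorityClusterAPI, MachineAuthorityMachineAPI, ""))
}
```

Keeping the decision in a function that takes and returns plain values is one way a shared implementation could serve both the Machine and MachineSet controllers.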


137-159: Migration cancellation flow correctly implemented for MachineSet.

The cancellation handling follows the same pattern as the Machine controller, ensuring consistent behavior across both resource types.

e2e/machineset_migration_capi_authoritative_test.go (1)

169-169: LGTM! Synchronization state verification is correctly integrated.

The additions of verifyMachineSetSynchronizedAPI calls after authority transitions properly validate that the synchronization tracking field reflects the expected state following MachineAPI ↔ ClusterAPI switches.

Also applies to: 208-208

e2e/machineset_migration_helpers.go (1)

221-228: LGTM! Well-implemented verification helper.

The new verifyMachineSetSynchronizedAPI function follows the established pattern used by other verification helpers in this file, with appropriate Eventually assertions and descriptive logging.

e2e/machineset_migration_mapi_authoritative_test.go (1)

169-170: LGTM! Consistent synchronization state verification.

The additions properly verify both authoritative status and synchronization state after authority switches, ensuring comprehensive validation of migration state transitions.

Also applies to: 208-209, 339-340

e2e/machine_migration_mapi_authoritative_test.go (2)

166-166: LGTM! Synchronization state verification enhances round-trip test coverage.

The additions of verifyMachineSynchronizedAPI calls throughout the round-trip test correctly validate that the synchronization state is tracked across MAPI → CAPI → MAPI transitions.

Also applies to: 178-178, 190-190


217-291: Excellent rollback test coverage.

The rollback test scenario comprehensively validates the migration cancellation workflow:

  • Initiates migration from MachineAPI to ClusterAPI
  • Cancels during Migrating state
  • Verifies rollback to MachineAPI with preserved synchronization state
  • Completes a subsequent successful migration to ClusterAPI

This provides valuable coverage for the administrator workflow described in the PR objectives.

pkg/controllers/machinesetmigration/machineset_migration_controller_test.go (4)

199-209: LGTM! Consistent SynchronizedAPI field integration across all test scenarios.

The updates consistently set and verify Status.SynchronizedAPI alongside Status.AuthoritativeAPI throughout the test suite, ensuring proper tracking of the synchronized state in all migration scenarios.

Also applies to: 246-246, 287-287, 322-329, 373-388, 429-436, 493-501, 534-542, 576-592, 627-627, 655-671, 697-697


704-814: Excellent migration cancellation test coverage.

The new test contexts comprehensively validate migration cancellation scenarios:

  • Cancellation back to MachineAPI from Migrating state
  • Cancellation back to ClusterAPI from Migrating state
  • Proper unpausing of target resources when cancelling

These tests ensure the cancellation detection logic (mentioned in PR objectives as IsMigrationCancellationRequested()) works correctly in both directions.


816-874: Comprehensive status initialization test coverage.

The initialization tests properly validate bootstrap logic for edge cases:

  • AuthoritativeAPI set but SynchronizedAPI empty
  • SynchronizedAPI set but AuthoritativeAPI empty

This coverage ensures the handleMigrationStatusInitialization() function (referenced in PR description) handles partial state correctly.


876-912: Good coverage for SynchronizedAPI preservation during state transitions.

The test validates that when transitioning from a stable state (ClusterAPI) to Migrating, the SynchronizedAPI field correctly preserves the source API, which is essential for the cancellation detection workflow.

@openshift-ci-robot

openshift-ci-robot commented Jan 8, 2026

@RadekManak: This pull request references OCPCLOUD-2998 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • New Features

  • Cancel in-progress machine and machineset migrations to revert to the previously synchronized API.

  • Track and expose which API (MachineAPI or ClusterAPI) a resource is synchronized with during migrations.

  • Refactor

  • Centralized status initialization and atomic updates for authoritative and synchronized API fields to improve migration state consistency.

  • Tests

  • Broadened test coverage for migration flows, rollbacks, and explicit synchronized-API verifications.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
pkg/controllers/machinemigration/machine_migration_controller_test.go (1)

899-959: LGTM: Good coverage of status initialization scenarios.

These tests handle important bootstrap cases where status fields are partially populated:

  • Lines 899-931: Initialize SynchronizedAPI when only AuthoritativeAPI is set
  • Lines 933-959: Derive AuthoritativeAPI when only SynchronizedAPI is set

Both scenarios correctly verify the controller fills in the missing field with the appropriate value.

💡 Optional: Consider verifying requeue behavior

The test descriptions mention "requeue" (line 922) but don't verify result.Requeue. While this appears to be a consistent pattern in the file (e.g., line 183), explicitly asserting the requeue behavior could make the tests more precise:

 It("should initialize SynchronizedAPI from AuthoritativeAPI and requeue", func() {
-    _, err := reconciler.Reconcile(ctx, req)
+    result, err := reconciler.Reconcile(ctx, req)
     Expect(err).NotTo(HaveOccurred())
+    Expect(result.Requeue).To(BeTrue(), "expected requeue after initialization")

     Eventually(k.Object(mapiMachine)).Should(SatisfyAll(

However, if the existing pattern is intentional (e.g., the requeue is implicit or tested via Eventually), this can be safely ignored.

pkg/controllers/machinesetmigration/machineset_migration_controller_test.go (1)

816-874: Test coverage is good but could be extended.

The tests verify initialization when one field is empty and can be derived from the other. However, consider adding test coverage for the more complex inference scenario in handleMigrationStatusInitialization where:

  • spec.AuthoritativeAPI != status.AuthoritativeAPI (migration requested)
  • status.AuthoritativeAPI is set but status.SynchronizedAPI is empty
  • The controller must infer the source API from the target

This would exercise the inference logic at lines 255-267 that I flagged in the controller review.

📝 Suggested additional test
Context("when migration is in progress but SynchronizedAPI is empty (inference scenario)", func() {
	BeforeEach(func() {
		By("Setting the MAPI machine set spec AuthoritativeAPI to ClusterAPI (migration target)")
		mapiMachineSet = mapiMachineSetBuilder.
			WithAuthoritativeAPI(mapiv1beta1.MachineAuthorityClusterAPI).
			Build()
		Eventually(k8sClient.Create(ctx, mapiMachineSet)).Should(Succeed())

		By("Creating mirror CAPI machine set")
		capiMachineSet = capiMachineSetBuilder.Build()
		Eventually(k8sClient.Create(ctx, capiMachineSet)).Should(Succeed())

		By("Setting AuthoritativeAPI to MachineAPI (source) but leaving SynchronizedAPI empty")
		Eventually(k.UpdateStatus(mapiMachineSet, func() {
			mapiMachineSet.Status.AuthoritativeAPI = mapiv1beta1.MachineAuthorityMachineAPI
			// SynchronizedAPI intentionally left empty - should be inferred as opposite of target
		})).Should(Succeed())

		req = reconcile.Request{NamespacedName: client.ObjectKeyFromObject(mapiMachineSet)}
	})

	It("should infer SynchronizedAPI as the opposite of the migration target", func() {
		_, err := reconciler.Reconcile(ctx, req)
		Expect(err).NotTo(HaveOccurred())

		Eventually(k.Object(mapiMachineSet)).Should(SatisfyAll(
			HaveField("Status.AuthoritativeAPI", Equal(mapiv1beta1.MachineAuthorityMachineAPI)),
			HaveField("Status.SynchronizedAPI", Equal(mapiv1beta1.MachineAPISynchronized)),
		))
	})
})
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 1466b09 and f081583.

🚧 Files skipped from review as they are similar to previous changes (5)
  • e2e/machine_migration_capi_authoritative_test.go
  • e2e/machineset_migration_mapi_authoritative_test.go
  • pkg/controllers/synccommon/suite_test.go
  • pkg/controllers/machinesetsync/machineset_sync_controller.go
  • pkg/controllers/synccommon/applyconfiguration.go
🧰 Additional context used
🧬 Code graph analysis (6)
e2e/machine_migration_helpers.go (1)
e2e/framework/framework.go (2)
  • WaitMedium (24-24)
  • RetryMedium (18-18)
pkg/controllers/machinesetmigration/machineset_migration_controller_test.go (2)
e2e/migration_common.go (1)
  • SynchronizedCondition (10-10)
pkg/controllers/common_consts.go (1)
  • SynchronizedCondition (41-41)
pkg/controllers/machinemigration/machine_migration_controller.go (1)
pkg/controllers/synccommon/migratestatus.go (4)
  • IsMigrationCancellationRequested (121-127)
  • AuthoritativeAPIToSynchronizedAPI (131-142)
  • ApplyMigrationStatus (63-79)
  • ApplyMigrationStatusAndResetSyncStatus (42-60)
pkg/controllers/machinesetmigration/machineset_migration_controller.go (2)
pkg/controllers/synccommon/migratestatus.go (4)
  • IsMigrationCancellationRequested (121-127)
  • ApplyMigrationStatus (63-79)
  • AuthoritativeAPIToSynchronizedAPI (131-142)
  • ApplyMigrationStatusAndResetSyncStatus (42-60)
pkg/conversion/mapi2capi/interface.go (1)
  • MachineSet (29-31)
pkg/controllers/synccommon/migratestatus_test.go (1)
pkg/controllers/synccommon/migratestatus.go (1)
  • IsMigrationCancellationRequested (121-127)
e2e/machineset_migration_helpers.go (2)
pkg/conversion/mapi2capi/interface.go (1)
  • MachineSet (29-31)
e2e/framework/framework.go (2)
  • WaitMedium (24-24)
  • RetryMedium (18-18)
🔇 Additional comments (29)
pkg/controllers/machinesync/machine_sync_controller.go (1)

1515-1515: LGTM! SynchronizedAPI field properly preserved during synchronization.

The added line correctly preserves the SynchronizedAPI status field from the existing MAPI machine to the converted MAPI machine, following the same pattern as other preserved fields (AuthoritativeAPI, SynchronizedGeneration, LastOperation, ProviderStatus). This ensures the synchronization state is maintained during CAPI→MAPI machine status updates.

pkg/controllers/machinemigration/machine_migration_controller_test.go (4)

20-20: LGTM: Helpful debugging aid.

The go-cmp import is used at line 225 to provide detailed diff output when the resource-version assertion fails, which aids in debugging test failures.


201-770: LGTM: Comprehensive updates to existing migration tests.

All existing migration scenarios have been systematically updated to handle the new SynchronizedAPI status field:

  • Test setup properly initializes SynchronizedAPI alongside AuthoritativeAPI using the builder pattern
  • Assertions verify both fields in migration completion scenarios (lines 678, 765)
  • The defensive use of cmp.Diff at line 225 provides helpful debugging context when assertions fail
  • The pattern is consistent across all migration directions (MachineAPI ↔ ClusterAPI)

772-897: LGTM: Excellent coverage of migration cancellation scenarios.

The new cancellation tests thoroughly exercise the rollback functionality:

  • Lines 772-806: Cancel migration back to MachineAPI
  • Lines 808-841: Cancel migration back to ClusterAPI
  • Lines 843-896: Cancel to ClusterAPI with proper unpause of CAPI resources

Each scenario correctly:

  1. Simulates a stuck migration state (status.AuthoritativeAPI: Migrating)
  2. Sets status.SynchronizedAPI to reflect the last good API
  3. Reverts spec.AuthoritativeAPI to trigger cancellation detection
  4. Verifies the controller transitions back to the synchronized state
  5. Explicitly confirms no requeue is needed (lines 799, 834, 877)

The paused-resource scenario (lines 843-896) is particularly valuable, ensuring that CAPI resources are properly unpaused when rolling back.


961-999: LGTM: Important invariant verification for migration transitions.

This test verifies a critical behavior: when transitioning from a stable state (ClusterAPI) to Migrating, the SynchronizedAPI field correctly preserves the last synchronized API rather than following AuthoritativeAPI to Migrating.

This preservation is essential for the cancellation detection logic, which relies on comparing spec.AuthoritativeAPI with status.SynchronizedAPI to detect rollback requests.

pkg/conversion/test/fuzz/fuzz.go (1)

764-764: LGTM! Consistent handling of MAPI-only field in fuzz tests.

The SynchronizedAPI field is correctly cleared during fuzzing since it has no CAPI equivalent, consistent with how other MAPI-only fields like AuthoritativeAPI are handled.

Also applies to: 811-811

e2e/machine_migration_helpers.go (2)

172-180: LGTM! Well-structured helper for migration state verification.

The function correctly validates both the Migrating state and the expected SynchronizedAPI value in a single assertion, with clear error messaging.


347-354: LGTM! Clean and focused helper function.

The function provides a reusable way to verify SynchronizedAPI status with proper timeout handling and descriptive messaging.

pkg/controllers/machinesetmigration/machineset_migration_controller.go (5)

138-159: LGTM! Migration cancellation logic is sound.

The code correctly detects cancellation requests and ensures the rollback target is unpaused before transitioning. The call to ensureUnpauseRequestedOnNewAuthoritativeResource handles cases where the migration was cancelled early, before all pausing completed.


179-183: LGTM! Correct synchronization state capture.

The code properly captures the current authoritative API as the synchronized state before transitioning to Migrating, creating the necessary breadcrumb for potential rollback.


216-223: LGTM! Proper migration completion.

The code correctly updates both authoritativeAPI and synchronizedAPI atomically to reflect the new stable state after successful migration, with appropriate sync status reset.
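
The two transitions described in the comments on lines 179-183 and 216-223 can be sketched as a pair of pure functions. The migrationStatus struct, the function names, and the stand-in types are assumptions made for illustration; the controller itself applies these values through status patches.

```go
package main

import "fmt"

// Local stand-ins (assumptions) for the openshift/api types.
type MachineAuthority string
type SynchronizedAPI string

const (
	MachineAuthorityMachineAPI MachineAuthority = "MachineAPI"
	MachineAuthorityClusterAPI MachineAuthority = "ClusterAPI"
	MachineAuthorityMigrating  MachineAuthority = "Migrating"

	MachineAPISynchronized SynchronizedAPI = "MachineAPI"
	ClusterAPISynchronized SynchronizedAPI = "ClusterAPI"
)

// migrationStatus holds the two status fields that must change together.
type migrationStatus struct {
	AuthoritativeAPI MachineAuthority
	SynchronizedAPI  SynchronizedAPI
}

// toSynchronizedAPI maps a stable authority to its synchronized value.
func toSynchronizedAPI(a MachineAuthority) SynchronizedAPI {
	switch a {
	case MachineAuthorityMachineAPI:
		return MachineAPISynchronized
	case MachineAuthorityClusterAPI:
		return ClusterAPISynchronized
	default:
		return ""
	}
}

// beginMigration records the current authority as the synchronized API
// before the status moves to Migrating, leaving a breadcrumb for rollback.
func beginMigration(current migrationStatus) migrationStatus {
	return migrationStatus{
		AuthoritativeAPI: MachineAuthorityMigrating,
		SynchronizedAPI:  toSynchronizedAPI(current.AuthoritativeAPI),
	}
}

// completeMigration sets both fields to the new stable state in one value,
// mirroring the single atomic status patch.
func completeMigration(target MachineAuthority) migrationStatus {
	return migrationStatus{
		AuthoritativeAPI: target,
		SynchronizedAPI:  toSynchronizedAPI(target),
	}
}

func main() {
	stable := migrationStatus{MachineAuthorityMachineAPI, MachineAPISynchronized}
	migrating := beginMigration(stable)
	fmt.Printf("%+v\n", migrating)
	fmt.Printf("%+v\n", completeMigration(MachineAuthorityClusterAPI))
}
```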


424-432: LGTM! Clean patch helper wrappers.

The helper functions provide a clean interface for atomic status updates, properly delegating to the shared migration status utilities.


255-267: The proposed validation is ineffective and doesn't address the actual risk in the inference logic.

The if/else structure (lines 258-262) guarantees that synchronizedAPI will always be set to one of two valid enum values (ClusterAPISynchronized or MachineAPISynchronized), making the proposed validation redundant—it will never catch an error.

The actual concern should be the input validation, not the output. If spec.AuthoritativeAPI (targetAPI) is MachineAuthorityMigrating or some other unexpected value, the code silently defaults to MachineAPISynchronized, which may be incorrect. The inference logic only works safely when targetAPI is guaranteed to be either MachineAuthorityMachineAPI or MachineAuthorityClusterAPI.

Consider validating that targetAPI matches one of the two expected values before the inference:

if targetAPI != mapiv1beta1.MachineAuthorityMachineAPI && targetAPI != mapiv1beta1.MachineAuthorityClusterAPI {
	return &ctrl.Result{}, fmt.Errorf("cannot infer synchronizedAPI from unexpected targetAPI: %s", targetAPI)
}
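
For illustration, the same guard can be folded into a standalone inference function. This is a sketch only: inferSynchronizedAPI is a hypothetical name, the types are local stand-ins for the openshift/api definitions, and the controller's actual return type (*ctrl.Result plus error) is replaced here by a plain value and error.

```go
package main

import "fmt"

// Local stand-ins (assumptions) for the openshift/api types.
type MachineAuthority string
type SynchronizedAPI string

const (
	MachineAuthorityMachineAPI MachineAuthority = "MachineAPI"
	MachineAuthorityClusterAPI MachineAuthority = "ClusterAPI"
	MachineAuthorityMigrating  MachineAuthority = "Migrating"

	MachineAPISynchronized SynchronizedAPI = "MachineAPI"
	ClusterAPISynchronized SynchronizedAPI = "ClusterAPI"
)

// inferSynchronizedAPI infers the migration source from the requested
// target. Unexpected targets produce an error instead of a silent default.
func inferSynchronizedAPI(targetAPI MachineAuthority) (SynchronizedAPI, error) {
	switch targetAPI {
	case MachineAuthorityMachineAPI:
		// Migrating to MachineAPI means ClusterAPI was the source.
		return ClusterAPISynchronized, nil
	case MachineAuthorityClusterAPI:
		// Migrating to ClusterAPI means MachineAPI was the source.
		return MachineAPISynchronized, nil
	default:
		return "", fmt.Errorf("cannot infer synchronizedAPI from unexpected targetAPI: %s", targetAPI)
	}
}

func main() {
	fmt.Println(inferSynchronizedAPI(MachineAuthorityClusterAPI))
	fmt.Println(inferSynchronizedAPI(MachineAuthorityMigrating))
}
```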
pkg/controllers/machinemigration/machine_migration_controller.go (2)

138-159: LGTM! Consistent migration controller implementation.

The migration cancellation detection, synchronizedAPI capture, completion logic, and helper functions correctly mirror the MachineSet controller implementation with appropriate handling for Machine resources.

Also applies to: 179-183, 219-226, 464-472


255-267: Remove the proposed validation—it's redundant and adds no safety.

The inference logic is intentional and correct. In a binary system with only two possible authority states (MachineAPI or ClusterAPI), knowing the target authority reveals the source. The if-else statement deterministically assigns one of the two valid SynchronizedAPI constants, so a runtime validation on the result can never fail. The function already guarantees valid values through its type constraints and logic.

If there are concerns about the inference pattern itself (whether inverting target to infer source is always correct during migrations), that would require domain-specific review, but the proposed code does not address such concerns.

pkg/controllers/machinesetmigration/machineset_migration_controller_test.go (2)

704-814: LGTM! Comprehensive migration cancellation test coverage.

The test contexts thoroughly exercise cancellation scenarios including:

  • Cancellation back to both MachineAPI and ClusterAPI
  • Verification that paused resources are properly unpaused during cancellation
  • Proper state transitions during rollback

876-912: LGTM! Good coverage of Migrating state transition.

The test verifies that when transitioning to Migrating state, the controller correctly preserves the SynchronizedAPI as a snapshot of the source state, which is essential for potential migration cancellation.

e2e/machineset_migration_capi_authoritative_test.go (2)

164-170: LGTM! Good addition of synchronization state verification.

The verifyMachineSetSynchronizedAPI call appropriately verifies that the status reflects MachineAPISynchronized after the authority switch to MachineAPI. This aligns well with the existing verification pattern for paused conditions and synchronized conditions.


203-209: LGTM! Consistent synchronization verification after authority switch.

The verification correctly asserts that the MachineSet's status.synchronizedAPI is set to ClusterAPISynchronized after switching authority back to ClusterAPI, maintaining consistency with the earlier verification at Line 169.

pkg/controllers/synccommon/migratestatus_test.go (1)

26-74: LGTM! Comprehensive test coverage for migration cancellation detection.

The table-driven tests thoroughly cover all critical scenarios:

  • Both directions of migration cancellation (MAPI→CAPI and CAPI→MAPI)
  • In-progress migrations that should not be cancelled
  • Pre-migration states

The test logic correctly validates the cancellation detection algorithm where cancellation is identified when status.authoritativeAPI is Migrating and the spec.authoritativeAPI matches status.synchronizedAPI.

e2e/machineset_migration_helpers.go (2)

221-228: LGTM! Well-structured verification helper.

The verifyMachineSetSynchronizedAPI function follows the established pattern from other verification helpers like verifyMachineSetAuthoritative (Lines 94-101), using appropriate timeouts (WaitMedium/RetryMedium) and clear assertion messages.


230-246: LGTM! Clean refactor using WithTransform pattern.

The introduction of getAWSProviderSpecFromMachineSet as an extraction helper and its use via WithTransform in verifyMAPIMachineSetProviderSpec (Line 234) is an idiomatic Gomega pattern that improves testability and readability by separating extraction logic from assertion logic.

e2e/machine_migration_mapi_authoritative_test.go (3)

27-27: LGTM! Non-functional test structure improvement.

The change from Describe to Context blocks better organizes the test hierarchy and aligns with Ginkgo best practices for distinguishing between test suites (Describe) and test scenarios (Context).

Also applies to: 65-65, 136-136


162-193: LGTM! Comprehensive synchronization state verification throughout the round trip.

The additions of verifyMachineSynchronizedAPI at Lines 166, 178, and 190 ensure that the status.synchronizedAPI field correctly reflects the current synchronized state at each stage of the MAPI→CAPI→MAPI round trip. This provides valuable coverage for the synchronization tracking feature.


209-288: LGTM! Excellent test coverage for migration rollback scenario.

This new test suite validates a critical user workflow described in the PR objectives: canceling a stuck migration by reverting spec.authoritativeAPI to the last successfully synchronized state. The test comprehensively verifies:

  1. Entry into the Migrating state with proper synchronizedAPI tracking (Line 245)
  2. Cancellation detection and rollback when spec is reverted (Lines 247-262)
  3. Successful completion of a subsequent full migration after rollback (Lines 264-275)
  4. Proper cleanup of all resources (Lines 277-286)

The test flow matches the administrator workflow described in the PR objectives and provides strong validation of the cancellation mechanism.

pkg/controllers/synccommon/migratestatus.go (4)

42-60: LGTM! Clean extension to support atomic synchronizedAPI updates.

The addition of the synchronizedAPI parameter and the call to WithSynchronizedAPI (Line 57) enables atomic updates of both authoritativeAPI and synchronizedAPI fields during migration state transitions, which is essential for consistent state management.


63-79: LGTM! Consistent pattern for synchronizedAPI propagation.

The function signature extension and WithSynchronizedAPI call (Line 76) mirror the pattern in ApplyMigrationStatusAndResetSyncStatus, maintaining consistency across migration status update paths.


116-127: LGTM! Correct cancellation detection logic.

The IsMigrationCancellationRequested function correctly identifies when an administrator has reverted spec.authoritativeAPI to the last successfully synchronized state while status.authoritativeAPI is still Migrating. This implements the cancellation workflow described in the PR objectives.

The logic is validated by the comprehensive test coverage in migratestatus_test.go.
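
The detection rule can be sketched as follows. This is an illustration only: the real IsMigrationCancellationRequested signature is not shown in this conversation, so `cancellationRequested`, its parameters, and the locally defined types are assumptions, as is the guard that treats an empty synchronizedAPI as "no cancellation".

```go
package main

import "fmt"

// Local stand-ins for the API types; the vendored definitions may differ.
type MachineAuthority string
type SynchronizedAPI string

const (
	MachineAuthorityMachineAPI MachineAuthority = "MachineAPI"
	MachineAuthorityClusterAPI MachineAuthority = "ClusterAPI"
	MachineAuthorityMigrating  MachineAuthority = "Migrating"

	MachineAPISynchronized SynchronizedAPI = "MachineAPI"
	ClusterAPISynchronized SynchronizedAPI = "ClusterAPI"
)

// cancellationRequested (hypothetical name) reports whether the spec has been
// reverted to the last synchronized API while the status is still Migrating.
func cancellationRequested(spec, status MachineAuthority, synced SynchronizedAPI) bool {
	// Assumption: without a recorded synchronized API there is nothing to revert to.
	if status != MachineAuthorityMigrating || synced == "" {
		return false
	}
	return string(spec) == string(synced)
}

func main() {
	cases := []struct {
		name   string
		spec   MachineAuthority
		status MachineAuthority
		synced SynchronizedAPI
	}{
		{"cancel MAPI to CAPI", MachineAuthorityMachineAPI, MachineAuthorityMigrating, MachineAPISynchronized},
		{"cancel CAPI to MAPI", MachineAuthorityClusterAPI, MachineAuthorityMigrating, ClusterAPISynchronized},
		{"migration in progress", MachineAuthorityClusterAPI, MachineAuthorityMigrating, MachineAPISynchronized},
		{"before migration", MachineAuthorityClusterAPI, MachineAuthorityMachineAPI, MachineAPISynchronized},
	}
	for _, c := range cases {
		fmt.Printf("%-22s %v\n", c.name, cancellationRequested(c.spec, c.status, c.synced))
	}
}
```

The cases in `main` mirror the scenarios listed for the table-driven tests: both cancellation directions, an in-progress migration, and a pre-migration state.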


129-142: LGTM! Straightforward and correct authority-to-synchronization mapping.

The AuthoritativeAPIToSynchronizedAPI function provides a clean mapping from MachineAuthority to SynchronizedAPI values. The explicit handling of MachineAuthorityMigrating returning an empty string (Line 138) is appropriate since "Migrating" is a transient state that doesn't represent a synchronized API.
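
A minimal sketch of such a mapping is shown below, using local stand-in types rather than the vendored API package. The handling of unknown values in the `default` branch is an assumption; the review only confirms the two concrete authorities and the empty result for Migrating.

```go
package main

import "fmt"

// Local stand-ins for the API types; the vendored definitions may differ.
type MachineAuthority string
type SynchronizedAPI string

const (
	MachineAuthorityMachineAPI MachineAuthority = "MachineAPI"
	MachineAuthorityClusterAPI MachineAuthority = "ClusterAPI"
	MachineAuthorityMigrating  MachineAuthority = "Migrating"

	MachineAPISynchronized SynchronizedAPI = "MachineAPI"
	ClusterAPISynchronized SynchronizedAPI = "ClusterAPI"
)

// authoritativeAPIToSynchronizedAPI maps an authority to the matching
// synchronized-API value. Migrating is transient, so it maps to "".
func authoritativeAPIToSynchronizedAPI(a MachineAuthority) SynchronizedAPI {
	switch a {
	case MachineAuthorityMachineAPI:
		return MachineAPISynchronized
	case MachineAuthorityClusterAPI:
		return ClusterAPISynchronized
	case MachineAuthorityMigrating:
		return ""
	default:
		// Assumption: unknown or empty authorities also map to "".
		return ""
	}
}

func main() {
	for _, a := range []MachineAuthority{
		MachineAuthorityMachineAPI,
		MachineAuthorityClusterAPI,
		MachineAuthorityMigrating,
	} {
		fmt.Printf("%q -> %q\n", a, authoritativeAPIToSynchronizedAPI(a))
	}
}
```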

@RadekManak
Contributor Author

/test unit

@RadekManak
Contributor Author

/payload required

@openshift-ci
Contributor

openshift-ci bot commented Jan 8, 2026

@RadekManak: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

@RadekManak
Contributor Author

/pipeline required

@openshift-ci-robot

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aws-capi-techpreview
/test e2e-aws-ovn
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-aws-ovn-techpreview
/test e2e-aws-ovn-techpreview-upgrade
/test e2e-azure-capi-techpreview
/test e2e-azure-ovn-techpreview
/test e2e-azure-ovn-techpreview-upgrade
/test e2e-gcp-capi-techpreview
/test e2e-gcp-ovn-techpreview
/test e2e-metal3-capi-techpreview
/test e2e-openstack-capi-techpreview
/test e2e-openstack-ovn-techpreview
/test e2e-vsphere-capi-techpreview
/test regression-clusterinfra-aws-ipi-techpreview-capi

@damdo
Member

damdo commented Jan 8, 2026

/assign @mdbooth

@damdo
Member

damdo commented Jan 8, 2026

@RadekManak are the vendor and verify expected to fail for now?

@RadekManak
Contributor Author

go: github.com/openshift/api@v0.0.0-20260105114749-aae5635a71a7 (replaced by ../api): reading ../api/go.mod: open /go/src/github.com/openshift/api/go.mod: no such file or directory

Yes, I switched to using local project folders for development instead of pushing to my fork. I'll correct this when the API PR is merged.

@RadekManak
Contributor Author

/testwith openshift/cluster-capi-operator/master/e2e-aws-capi-techpreview openshift/machine-api-operator#1442

@openshift-ci
Contributor

openshift-ci bot commented Jan 12, 2026

@RadekManak, testwith: could not generate prow job. ERROR:

could not determine ci op config from metadata: got unexpected http 404 status code from configresolver: failed to get config: could not find any config for branch master on repo openshift/cluster-capi-operator

@RadekManak
Contributor Author

/testwith openshift/cluster-capi-operator/main/e2e-aws-capi-techpreview openshift/machine-api-operator#1442

Contributor

@mdbooth mdbooth left a comment

We just discussed this. Unfortunately I don't think this approach is correct.

The purpose of the SynchronizedAPI field is to eliminate the heuristic in isSynchronized which would fail if we were reverting an incomplete migration. Specifically, the SynchronizedAPI field defines deterministically exactly which object the value of SynchronizedGeneration refers to. Therefore SynchronizedAPI should be written by the sync controller whenever we write SynchronizedGeneration.

Additionally, we should ensure that each field is owned by a particular controller and is never touched by any other controller. Therefore the migration controller should not write to this value even to initialise it. It should just wait for the sync controller to initialise it if it is required.

With these changes in place, the additional code path in the migration controller guarded by IsMigrationCancellationRequested is not required. As long as we can unambiguously determine the synchronisation state in all circumstances, the existing logic is already correct. Therefore we should just need to update the logic in isSynchronized to use the SynchronizedAPI field and therefore no longer require the invalid heuristic.
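
One way to picture the suggestion is sketched below. It is not the repository's isSynchronized: the `syncStatus` struct, the function signature, and the choice to report "not synchronized" while the field is unset are all assumptions made for illustration. The point is only that SynchronizedAPI selects which object's generation SynchronizedGeneration is compared against, so no heuristic is needed.

```go
package main

import "fmt"

// Local stand-in for the API type; the vendored definition may differ.
type SynchronizedAPI string

const (
	MachineAPISynchronized SynchronizedAPI = "MachineAPI"
	ClusterAPISynchronized SynchronizedAPI = "ClusterAPI"
)

// syncStatus is a hypothetical slice of the status written by the sync
// controller: both fields are assumed to be written together.
type syncStatus struct {
	SynchronizedAPI        SynchronizedAPI
	SynchronizedGeneration int64
}

// isSynchronized compares SynchronizedGeneration with the generation of the
// object that SynchronizedAPI names, instead of guessing which one it means.
func isSynchronized(status syncStatus, mapiGeneration, capiGeneration int64) bool {
	switch status.SynchronizedAPI {
	case MachineAPISynchronized:
		return status.SynchronizedGeneration == mapiGeneration
	case ClusterAPISynchronized:
		return status.SynchronizedGeneration == capiGeneration
	default:
		// Assumption: an unset field means the sync controller has not
		// initialised it yet, so the caller should wait.
		return false
	}
}

func main() {
	status := syncStatus{SynchronizedAPI: MachineAPISynchronized, SynchronizedGeneration: 3}
	fmt.Println(isSynchronized(status, 3, 7))
	fmt.Println(isSynchronized(status, 4, 3))
	fmt.Println(isSynchronized(syncStatus{}, 0, 0))
}
```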

@openshift-ci-robot

openshift-ci-robot commented Jan 15, 2026

@RadekManak: This pull request references OCPCLOUD-2998 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

Release Notes

  • Bug Fixes

  • Improved synchronization state tracking during machine and machineset migrations across API authority transitions.

  • Enhanced machine migration rollback handling and recovery scenarios.

  • Tests

  • Added comprehensive verification tests for machine migration scenarios, including rollback and synchronization state validation.

  • Extended test coverage for authority transitions and resynchronization workflows.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci bot commented Jan 15, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from mdbooth. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@RadekManak
Contributor Author

/testwith openshift/cluster-capi-operator/main/e2e-aws-capi-techpreview openshift/machine-api-operator#1442

@damdo damdo requested a review from mdbooth January 20, 2026 21:59

@RadekManak
Contributor Author

/hold
There are still issues with the rollback scenario.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 28, 2026

Move FORCE prerequisite before the pipe (|) to make it a regular
prerequisite instead of order-only. This ensures binaries are always
rebuilt when running 'make build', allowing Go's build system to handle
incremental compilation based on source changes.

@openshift-ci-robot

openshift-ci-robot commented Feb 11, 2026

@RadekManak: This pull request references OCPCLOUD-2998 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

Release Notes

  • New Features

  • Added API synchronization state tracking to monitor which authority (MachineAPI or ClusterAPI) a resource is synchronized with during migration operations.

  • Implemented migration cancellation logic to automatically revert in-progress migrations when authorities align, improving reliability.

  • Tests

  • Expanded end-to-end test coverage to verify API synchronization state across machine and machineset migration scenarios.

  • Enhanced unit tests to validate synchronization status during authority transitions.

@openshift-ci
Contributor

openshift-ci bot commented Feb 11, 2026

@RadekManak: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                                  Commit   Details  Required  Rerun command
ci/prow/e2e-azure-ovn-techpreview-upgrade  f081583  link     true      /test e2e-azure-ovn-techpreview-upgrade
ci/prow/e2e-aws-capi-techpreview           f081583  link     true      /test e2e-aws-capi-techpreview
ci/prow/e2e-metal3-capi-techpreview        f081583  link     false     /test e2e-metal3-capi-techpreview
ci/prow/e2e-openstack-ovn-techpreview      f081583  link     true      /test e2e-openstack-ovn-techpreview
ci/prow/e2e-gcp-ovn-techpreview            f081583  link     true      /test e2e-gcp-ovn-techpreview
ci/prow/verify-deps                        1b3828b  link     true      /test verify-deps
ci/prow/vendor                             1b3828b  link     true      /test vendor
ci/prow/lint                               1b3828b  link     true      /test lint

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
