
Conversation

@skeeey (Contributor) commented Dec 11, 2025

No description provided.

coderabbitai bot commented Dec 11, 2025

Walkthrough

Ensure payload.metadata.name is set to the Resource ID before CloudEvent encoding; add unit tests for codec behavior; switch Maestro gRPC broker to use the codec for encode/decode; add E2E test scenarios and helpers; update docs and example JSON name fields.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **CloudEvents codec implementation**: `pkg/client/cloudevents/codec.go` | Adds a non-exported `resetPayloadMetadataNameWithResID(res *api.Resource) error` that, when the payload contains the work metadata extension, asserts it is a `map[string]interface{}` and sets `metadata["name"] = res.ID`; `Encode` now calls this helper before building the CloudEvent (see the sketch after this table). Minor import formatting change. |
| **CloudEvents codec tests**: `pkg/client/cloudevents/codec_test.go` | New unit tests covering codec creation, data-type reporting, Encode/Decode scenarios (including deletion-timestamp handling), metadata-reset behavior and error paths, and encode→decode roundtrip validations. |
| **gRPC broker refactor to codec**: `cmd/maestro/server/grpc_broker.go` | Replaces manual CloudEvent extension extraction/assembly with `NewCodec(source).Decode(...)` and `NewCodec(source).Encode(...)`; introduces `const source = "maestro"` and imports the cloudevents client; removes the prior per-field extension wiring and direct `cetypes` usage. |
| **End-to-end test enhancements**: `test/e2e/pkg/sourceclient_test.go` | Adds a monitor-with-two-sources e2e scenario that runs two independent source work clients against one Deployment; introduces two new file-level test helpers, `AssertReplicas` and `NewReadonlyWork`; adds related imports. |
| **Test suite import tweak**: `test/e2e/pkg/suite_test.go` | Adjusts imports (`google.golang.org/grpc` added, `k8s.io/klog/v2` reordered); no runtime logic change. |
| **Documentation updates**: `docs/maestro.md` | Notes that Maestro uses the Resource record ID as the ManifestWork name when publishing the Resource CloudEvent after persisting the record. |
| **Resource example JSON updates**: `docs/resources/appliedmanifestwork.json`, `docs/resources/cloudevents.spec.maestro.json`, `docs/resources/cloudevents.status.maestro.json` | Updates embedded `metadata.name` / `manifestWorkName` UUID values; `cloudevents.status.maestro.json` also adds `sequenceid` and `statushash` fields. No structural API changes. |
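
To make the codec change concrete, here is a minimal, runnable sketch of the helper as the walkthrough describes it. The `Resource` stand-in and the extension key value are assumptions for illustration; the real types live in `pkg/api` and the CloudEvents SDK (`cetypes.ExtensionWorkMeta`), and the actual implementation may differ in detail.

```go
package main

import "fmt"

// Stand-ins for api.Resource and cetypes.ExtensionWorkMeta (assumed values,
// not the repository's definitions).
const extensionWorkMeta = "metadata"

type Resource struct {
	ID      string
	Payload map[string]interface{}
}

// resetPayloadMetadataNameWithResID mirrors the behavior in the walkthrough:
// a no-op when the work metadata extension is absent, an error when it has an
// unexpected type, otherwise an in-place rename to the Resource ID.
func resetPayloadMetadataNameWithResID(res *Resource) error {
	metadata, ok := res.Payload[extensionWorkMeta]
	if !ok {
		return nil
	}
	metadataMap, ok := metadata.(map[string]interface{})
	if !ok {
		return fmt.Errorf("unexpected work metadata type %T in resource %s", metadata, res.ID)
	}
	metadataMap["name"] = res.ID
	return nil
}

func main() {
	res := &Resource{
		ID:      "1f3d6a9c-0c2a-4c8e-9d8b-2f6a1e3b5c7d", // illustrative UUID
		Payload: map[string]interface{}{extensionWorkMeta: map[string]interface{}{"name": "user-chosen-name"}},
	}
	if err := resetPayloadMetadataNameWithResID(res); err != nil {
		panic(err)
	}
	// The manifest name now matches the Resource ID.
	fmt.Println(res.Payload[extensionWorkMeta].(map[string]interface{})["name"])
}
```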

Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Maestro as Maestro gRPC Broker
    participant Codec as CloudEvents Codec
    participant CE as CloudEvent (wire)
    participant Store as Resource Store / DB

    Maestro->>Store: persist Resource record (get res.ID)
    Maestro->>Codec: Encode(res) -- calls resetPayloadMetadataNameWithResID(res)
    Codec->>Codec: set payload.metadata.name = res.ID
    Codec->>CE: produce CloudEvent (with extensions: resourceid, resourceversion, clustername, deletionTimestamp?)
    CE-->>Maestro: CloudEvent published
    Note over Maestro,Codec: Reverse flow for status decode
    CE->>Codec: incoming CloudEvent
    Codec->>Maestro: Decode -> return ResourceStatus (extensions mapped)
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Verify resetPayloadMetadataNameWithResID type assertions, error messages, and behavior when ExtensionWorkMeta is absent.
  • Confirm Encode calls helper before CloudEvent construction so manifest name matches res.ID.
  • Review cmd/maestro/server/grpc_broker.go codec-based encode/decode to ensure all CloudEvent extensions and deletion-timestamp semantics are preserved.
  • Validate new unit tests for determinism and coverage of error paths.
  • Check e2e additions (AssertReplicas, NewReadonlyWork) for correctness and potential flakiness.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 13.33%, below the required threshold of 80.00%. | Run `@coderabbitai generate docstrings` to improve docstring coverage. |
| Description check | ❓ Inconclusive | No pull request description was provided by the author, so relevance to the changeset cannot be assessed. | Add a description explaining the rationale, implementation details, and impact of using the resource ID as the work metadata name. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The pull request title directly summarizes the main change: using the resource ID as the work metadata name, which aligns with the core modifications across codec, tests, and documentation. |
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between b03b4c1 and 9fc6e03.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • docs/resources/appliedmanifestwork.json (1 hunks)
  • docs/resources/cloudevents.spec.maestro.json (1 hunks)
  • docs/resources/cloudevents.status.maestro.json (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • pkg/client/cloudevents/codec_test.go
  • pkg/client/cloudevents/codec.go
  • cmd/maestro/server/grpc_broker.go
  • test/e2e/pkg/suite_test.go
  • docs/resources/appliedmanifestwork.json
🧰 Additional context used
🧬 Code graph analysis (1)
test/e2e/pkg/sourceclient_test.go (2)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: upgrade
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-with-istio
🔇 Additional comments (9)
docs/resources/cloudevents.spec.maestro.json (1)

77-77: LGTM! Documentation correctly reflects resource ID-based naming.

The metadata.name field now matches the resourceid extension attribute, accurately documenting the PR's objective of using the resource ID as the work metadata name.

docs/resources/cloudevents.status.maestro.json (1)

144-144: LGTM! Status CloudEvent documentation updated consistently.

The metadata.name field correctly matches the resourceid (line 146), maintaining consistency with the spec CloudEvent documentation.

test/e2e/pkg/sourceclient_test.go (6)

3-35: LGTM! Imports appropriately support the new test scenario.

The added imports (encoding/json, slice utilities, appsv1, runtime, ptr, and workv1client) are all utilized in the new two-source monitoring test and helper functions.


608-668: LGTM! BeforeEach setup is well-structured.

The setup correctly:

  • Creates isolated source IDs with random suffixes to prevent test collisions
  • Configures gRPC auth rules for both source clients
  • Establishes a baseline deployment on the agent side
  • Initializes two independent source clients with separate watchers

670-688: LGTM! AfterEach cleanup is robust.

The cleanup correctly handles:

  • Deletion attempts from both source clients
  • NotFound errors without failing the test
  • Deployment cleanup with error handling
  • Watch context cancellation

This addresses the previous suggestion for robust cleanup to prevent resource leaks.


690-806: LGTM! Test logic comprehensively validates multi-source independence.

The test effectively verifies:

  • Two source clients create independent resources despite using the same work name
  • Two separate AppliedManifestWorks are created with distinct UIDs
  • Status updates propagate to both watchers independently
  • Deleting one work doesn't affect the other
  • The underlying deployment is not recreated

The test design correctly leverages the sourceID-based composite key scoping to ensure resource independence.


932-959: Helper function works for current test scenarios.

The AssertReplicas helper correctly:

  • Finds the latest watched work by name (reverse iteration)
  • Checks for synced status feedback
  • Verifies the expected replica count is present

Note: The string-based matching at line 952 was previously discussed. While it could theoretically produce false positives (e.g., checking for 1 might match 10), the developer chose to keep the simpler implementation for the controlled E2E test scenarios.
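
The false-positive risk is easy to see in a standalone sketch (simplified plain JSON here, not the escaped form the test matches against):

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	feedback := `{"readyReplicas":10}`
	// A substring check for replicas == 1 also matches the 10-replica feedback.
	fmt.Println(strings.Contains(feedback, `"readyReplicas":1`)) // true
}
```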


1005-1058: LGTM! NewReadonlyWork helper correctly constructs a monitoring-only manifest.

The helper appropriately:

  • Generates a unique work name to prevent collisions
  • Includes a test label for cleanup and filtering
  • Creates a minimal Deployment manifest reference (TypeMeta + ObjectMeta only)
  • Configures read-only monitoring via UpdateStrategyTypeReadOnly
  • Captures full resource status via the @ JSONPath

This is the correct approach for monitoring an existing deployment without modifying it, which aligns with the test's objective of verifying status propagation across multiple source clients.
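
The helper itself is not shown in this comment; below is a hedged sketch of what such a read-only work typically looks like with the open-cluster-management `work/v1` API. The function name, label key, and namespace are assumptions, not the PR's values.

```go
package main

import (
	"encoding/json"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	workv1 "open-cluster-management.io/api/work/v1"
)

// newReadonlyWork builds a monitoring-only ManifestWork: the manifest carries
// only TypeMeta+ObjectMeta, the update strategy is ReadOnly, and a JSONPath
// feedback rule captures the whole status via "@".
func newReadonlyWork(name, namespace, deployName string) (*workv1.ManifestWork, error) {
	deployRef := map[string]interface{}{
		"apiVersion": "apps/v1",
		"kind":       "Deployment",
		"metadata":   map[string]interface{}{"name": deployName, "namespace": namespace},
	}
	raw, err := json.Marshal(deployRef)
	if err != nil {
		return nil, err
	}
	return &workv1.ManifestWork{
		ObjectMeta: metav1.ObjectMeta{
			Name:   name,
			Labels: map[string]string{"maestro.e2e/test": "readonly"}, // illustrative label
		},
		Spec: workv1.ManifestWorkSpec{
			Workload: workv1.ManifestsTemplate{
				Manifests: []workv1.Manifest{{RawExtension: runtime.RawExtension{Raw: raw}}},
			},
			ManifestConfigs: []workv1.ManifestConfigOption{{
				ResourceIdentifier: workv1.ResourceIdentifier{
					Group:     "apps",
					Resource:  "deployments",
					Namespace: namespace,
					Name:      deployName,
				},
				// ReadOnly: the agent watches the deployment without applying changes.
				UpdateStrategy: &workv1.UpdateStrategy{Type: workv1.UpdateStrategyTypeReadOnly},
				FeedbackRules: []workv1.FeedbackRule{{
					Type:      workv1.JSONPathsType,
					JsonPaths: []workv1.JsonPath{{Name: "status", Path: "@"}},
				}},
			}},
		},
	}, nil
}

func main() {
	work, err := newReadonlyWork("monitor-work", "default", "nginx")
	if err != nil {
		panic(err)
	}
	fmt.Println(work.Name)
}
```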

docs/maestro.md (1)

189-190: LGTM! Documentation clearly describes the naming behavior.

The addition accurately documents that "Maestro server uses the Resource record ID as the ManifestWork name," which is the core objective of this PR. The placement after step 4 provides clear context about when this naming is applied.


Comment @coderabbitai help to get the list of available commands and usage tips.

@skeeey (Contributor, Author) commented Dec 11, 2025

/hold

coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (6)
pkg/client/cloudevents/codec.go (1)

120-135: Consider defensive nil check for res.Payload.

The function assumes res.Payload is non-nil. While the caller context (Encode) appears to have the same assumption, adding a defensive check could prevent panics from unexpected nil values.

 func resetPayloadMetadataNameWithResID(res *api.Resource) error {
+	if res.Payload == nil {
+		return nil
+	}
+
 	metadata, ok := res.Payload[cetypes.ExtensionWorkMeta]
 	if !ok {
pkg/client/cloudevents/codec_test.go (3)

130-141: Test case doesn't fully verify the metadata name reset.

The validation function logs the metadata extension but doesn't assert that the metadata name was actually reset to the resource ID. Consider adding an explicit assertion.

 			validateEvent: func(t *testing.T, evt *cloudevents.Event) {
 				ext := evt.Extensions()
 				// Check resource version - CloudEvents stores numeric extensions as int32
 				if version, ok := ext[cetypes.ExtensionResourceVersion].(int32); !ok || version != 2 {
 					t.Errorf("expected resourceVersion 2 but got: %v (type: %T)", ext[cetypes.ExtensionResourceVersion], ext[cetypes.ExtensionResourceVersion])
 				}
-				// Verify that metadata name was reset to resource ID
-				if meta, ok := ext[cetypes.ExtensionWorkMeta]; ok {
-					t.Logf("Work metadata extension found: %v", meta)
+				// Verify that metadata name was reset to resource ID in payload
+				if meta, ok := ext[cetypes.ExtensionWorkMeta].(map[string]interface{}); ok {
+					if meta["name"] != resourceID {
+						t.Errorf("expected metadata name to be %s but got: %v", resourceID, meta["name"])
+					}
 				}
 			},

304-311: Simplify error prefix matching with strings.HasPrefix.

The current logic for checking if an error message starts with the expected prefix is complex and error-prone.

+import "strings"
+
 // Use Contains check for error messages since some include dynamic IDs
-if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-	if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-		t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-	}
-} else {
-	t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
+if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+	t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
 }

340-407: Missing test case for invalid metadata type.

The resetPayloadMetadataNameWithResID function returns an error when metadata is not a map[string]interface{}, but this error path is not tested.

Add a test case to cover the error scenario:

{
    name: "resource with invalid metadata type",
    resource: &api.Resource{
        Meta: api.Meta{
            ID: resourceID,
        },
        Payload: datatypes.JSONMap{
            cetypes.ExtensionWorkMeta: "invalid-string-type",
            "data": map[string]interface{}{
                "manifests": []interface{}{},
            },
        },
    },
    expectedError: true,
},

Then update the test runner to handle the expectedError field.
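
One hedged way the runner could honor the new field, following the table-driven shape of the surrounding tests (`c` and `err` are the case and result variables assumed from the suggestion above):

```go
// Inside the subtest body, after calling resetPayloadMetadataNameWithResID:
if c.expectedError {
	if err == nil {
		t.Errorf("expected an error for case %q but got nil", c.name)
	}
	return
}
if err != nil {
	t.Errorf("unexpected error for case %q: %v", c.name, err)
}
```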

test/e2e/pkg/sourceclient_test.go (2)

10-10: Consider using standard library or existing dependencies for Contains.

Importing github.com/bxcodec/faker/v3/support/slice solely for slice.Contains introduces an unnecessary dependency. Consider using a simple helper function or k8s.io/apimachinery/pkg/util/sets which is already imported.

-import (
-	"github.com/bxcodec/faker/v3/support/slice"
-)

// Then replace slice.Contains(appliedWorkNames, firstWorkUID) with:
// sets.NewString(appliedWorkNames...).Has(firstWorkUID)

936-946: Fragile JSON string matching for replicas assertion.

Using strings.Contains with an escaped JSON pattern is brittle and may fail if JSON formatting changes. Consider parsing the JSON properly.

 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
-			if err != nil {
-				return err
-			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
-				return nil
+			for _, value := range manifest.StatusFeedbacks.Values {
+				if value.Name == "readyReplicas" && value.Value.Integer != nil {
+					if *value.Value.Integer == int64(replicas) {
+						return nil
+					}
+				}
 			}
 		}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 7f29bb6 and 84921e3.

📒 Files selected for processing (4)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
pkg/client/cloudevents/codec.go (2)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/db/migrations/202311151850_add_resources.go (1)
  • Resource (11-25)
pkg/client/cloudevents/codec_test.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
test/e2e/pkg/sourceclient_test.go (2)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-with-istio
  • GitHub Check: upgrade
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
🔇 Additional comments (7)
test/e2e/pkg/suite_test.go (1)

16-25: LGTM!

The added imports for grpc and klog/v2 are correctly used in the file—grpc for the client connection type and klog for structured logging in test reporting and debug output.

pkg/client/cloudevents/codec.go (1)

35-40: LGTM!

The integration of resetPayloadMetadataNameWithResID ensures the payload metadata name is consistently set to the resource ID before encoding, which aligns with the PR objective of using resource ID as the work metadata name.

pkg/client/cloudevents/codec_test.go (3)

18-29: LGTM!

Constructor test properly validates that the codec is created with the expected sourceID.


41-103: LGTM!

Comprehensive test case for encoding resources without metadata, including proper validation of event source, type, and all CloudEvents extensions.


409-479: LGTM!

The roundtrip test comprehensively validates that encoding and decoding preserves resource identity and status information.

test/e2e/pkg/sourceclient_test.go (2)

680-795: LGTM!

The test comprehensively validates that two independent source work clients can monitor the same workload with separate ManifestWorks, and properly handles the lifecycle including creation, status updates, and deletion without interfering with each other.


995-1048: LGTM!

NewReadonlyWork correctly constructs a read-only ManifestWork with appropriate feedback rules for monitoring deployment status without modifying it.

coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
test/e2e/pkg/sourceclient_test.go (1)

639-639: Typo in test description.

The word "fist" should be "first".

Apply this diff:

-			By("create fist work client", func() {
+			By("create first work client", func() {
🧹 Nitpick comments (1)
test/e2e/pkg/sourceclient_test.go (1)

922-949: Consider using structured JSON parsing instead of string matching.

The current implementation uses strings.Contains(string(feedbackJson), fmt.Sprintf(...)) which could have false positives. For more robust validation, consider unmarshaling the JSON and checking the field value directly.

Example refactor:

 func AssertReplicas(watchedWorks []*workv1.ManifestWork, name string, replicas int32) error {
 	var latestWatchedWork *workv1.ManifestWork
 	for i := len(watchedWorks) - 1; i >= 0; i-- {
 		if watchedWorks[i].Name == name {
 			latestWatchedWork = watchedWorks[i]
 			break
 		}
 	}
 
 	if latestWatchedWork == nil {
 		return fmt.Errorf("the work %s not watched", name)
 	}
 
 	for _, manifest := range latestWatchedWork.Status.ResourceStatus.Manifests {
 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
+			type Feedback struct {
+				ReadyReplicas *int32 `json:"readyReplicas"`
+			}
+			var feedbacks []Feedback
+			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
 			if err != nil {
 				return err
 			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
+			if err := json.Unmarshal(feedbackJson, &feedbacks); err != nil {
+				return err
+			}
+			for _, fb := range feedbacks {
+				if fb.ReadyReplicas != nil && *fb.ReadyReplicas == replicas {
+					return nil
+				}
+			}
-				return nil
-			}
 		}
 	}
 
 	return fmt.Errorf("the expected replicas %d is not found from feedback", replicas)
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 84921e3 and 7f860cf.

📒 Files selected for processing (4)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/pkg/suite_test.go
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/client/cloudevents/codec_test.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
test/e2e/pkg/sourceclient_test.go (1)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e
  • GitHub Check: upgrade
🔇 Additional comments (5)
pkg/client/cloudevents/codec.go (2)

35-39: LGTM! Pre-conversion step correctly ensures payload metadata name.

The pre-conversion step properly calls resetPayloadMetadataNameWithResID and propagates any errors before encoding the resource. This ensures the payload metadata name is unique on the agent side.


120-135: LGTM! Helper function correctly resets payload metadata name.

The helper function correctly:

  • Handles the no-op case when metadata is absent
  • Validates the metadata type and returns an appropriate error
  • Sets the metadata name to the resource ID
  • Mutates the resource in place
test/e2e/pkg/sourceclient_test.go (2)

590-797: LGTM! Comprehensive test for two independent source work clients.

The new test properly validates that two source work clients can independently monitor a single workload, including:

  • Creating two independent ManifestWorks with distinct UIDs
  • Verifying both watchers receive updates
  • Confirming deletion of one work doesn't affect the other
  • Ensuring the underlying deployment UID remains stable

995-1048: LGTM! Well-structured helper for creating read-only ManifestWork.

The NewReadonlyWork function properly constructs a ManifestWork with:

  • Appropriate labels for test identification
  • Read-only update strategy
  • Feedback rules to capture deployment status
pkg/client/cloudevents/codec_test.go (1)

1-482: LGTM! Comprehensive test coverage for codec functionality.

The test suite provides excellent coverage:

  • TestNewCodec validates the constructor
  • TestEventDataType verifies the data type configuration
  • TestEncode covers encoding with and without metadata, with deletion timestamps, and validates all extensions
  • TestDecode covers valid and invalid decoding scenarios including missing extensions and mismatched source IDs
  • TestResetPayloadMetadataNameWithResID validates the helper function behavior for both with and without metadata cases
  • TestEncodeDecodeRoundtrip ensures the full encode/decode cycle maintains data integrity

The error handling in TestResetPayloadMetadataNameWithResID (lines 401-404) correctly checks for unexpected errors.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/upgrade/pkg/upgrade_test.go (1)

46-57: Consider separating resource creation from readiness polling.

The pattern of wrapping both Get and Create operations inside Eventually blocks means that non-transient creation errors (e.g., validation failures, permission denials) will be retried for the full 5-minute timeout. This can mask real issues and delay failure feedback.

Recommended approach:

  1. Perform the conditional Get/Create once outside Eventually
  2. Use Eventually only to poll for resource readiness/availability after creation

This separates transient readiness delays from permanent creation failures, providing faster feedback when something is misconfigured.

Example for the first block:

-		By("create a deployment for readonly work", func() {
-			Eventually(func() error {
-				_, err := deployClient.Get(ctx, deployReadonlyName, metav1.GetOptions{})
-				if errors.IsNotFound(err) {
-					deploy := utils.NewDeployment(namespace, deployReadonlyName, 0)
-					_, newErr := deployClient.Create(ctx, deploy, metav1.CreateOptions{})
-					return newErr
-				}
-				if err != nil {
-					return err
-				}
-				return nil
-			}).WithTimeout(timeout).WithPolling(polling).ShouldNot(HaveOccurred())
+		By("create a deployment for readonly work", func() {
+			_, err := deployClient.Get(ctx, deployReadonlyName, metav1.GetOptions{})
+			if errors.IsNotFound(err) {
+				deploy := utils.NewDeployment(namespace, deployReadonlyName, 0)
+				_, err = deployClient.Create(ctx, deploy, metav1.CreateOptions{})
+				Expect(err).NotTo(HaveOccurred())
+			} else {
+				Expect(err).NotTo(HaveOccurred())
+			}
+			// If needed, add Eventually block here to wait for deployment readiness

Also applies to: 64-80, 86-102, 108-124

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 76186ff and e7536d7.

📒 Files selected for processing (2)
  • test/upgrade/pkg/upgrade_test.go (12 hunks)
  • test/upgrade/script/run.sh (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e
  • GitHub Check: upgrade
  • GitHub Check: e2e-grpc-broker
🔇 Additional comments (3)
test/upgrade/script/run.sh (1)

17-17: LGTM: Timeout increase aligns with test robustness improvements.

The 6x timeout increase (5m → 30m) is appropriate given the new polling mechanisms in the Go tests, which introduce multiple 5-minute Eventually blocks that run sequentially during setup.

test/upgrade/pkg/upgrade_test.go (2)

31-34: LGTM: Standardized timeout and polling constants.

The 5-minute timeout and 10-second polling interval provide consistent, reasonable values for eventual consistency checks across all test scenarios.


145-145: LGTM: Explicit timeout and polling improve test reliability.

Consistently applying explicit .WithTimeout(timeout).WithPolling(polling) to all Eventually assertions makes test behavior predictable and reduces flakiness.

Also applies to: 175-175, 194-194, 204-204, 218-218, 239-239, 250-250, 261-261, 290-290, 304-304
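
A minimal, self-contained illustration of that pattern with Gomega; the constant values mirror the review text, while `getDeployment` is a hypothetical stand-in for the real client call:

```go
package upgrade_test

import (
	"context"
	"testing"
	"time"

	. "github.com/onsi/gomega"
)

const (
	timeout = 5 * time.Minute
	polling = 10 * time.Second
)

// getDeployment stands in for deployClient.Get(...); it always succeeds here.
func getDeployment(ctx context.Context, name string) error { return nil }

// TestEventuallyPattern applies explicit timeout/polling settings to an
// Eventually assertion, making the retry behavior predictable.
func TestEventuallyPattern(t *testing.T) {
	g := NewWithT(t)
	ctx := context.Background()
	g.Eventually(func() error {
		return getDeployment(ctx, "maestro")
	}).WithTimeout(timeout).WithPolling(polling).Should(Succeed())
}
```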

coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between e7536d7 and 079335a.

📒 Files selected for processing (2)
  • test/e2e/istio/test.sh (1 hunks)
  • test/upgrade/script/run.sh (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e
  • GitHub Check: upgrade
🔇 Additional comments (2)
test/upgrade/script/run.sh (1)

76-76: Good use of explicit job resource for log retrieval.

The change from label-based log selection to explicit job resource (jobs/maestro-upgrade-tests) is more targeted and aligns with the PR objective. Both success and failure paths now retrieve logs consistently.

Also applies to: 82-82

test/e2e/istio/test.sh (1)

84-84: Good use of explicit job resource for log retrieval.

The success path correctly retrieves logs from the specific job resource with the kubeconfig flag, aligning with the PR objective.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
cmd/maestro/server/grpc_broker.go (2)

25-33: Importing the shared codec and introducing a source constant improves consistency

Using github.com/openshift-online/maestro/pkg/client/cloudevents plus a single const source = "maestro" keeps encode/decode aligned and makes the CloudEvents source explicit in one place in this file. If "maestro" is (or becomes) used in multiple packages, consider lifting it to a shared constant to avoid accidental drift, but it’s fine as-is here.


308-318: encodeResourceSpec refactor aligns spec encoding with shared Codec behavior

Switching encodeResourceSpec to codec.Encode(source, eventType, resource) keeps spec encoding and status decoding in sync through the same abstraction and reduces duplication of CloudEvent construction details. If the codec is stateless, you could optionally reuse a single instance instead of creating it per call, but that’s a micro-optimization and not required.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 2439a89 and 40ecafb.

📒 Files selected for processing (1)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cmd/maestro/server/grpc_broker.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/api/resource_types.go (1)
  • Resource (14-27)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e
🔇 Additional comments (1)
cmd/maestro/server/grpc_broker.go (1)

302-306: decodeResourceStatus delegation to Codec is a clean simplification

Having decodeResourceStatus do nothing but construct a codec and call Decode removes local parsing details and ensures status decoding logic lives in one place (the codec). This should make it easier to evolve the CloudEvent schema without touching the broker.

skeeey force-pushed the res_id branch 2 times, most recently from 4cdf287 to a11abc6 (December 12, 2025 04:14)
coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/e2e/pkg/sourceclient_test.go (1)

800-832: Data race: WatchedResult.WatchedWorks is mutated in a goroutine and read concurrently.
This will flake and should fail under -race. Protect the slice (mutex) or serialize access (channel).

 type WatchedResult struct {
-    WatchedWorks []*workv1.ManifestWork
+    mu          sync.RWMutex
+    WatchedWorks []*workv1.ManifestWork
 }

 func StartWatch(ctx context.Context, watcher watch.Interface) *WatchedResult {
     result := &WatchedResult{WatchedWorks: []*workv1.ManifestWork{}}
     go func() {
         ch := watcher.ResultChan()
@@
                 switch event.Type {
                 case watch.Modified:
                     if work, ok := event.Object.(*workv1.ManifestWork); ok {
-                        result.WatchedWorks = append(result.WatchedWorks, work)
+                        result.mu.Lock()
+                        result.WatchedWorks = append(result.WatchedWorks, work)
+                        result.mu.Unlock()
                     }
                 case watch.Deleted:
                     if work, ok := event.Object.(*workv1.ManifestWork); ok {
-                        result.WatchedWorks = append(result.WatchedWorks, work)
+                        result.mu.Lock()
+                        result.WatchedWorks = append(result.WatchedWorks, work)
+                        result.mu.Unlock()
                     }
                 }
             }
         }
     }()
@@
 }

Then, in readers (e.g., AssertReplicas, AssertWatchResult, Consistently blocks), take an RLock()/RUnlock() or copy a snapshot under lock.
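
On the reader side, a small snapshot helper keeps the locking in one place; this is a sketch reusing the field names from the diff above:

```go
// Snapshot copies the watched works under the read lock so callers can
// iterate without racing the watch goroutine.
func (r *WatchedResult) Snapshot() []*workv1.ManifestWork {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return append([]*workv1.ManifestWork(nil), r.WatchedWorks...)
}
```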

♻️ Duplicate comments (1)
test/e2e/pkg/sourceclient_test.go (1)

639-652: “fist” typo is back (and the message loses useful context).
The log should read “first”, and it’s helpful to include which sourceID is being used to debug auth/routing issues. This was flagged before—looks like a regression.

-            By("create fist work client", func() {
+            By(fmt.Sprintf("create first work client (source=%s)", firstSourceID), func() {
                 var err error
                 firstSourceWorkClient, err = grpcsource.NewMaestroGRPCSourceWorkClient(
                     ctx,
                     logger,
                     apiClient,
                     grpcOptions,
                     firstSourceID,
                 )
🧹 Nitpick comments (6)
pkg/client/cloudevents/codec.go (1)

120-135: Note: Input mutation occurs.

This helper mutates res.Payload in place. Since Encode is a public method and callers may not expect their *api.Resource to be modified, consider documenting this side effect in the Encode method's comment, or working on a copy if preservation of the original is important.

cmd/maestro/server/grpc_broker.go (1)

304-317: Consider reusing the codec instance.

Both decodeResourceStatus and encodeResourceSpec create a new Codec instance on each call. Since the codec is stateless (only stores the sourceID), consider creating a package-level codec instance or caching it to avoid repeated allocations.

+var codecInstance = cloudevents.NewCodec(source)
+
 // decodeResourceStatus translates a CloudEvent into a resource containing the status JSON map.
 func decodeResourceStatus(evt *ce.Event) (*api.Resource, error) {
-	codec := cloudevents.NewCodec(source)
-	return codec.Decode(evt)
+	return codecInstance.Decode(evt)
 }

 // encodeResourceSpec translates a resource spec JSON map into a CloudEvent.
 func encodeResourceSpec(resource *api.Resource) (*ce.Event, error) {
 	eventType := types.CloudEventsType{
 		CloudEventsDataType: workpayload.ManifestBundleEventDataType,
 		SubResource:         types.SubResourceSpec,
 		Action:              types.EventAction("create_request"),
 	}

-	codec := cloudevents.NewCodec(source)
-	return codec.Encode(source, eventType, resource)
+	return codecInstance.Encode(source, eventType, resource)
 }
pkg/client/cloudevents/codec_test.go (1)

304-311: Simplify error message validation.

The current error prefix checking logic using string slicing is complex and could panic if the error message is shorter than expected. Use strings.HasPrefix or strings.Contains instead.

+import "strings"
+
 // Use Contains check for error messages since some include dynamic IDs
-if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-	if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-		t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-	}
-} else {
-	t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
-}
+if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+	t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
+}
test/setup/env_setup.sh (1)

72-72: Clarify why build step is disabled.

The build step is commented out without explanation. Consider adding a comment explaining why (e.g., "Images are pre-built in CI" or "Uncomment to build images locally").

test/e2e/pkg/sourceclient_test.go (2)

10-10: Avoid github.com/bxcodec/faker/v3/support/slice in prod tests; prefer stdlib slices.Contains.
This extra dep is surprising here; if Go version allows, slices.Contains(appliedWorkNames, firstWorkUID) is clearer and avoids pulling faker utils into e2e tests.


922-949: AssertReplicas is brittle (substring match on marshaled JSON).
This can break with JSON ordering/escaping/schema changes and can produce false positives. Prefer decoding manifest.StatusFeedbacks into a structured form and checking the specific field.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 40ecafb and 4cdf287.

📒 Files selected for processing (7)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
  • test/setup/env_setup.sh (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • test/e2e/pkg/suite_test.go
🧰 Additional context used
🧬 Code graph analysis (2)
cmd/maestro/server/grpc_broker.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/client/cloudevents/codec.go (2)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/db/migrations/202311151850_add_resources.go (1)
  • Resource (11-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: upgrade
🔇 Additional comments (3)
pkg/client/cloudevents/codec.go (1)

35-39: LGTM - Ensures unique work metadata name.

The pre-encoding step to reset payload metadata name to resource ID ensures uniqueness on the agent side. Error handling is appropriate.

cmd/maestro/server/grpc_broker.go (1)

32-33: LGTM - Source constant for codec.

Defining the source as a constant improves maintainability and consistency across encode/decode operations.

pkg/client/cloudevents/codec_test.go (1)

412-482: LGTM - Good roundtrip test coverage.

The roundtrip test effectively validates the encode/decode flow by simulating a realistic scenario where a resource is encoded, transformed to a status update, and decoded back.

coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
docs/maestro.md (1)

38-42: Complete steps 3–5 and fix the grammar in steps 1–2 (currently incomplete/unclear).
Steps 3–5 are placeholders (“The Maestro server…”, “The Maestro agent…”) and don’t describe actions/outcomes; steps 1–2 also need minor grammar fixes (“uses”, “to create”, “publishes”).

Proposed patch:

-1. The Consumer (e.g. ClustersService) use Maestro GRPCSourceClient create a ManifestWork
-2. The GRPCSourceClient publish the ManifestWork using CloudEvents
-3. The Maestro server 
-4. The Maestro server 
-5. The Maestro agent
+1. The Consumer (e.g., ClustersService) uses Maestro GRPCSourceClient to create a ManifestWork.
+2. The GRPCSourceClient publishes the ManifestWork as a CloudEvent over MQTT.
+3. The Maestro server receives the CloudEvent, validates it, and persists/enqueues the ManifestWork for processing.
+4. The Maestro server schedules the work to the appropriate agent/cluster, tracks progress, and updates ManifestWork status.
+5. The Maestro agent receives assigned ManifestWork, applies the manifests to the target cluster, and reports status back to the Maestro server.
pkg/client/cloudevents/codec_test.go (1)

105-141: Missing assertion: “encode resource with metadata” doesn’t verify the name reset.
Right now the test only logs the WorkMeta extension, so it won’t fail if the metadata name is not rewritten to resourceID.

 		{
 			name:      "encode resource with metadata",
 			source:    "test-source",
 			eventType: cetypes.CloudEventsType{CloudEventsDataType: workpayload.ManifestBundleEventDataType, SubResource: cetypes.SubResourceSpec, Action: "create"},
 			resource: &api.Resource{
@@
 			},
 			validateEvent: func(t *testing.T, evt *cloudevents.Event) {
 				ext := evt.Extensions()
 				// Check resource version - CloudEvents stores numeric extensions as int32
 				if version, ok := ext[cetypes.ExtensionResourceVersion].(int32); !ok || version != 2 {
 					t.Errorf("expected resourceVersion 2 but got: %v (type: %T)", ext[cetypes.ExtensionResourceVersion], ext[cetypes.ExtensionResourceVersion])
 				}
-				// Verify that metadata name was reset to resource ID
-				if meta, ok := ext[cetypes.ExtensionWorkMeta]; ok {
-					t.Logf("Work metadata extension found: %v", meta)
-				}
+				// Verify that metadata name was reset to resource ID
+				metaAny, ok := ext[cetypes.ExtensionWorkMeta]
+				if !ok {
+					t.Fatalf("expected %q extension to be set", cetypes.ExtensionWorkMeta)
+				}
+				meta, ok := metaAny.(map[string]interface{})
+				if !ok {
+					t.Fatalf("expected %q to be map[string]interface{} but got %T", cetypes.ExtensionWorkMeta, metaAny)
+				}
+				// payload shape here uses: { ..., "metadata": {"name": ...} }
+				mdAny, ok := meta["metadata"]
+				if !ok {
+					t.Fatalf("expected %q to contain \"metadata\" but got: %v", cetypes.ExtensionWorkMeta, meta)
+				}
+				md, ok := mdAny.(map[string]interface{})
+				if !ok {
+					t.Fatalf("expected \"metadata\" to be map[string]interface{} but got %T", mdAny)
+				}
+				if md["name"] != resourceID {
+					t.Errorf("expected metadata.name %q but got: %v", resourceID, md["name"])
+				}
 			},
 		},
🧹 Nitpick comments (1)
pkg/client/cloudevents/codec_test.go (1)

294-312: Brittle error assertion in TestDecode (comment says “Contains”, code does prefix slicing).
This is hard to read and easy to get wrong—prefer strings.HasPrefix (or strings.Contains) directly.

+	"strings"
@@
-				// Use Contains check for error messages since some include dynamic IDs
-				if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-					if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-						t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-					}
-				} else {
-					t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
-				}
+				if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
+				}
 				return
 			}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 4cdf287 and fd380f0.

📒 Files selected for processing (6)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/client/cloudevents/codec.go
  • test/e2e/pkg/sourceclient_test.go
  • cmd/maestro/server/grpc_broker.go
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/client/cloudevents/codec_test.go (3)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/errors/errors.go (1)
  • New (102-116)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
  • GitHub Check: e2e-with-istio
🔇 Additional comments (2)
test/e2e/pkg/suite_test.go (1)

16-18: Import changes look consistent with usage (grpc + klog).
grpc.ClientConn and klog are both referenced in this file, so the import adjustments look fine.

Also applies to: 25-25

pkg/client/cloudevents/codec_test.go (1)

340-410: Good: TestResetPayloadMetadataNameWithResID now fails fast on unexpected errors.
Catching the returned error makes the test more reliable.

coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
pkg/client/cloudevents/codec_test.go (1)

106-143: Fix metadata-name reset assertion to avoid false-green subtests
Your check only runs if ext[cetypes.ExtensionWorkMeta] is a string; if it’s a map[string]any (or something else), the assertion is silently skipped (Lines 137-142). This can let regressions pass. Also, strings.Contains is a weak signal vs asserting the actual metadata.name.

Since this is the same area as the earlier “verify metadata name reset” feedback, I’m marking it duplicate, but the current implementation still has a correctness gap.

 			validateEvent: func(t *testing.T, evt *cloudevents.Event) {
 				ext := evt.Extensions()
 				// Check resource version - CloudEvents stores numeric extensions as int32
 				if version, ok := ext[cetypes.ExtensionResourceVersion].(int32); !ok || version != 2 {
 					t.Errorf("expected resourceVersion 2 but got: %v (type: %T)", ext[cetypes.ExtensionResourceVersion], ext[cetypes.ExtensionResourceVersion])
 				}
-				// Verify that metadata name was reset to resource ID
-				if meta, ok := ext[cetypes.ExtensionWorkMeta].(string); ok {
-					if !strings.Contains(meta, resourceID) {
-						t.Errorf("expected metadata name %s, but got: %v", resourceID, meta)
-					}
-				}
+				// Verify that metadata name was reset to resource ID (fail if the shape is unexpected)
+				metaAny, ok := ext[cetypes.ExtensionWorkMeta]
+				if !ok {
+					t.Fatalf("expected %q extension to be set", cetypes.ExtensionWorkMeta)
+				}
+				metaMap, ok := metaAny.(map[string]interface{})
+				if !ok {
+					t.Fatalf("expected %q extension to be map[string]interface{} but got %T", cetypes.ExtensionWorkMeta, metaAny)
+				}
+				mdAny, ok := metaMap["metadata"]
+				if !ok {
+					t.Fatalf("expected %q.metadata to be set", cetypes.ExtensionWorkMeta)
+				}
+				mdMap, ok := mdAny.(map[string]interface{})
+				if !ok {
+					t.Fatalf("expected %q.metadata to be map[string]interface{} but got %T", cetypes.ExtensionWorkMeta, mdAny)
+				}
+				if mdMap["name"] != resourceID {
+					t.Errorf("expected metadata name %q but got: %v", resourceID, mdMap["name"])
+				}
 			},
🧹 Nitpick comments (2)
pkg/client/cloudevents/codec_test.go (2)

301-314: Replace manual prefix slicing with strings.HasPrefix (and align the comment)
The comment says “Contains check” but the code does a manual prefix compare via slicing (Lines 306-313), which is harder to read/maintain.

 			if c.expectedErrorMsg != "" {
 				if err == nil {
 					t.Errorf("expected error but got nil")
 					return
 				}
-				// Use Contains check for error messages since some include dynamic IDs
-				if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-					if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-						t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-					}
-				} else {
-					t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
-				}
+				// Use prefix match for error messages that may include dynamic suffixes
+				if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
+				}
 				return
 			}

(You already import strings in this file.)


232-239: Don’t ignore evt.SetData(...) errors in tests
Even if it’s unlikely, ignoring these errors makes debugging much harder when something changes upstream.

-				_ = evt.SetData(cloudevents.ApplicationJSON, map[string]interface{}{
+				if err := evt.SetData(cloudevents.ApplicationJSON, map[string]interface{}{
 					"conditions": []interface{}{
 						map[string]interface{}{
 							"type":   "Applied",
 							"status": "True",
 						},
 					},
-				})
+				}); err != nil {
+					t.Fatalf("failed to set event data: %v", err)
+				}

Apply similarly at the other SetData call sites (Lines 289-291, 457-464).

Also applies to: 289-291, 457-464

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between fd380f0 and 9384e6d.

📒 Files selected for processing (1)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: upgrade
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request

skeeey force-pushed the res_id branch 2 times, most recently from 86764b3 to dfdfec0 (December 19, 2025 06:38)
coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
test/setup/env_setup.sh (1)

111-119: Image loading inconsistent with container tool logic and duplicates existing loads.

Lines 111-112 duplicate the maestro image loading already performed in lines 99-100 (Docker) or 103-108 (Podman), but load from a different registry (quay.io/redhat-user-workloads) without version tags. Additionally, all these kind load docker-image commands are placed outside the container tool conditional block (lines 98-109), which means they will fail or behave unexpectedly if the user is using Podman.

🔎 Proposed fix

Remove the duplicate maestro image loads (lines 111-112) and move the additional image loads (lines 113-117) inside the appropriate conditional blocks:

 if [ "$container_tool" = "docker" ]; then
     kind load docker-image ${external_image_registry}/maestro/maestro:$image_tag --name maestro
     kind load docker-image ${external_image_registry}/maestro/maestro-e2e:$image_tag --name maestro
+    kind load docker-image quay.io/nginx/nginx-unprivileged --name=maestro
+    kind load docker-image docker.io/istio/proxyv2:1.26.0 --name=maestro
+    kind load docker-image docker.io/istio/pilot:1.26.0 --name=maestro
+    kind load docker-image quay.io/maestro/postgres:17.2 --name=maestro
+    kind load docker-image quay.io/maestro/eclipse-mosquitto:2.0.18 --name=maestro
 else
     # related issue: https://github.com/kubernetes-sigs/kind/issues/2038
     podman save ${external_image_registry}/maestro/maestro:$image_tag -o /tmp/maestro.tar
     kind load image-archive /tmp/maestro.tar --name maestro
     rm /tmp/maestro.tar
     podman save ${external_image_registry}/maestro/maestro-e2e:$image_tag -o /tmp/maestro-e2e.tar
     kind load image-archive /tmp/maestro-e2e.tar --name maestro
     rm /tmp/maestro-e2e.tar
+    # Load additional images for Podman
+    for img in "quay.io/nginx/nginx-unprivileged" "docker.io/istio/proxyv2:1.26.0" "docker.io/istio/pilot:1.26.0" "quay.io/maestro/postgres:17.2" "quay.io/maestro/eclipse-mosquitto:2.0.18"; do
+        podman save "$img" -o "/tmp/$(echo $img | tr '/:' '_').tar"
+        kind load image-archive "/tmp/$(echo $img | tr '/:' '_').tar" --name maestro
+        rm "/tmp/$(echo $img | tr '/:' '_').tar"
+    done
 fi
-
-kind load docker-image quay.io/redhat-user-workloads/maestro-rhtap-tenant/maestro/maestro-e2e --name=maestro
-kind load docker-image quay.io/redhat-user-workloads/maestro-rhtap-tenant/maestro/maestro --name=maestro
-kind load docker-image quay.io/nginx/nginx-unprivileged --name=maestro
-kind load docker-image docker.io/istio/proxyv2:1.26.0 --name=maestro
-kind load docker-image docker.io/istio/pilot:1.26.0 --name=maestro
-kind load docker-image quay.io/maestro/postgres:17.2 --name=maestro
-kind load docker-image quay.io/maestro/eclipse-mosquitto:2.0.18 --name=maestro
test/e2e/pkg/sourceclient_test.go (2)

639-639: Typo: "fist" should be "first".

-			By("create fist work client", func() {
+			By("create first work client", func() {

670-678: Improve cleanup to handle both works and tolerate failures.

The AfterEach only deletes the work from secondSourceWorkClient. If the test fails before the first work is deleted (line 760), it will leak. Additionally, the cleanup should tolerate NotFound errors to handle partial test failures gracefully.

🔎 Proposed fix
 		AfterEach(func() {
+			// Try to cleanup both works, ignoring NotFound errors
+			err := firstSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
+			if err != nil && !errors.IsNotFound(err) {
+				// Log but don't fail cleanup
+				GinkgoWriter.Printf("Warning: failed to cleanup first work: %v\n", err)
+			}
+
-			err := secondSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
+			err = secondSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
-			Expect(err).ShouldNot(HaveOccurred())
+			if err != nil && !errors.IsNotFound(err) {
+				Expect(err).ShouldNot(HaveOccurred())
+			}
 
 			err = agentTestOpts.kubeClientSet.AppsV1().Deployments("default").Delete(ctx, deployName, metav1.DeleteOptions{})
 			Expect(err).To(Succeed())
 
 			watchCancel()
 		})
🧹 Nitpick comments (2)
pkg/client/cloudevents/codec_test.go (1)

307-313: Consider simplifying error message validation.

The current prefix-matching logic with length checks and slicing is harder to follow than necessary.

🔎 Simpler alternative
-			// Use Contains check for error messages since some include dynamic IDs
-			if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-				if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-				}
-			} else {
-				t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
-			}
+			// Use prefix check for error messages since some include dynamic IDs
+			if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+				t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
+			}
test/e2e/pkg/sourceclient_test.go (1)

922-949: Consider more robust JSON parsing for status feedback.

The function works correctly but uses string matching on serialized JSON (line 942). While functional, this could be more robust by unmarshaling the feedback into a typed structure.

🔎 Optional improvement
 func AssertReplicas(watchedWorks []*workv1.ManifestWork, name string, replicas int32) error {
 	var latestWatchedWork *workv1.ManifestWork
 	for i := len(watchedWorks) - 1; i >= 0; i-- {
 		if watchedWorks[i].Name == name {
 			latestWatchedWork = watchedWorks[i]
 			break
 		}
 	}
 
 	if latestWatchedWork == nil {
 		return fmt.Errorf("the work %s not watched", name)
 	}
 
 	for _, manifest := range latestWatchedWork.Status.ResourceStatus.Manifests {
 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
+			// Parse feedback values more robustly
+			for _, feedback := range manifest.StatusFeedbacks.Values {
+				if feedback.Name == "readyReplicas" {
+					var readyReplicas int32
+					if err := json.Unmarshal([]byte(*feedback.Value.JsonRaw), &readyReplicas); err != nil {
+						return fmt.Errorf("failed to parse readyReplicas: %w", err)
+					}
+					if readyReplicas == replicas {
+						return nil
+					}
+				}
+			}
-			if err != nil {
-				return err
-			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
-				return nil
-			}
 		}
 	}
 
 	return fmt.Errorf("the expected replicas %d is not found from feedback", replicas)
 }

Note: This assumes the feedback structure has a Values field with Name and Value.JsonRaw. Adjust based on the actual API structure.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 9384e6d and 86764b3.

📒 Files selected for processing (8)
  • Makefile (1 hunks)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
  • test/setup/env_setup.sh (1 hunks)
  • test/upgrade/test.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/client/cloudevents/codec.go
  • test/e2e/pkg/suite_test.go
  • cmd/maestro/server/grpc_broker.go
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/pkg/sourceclient_test.go (1)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
pkg/client/cloudevents/codec_test.go (3)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/errors/errors.go (1)
  • New (102-116)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e
  • GitHub Check: upgrade
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-broadcast-subscription
🔇 Additional comments (12)
pkg/client/cloudevents/codec_test.go (7)

19-30: LGTM!

The constructor test properly validates codec creation and sourceID assignment.


32-40: LGTM!

The data type validation is correct.


82-103: LGTM!

The validation thoroughly checks all CloudEvents extensions including source, type, resourceID, version, clusterName, and deletion timestamp handling.


164-176: LGTM!

The deletion timestamp validation correctly checks that the extension is set and that event ID and time are populated.


220-295: LGTM!

The decode test cases comprehensively cover valid scenarios and error conditions including invalid type, unsupported data type, missing extensions, and source ID mismatches.
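For orientation, the cases follow Go's standard table-driven shape; here is a minimal sketch of that pattern (the case entries are elided, and the Decode signature and error-prefix check are assumptions based on this review, not a copy of the file):

package cloudevents

import (
	"strings"
	"testing"

	ce "github.com/cloudevents/sdk-go/v2"
)

// TestDecodeErrorsSketch shows the table-driven shape: each case pairs an
// input event with an expected error prefix, and a failure reports both.
func TestDecodeErrorsSketch(t *testing.T) {
	cases := []struct {
		name             string
		evt              *ce.Event
		expectedErrorMsg string
	}{
		// one entry per scenario: invalid event type, unsupported data
		// type, missing extensions, source ID mismatch, ...
	}
	for _, c := range cases {
		t.Run(c.name, func(t *testing.T) {
			_, err := NewCodec("source1").Decode(c.evt)
			if err == nil || !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
				t.Errorf("expected error starting with %q, got %v", c.expectedErrorMsg, err)
			}
		})
	}
}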


343-417: LGTM!

The metadata reset tests properly validate both scenarios (with and without metadata) and include correct error handling and type assertions as addressed in previous reviews.


419-489: LGTM!

The roundtrip test properly validates that encoding and decoding preserve key resource fields (ID, Version, ConsumerName) and that status data is populated correctly.

test/upgrade/test.sh (1)

23-42: Clarify the intent of commented-out upgrade test steps.

Multiple test steps are commented out (initial deployment, workserver run, e2e tests, agent deployment), which makes the upgrade test workflow incomplete. The PR is marked with /hold which might be related.

Could you clarify:

  1. Is this intentional for phased rollout, or should these steps be re-enabled?
  2. Are there specific blockers preventing these steps from running?
  3. Should these commented lines be removed if they're permanently disabled, or documented with TODO/FIXME if temporarily disabled?

This will help ensure the upgrade test workflow is clear for future maintainers.

Makefile (1)

504-506: LGTM - Clean decoupling of upgrade-test.

Making upgrade-test a standalone target is a reasonable change that allows it to be invoked independently. Preserving the old dependencies as a comment is good practice for maintaining context.

test/e2e/pkg/sourceclient_test.go (3)

5-31: LGTM - Import additions support new test functionality.

The new imports for JSON handling, slice utilities, Deployment types, runtime extensions, pointer helpers, and work client interface are all appropriately used in the new test code.


680-796: LGTM - Well-structured test with comprehensive verification.

The test properly validates that two source clients with different source IDs can independently manage the same workload:

  • Creates two separate works (verified by different UIDs)
  • Confirms two independent AppliedManifestWorks on the agent
  • Verifies status feedback propagates to both watchers
  • Validates deletion of one work doesn't affect the other or the underlying deployment

The use of work.Name throughout is correct given the source client's internal composite ID handling.
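For readers who have not opened the test file, the pattern under test looks roughly like the sketch below. It assumes the two clients were constructed elsewhere with distinct source IDs (for example via NewMaestroGRPCSourceWorkClient); the function and variable names are illustrative, not the test's actual code.

package e2esketch

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	workv1client "open-cluster-management.io/api/client/work/clientset/versioned/typed/work/v1"
	workv1 "open-cluster-management.io/api/work/v1"
)

// createFromBothSources creates the same-named work through two clients that
// were built with different source IDs and reports whether the resulting
// resources are independent (different UIDs), as the test asserts.
func createFromBothSources(ctx context.Context, first, second workv1client.WorkV1Interface,
	consumer string, work *workv1.ManifestWork) (bool, error) {
	a, err := first.ManifestWorks(consumer).Create(ctx, work, metav1.CreateOptions{})
	if err != nil {
		return false, fmt.Errorf("first source create: %w", err)
	}
	b, err := second.ManifestWorks(consumer).Create(ctx, work.DeepCopy(), metav1.CreateOptions{})
	if err != nil {
		return false, fmt.Errorf("second source create: %w", err)
	}
	return a.UID != b.UID, nil
}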


995-1048: LGTM - Well-configured read-only work helper.

The NewReadonlyWork helper is well-implemented:

  • Properly uses runtime.RawExtension with the Object field for the Deployment
  • Configures JSONPath feedback rules to capture the entire resource state
  • Correctly sets UpdateStrategyTypeReadOnly to monitor without modifying the deployment
  • Includes appropriate labels for test identification

This enables the test to verify status feedback from existing resources without interfering with their state.
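A condensed sketch of that helper's shape follows; the JSONPath, rule name, and other field values are assumptions for illustration rather than the helper's exact code.

package e2esketch

import (
	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	workv1 "open-cluster-management.io/api/work/v1"
)

// readonlyWorkSketch builds a ManifestWork that only observes an existing
// Deployment: the ReadOnly update strategy keeps the agent from changing it,
// while the JSONPath feedback rule streams its status back to the source.
func readonlyWorkSketch(name string, deploy *appsv1.Deployment) *workv1.ManifestWork {
	return &workv1.ManifestWork{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: workv1.ManifestWorkSpec{
			Workload: workv1.ManifestsTemplate{
				Manifests: []workv1.Manifest{
					{RawExtension: runtime.RawExtension{Object: deploy}},
				},
			},
			ManifestConfigs: []workv1.ManifestConfigOption{{
				ResourceIdentifier: workv1.ResourceIdentifier{
					Group:     "apps",
					Resource:  "deployments",
					Namespace: deploy.Namespace,
					Name:      deploy.Name,
				},
				FeedbackRules: []workv1.FeedbackRule{{
					Type:      workv1.JSONPathsType,
					JsonPaths: []workv1.JsonPath{{Name: "status", Path: ".status"}},
				}},
				UpdateStrategy: &workv1.UpdateStrategy{
					Type: workv1.UpdateStrategyTypeReadOnly,
				},
			}},
		},
	}
}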

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
test/e2e/pkg/sourceclient_test.go (2)

639-639: Fix typo: "fist" should be "first".

The comment has a typo that was previously flagged but not corrected.

🔎 Proposed fix
-			By("create fist work client", func() {
+			By("create first work client", func() {

670-678: Improve cleanup robustness.

The AfterEach only deletes via secondSourceWorkClient, but the test creates two works (one per client). If the test fails before line 760 (where the first work is deleted), the first work won't be cleaned up.

Additionally, watchCancel() is called after the delete operations. If deletes hang or are slow, the watch continues consuming resources.

🔎 Proposed improvements
 		AfterEach(func() {
+			// Cancel watches first to stop consuming resources
+			watchCancel()
+
+			// Clean up both works, ignoring NotFound errors
+			_ = firstSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
 			err := secondSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
 			Expect(err).ShouldNot(HaveOccurred())
 
 			err = agentTestOpts.kubeClientSet.AppsV1().Deployments("default").Delete(ctx, deployName, metav1.DeleteOptions{})
 			Expect(err).To(Succeed())
-
-			watchCancel()
 		})
🧹 Nitpick comments (1)
test/e2e/pkg/sourceclient_test.go (1)

922-949: Refactor to use proper JSON unmarshaling instead of string matching.

The function uses string matching on marshaled JSON (line 942) to find readyReplicas, which is fragile and can break with formatting changes or different JSON encoders.

🔎 Proposed refactor
 func AssertReplicas(watchedWorks []*workv1.ManifestWork, name string, replicas int32) error {
 	var latestWatchedWork *workv1.ManifestWork
 	for i := len(watchedWorks) - 1; i >= 0; i-- {
 		if watchedWorks[i].Name == name {
 			latestWatchedWork = watchedWorks[i]
 			break
 		}
 	}
 
 	if latestWatchedWork == nil {
 		return fmt.Errorf("the work %s not watched", name)
 	}
 
 	for _, manifest := range latestWatchedWork.Status.ResourceStatus.Manifests {
 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
-			if err != nil {
-				return err
-			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
-				return nil
-			}
+			for _, feedback := range manifest.StatusFeedbacks.Values {
+				if feedback.Name == "resource" && feedback.Value.JsonRaw != nil {
+					var resource map[string]interface{}
+					if err := json.Unmarshal([]byte(*feedback.Value.JsonRaw), &resource); err != nil {
+						return fmt.Errorf("failed to unmarshal resource feedback: %v", err)
+					}
+					if status, ok := resource["status"].(map[string]interface{}); ok {
+						if readyReplicas, ok := status["readyReplicas"].(float64); ok && int32(readyReplicas) == replicas {
+							return nil
+						}
+					}
+				}
+			}
 		}
 	}
 
 	return fmt.Errorf("the expected replicas %d is not found from feedback", replicas)
 }
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 86764b3 and dfdfec0.

📒 Files selected for processing (5)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/client/cloudevents/codec.go
  • test/e2e/pkg/suite_test.go
  • cmd/maestro/server/grpc_broker.go
🧰 Additional context used
🧬 Code graph analysis (1)
test/e2e/pkg/sourceclient_test.go (2)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
  • GitHub Check: e2e
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-grpc-broker
🔇 Additional comments (3)
pkg/client/cloudevents/codec_test.go (1)

1-489: LGTM! Comprehensive test coverage.

The test file provides thorough coverage of codec operations including creation, encoding, decoding, metadata reset, and roundtrip scenarios. All past review comments have been properly addressed.

test/e2e/pkg/sourceclient_test.go (2)

3-35: LGTM! Appropriate imports for the new test.

The new imports support JSON handling, slice utilities, Kubernetes apps/v1 types, runtime extensions, pointer helpers, and work client interfaces, all of which are used in the new dual-source monitoring test.


995-1048: LGTM! Well-structured read-only work constructor.

The function correctly creates a ManifestWork with read-only update strategy and feedback rules configured to monitor deployment status. This aligns well with the test objective of having two independent sources monitor the same workload without reconciliation conflicts.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
test/e2e/pkg/sourceclient_test.go (2)

639-639: Typo: "fist" should be "first".

The log message still contains "fist" instead of "first".

🔎 Proposed fix
-			By("create fist work client", func() {
+			By("create first work client", func() {

670-678: AfterEach may leave the first work orphaned if the test fails early.

The cleanup only deletes from secondSourceWorkClient. If the test fails before the "delete the first work" step (line 760), the first work will remain. Consider cleaning up both works in AfterEach with errors.IsNotFound checks to handle cases where the work was already deleted.

🔎 Proposed fix
 		AfterEach(func() {
+			// Clean up first work (ignore NotFound if already deleted during test)
+			_ = firstSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
+
 			err := secondSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
-			Expect(err).ShouldNot(HaveOccurred())
+			if err != nil && !errors.IsNotFound(err) {
+				Expect(err).ShouldNot(HaveOccurred())
+			}
 
 			err = agentTestOpts.kubeClientSet.AppsV1().Deployments("default").Delete(ctx, deployName, metav1.DeleteOptions{})
 			Expect(err).To(Succeed())
🧹 Nitpick comments (3)
test/e2e/pkg/sourceclient_test.go (1)

10-10: Consider using stdlib slices package instead of faker's slice utility.

The github.com/bxcodec/faker/v3/support/slice import is used only for slice.Contains. Since Go 1.21, the standard library provides slices.Contains which is more appropriate for this use case and avoids pulling in a test data generation library for a simple utility function.

🔎 Proposed fix
-	"github.com/bxcodec/faker/v3/support/slice"
+	"slices"

And update usage at lines 721 and 725:

-	if !slice.Contains(appliedWorkNames, firstWorkUID) {
+	if !slices.Contains(appliedWorkNames, firstWorkUID) {
pkg/client/cloudevents/codec.go (1)

120-135: Consider documenting the mutation side effect.

The function mutates res.Payload in place, which modifies the caller's resource. While this is likely intentional for the encoding path, it would be helpful to document this behavior in a comment to prevent unexpected side effects if the resource is reused elsewhere.

🔎 Proposed documentation improvement
+// resetPayloadMetadataNameWithResID sets the payload metadata name to the resource ID.
+// This function mutates the resource's Payload in place.
 func resetPayloadMetadataNameWithResID(res *api.Resource) error {
pkg/client/cloudevents/codec_test.go (1)

309-316: Simplify error message prefix checking.

The current error checking logic is more complex than needed. Consider using strings.HasPrefix for cleaner code.

🔎 Proposed simplification
-				// Use Contains check for error messages since some include dynamic IDs
-				if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-					if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-						t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-					}
-				} else {
-					t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
+				if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
 				}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between dfdfec0 and 34204d9.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/client/cloudevents/codec.go (2)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/db/migrations/202311151850_add_resources.go (1)
  • Resource (11-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-with-istio
  • GitHub Check: upgrade
🔇 Additional comments (10)
test/e2e/pkg/suite_test.go (1)

16-25: LGTM - Import reorganization.

The import changes correctly add the grpc and klog/v2 packages with proper grouping. Both imports are used in the existing code (grpcConn variable and logging functions).

test/e2e/pkg/sourceclient_test.go (1)

995-1048: LGTM - Well-structured helper for read-only ManifestWork.

The NewReadonlyWork helper correctly constructs a read-only ManifestWork with proper feedback rules for monitoring deployment status. The test label enables filtering in the test assertions.

cmd/maestro/server/grpc_broker.go (3)

26-33: LGTM - Clean refactoring to codec-based approach.

The addition of the source constant and cloudevents import provides a clean, consistent approach for CloudEvent encoding/decoding. Using a constant ensures the source identifier is consistent across all usages.


294-297: LGTM - Simplified decode using codec.

The refactoring cleanly delegates CloudEvent decoding to the codec, which handles all extension extraction and validation internally.


300-308: LGTM - Encode refactoring with proper event type configuration.

The encode function properly configures the event type and delegates to the codec, which will now ensure the resource ID is used as the payload metadata name via resetPayloadMetadataNameWithResID.

pkg/client/cloudevents/codec.go (1)

35-42: LGTM - Proper integration of metadata name reset.

The call to resetPayloadMetadataNameWithResID is correctly placed before the CloudEvent conversion and handles errors appropriately. The comment clearly explains the purpose of ensuring unique payload metadata names on the agent side.

pkg/client/cloudevents/codec_test.go (4)

19-40: LGTM - Basic codec construction tests.

The tests properly validate codec initialization and event data type configuration.


42-209: LGTM - Comprehensive encode tests.

The test cases cover the key encoding scenarios: without metadata, with metadata (verifying the name reset to resource ID), and with deletion timestamp. The validation functions properly check extensions and event properties.


345-419: LGTM - Good test coverage for metadata reset helper.

The tests properly validate both paths: no-op when metadata is absent, and name reset when metadata exists. Error handling was properly addressed from previous review feedback.


421-491: LGTM - Valuable roundtrip integration test.

The test validates the complete encode-decode flow, simulating a realistic status update scenario. This ensures the codec maintains field integrity across the transformation.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/e2e/pkg/sourceclient_test.go (1)

10-10: Consider using a lighter dependency for slice utilities.

The faker library is primarily designed for generating fake data in tests. Using it solely for slice.Contains (lines 721, 725) introduces a heavyweight dependency for a simple utility function.

🔎 Recommended alternatives

Option 1 (if Go 1.21+): Use the standard slices package:

-	"github.com/bxcodec/faker/v3/support/slice"
+	"slices"

Then replace slice.Contains(appliedWorkNames, firstWorkUID) with slices.Contains(appliedWorkNames, firstWorkUID).

Option 2: Implement a simple helper:

func contains(slice []string, item string) bool {
	for _, s := range slice {
		if s == item {
			return true
		}
	}
	return false
}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 34204d9 and f8ea7fb.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • docs/resources/appliedmanifestwork.json (1 hunks)
  • docs/resources/cloudevents.spec.maestro.json (1 hunks)
  • docs/resources/cloudevents.status.maestro.json (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/maestro.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/client/cloudevents/codec.go
  • test/e2e/pkg/suite_test.go
  • pkg/client/cloudevents/codec_test.go
🧰 Additional context used
🧬 Code graph analysis (2)
cmd/maestro/server/grpc_broker.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/api/resource_types.go (1)
  • Resource (14-27)
test/e2e/pkg/sourceclient_test.go (2)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
🔇 Additional comments (9)
test/e2e/pkg/sourceclient_test.go (3)

922-949: Note: Fragile string matching for replica checks.

A previous review flagged the string-based JSON matching at line 942 as potentially producing false positives (e.g., replicas=1 matching readyReplicas\":10). This concern remains valid.

Consider properly unmarshaling the JSON or using more precise matching patterns to avoid false positives in replica verification.
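One lightweight way to tighten the match while keeping the string-based approach is an anchored regular expression; a sketch, assuming the feedback JSON carries escaped quotes (readyReplicas\":N) as the helper currently observes:

package e2esketch

import (
	"fmt"
	"regexp"
)

// replicasInFeedback reports whether the serialized feedback contains an
// exact readyReplicas value, so searching for 1 no longer matches 10 or 100.
func replicasInFeedback(feedbackJSON string, replicas int32) bool {
	// Two backslashes in the raw string match the single literal backslash
	// that escapes the quote inside the serialized JSON.
	pattern := fmt.Sprintf(`readyReplicas\\":%d([^0-9]|$)`, replicas)
	return regexp.MustCompile(pattern).MatchString(feedbackJSON)
}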


590-797: LGTM: Well-designed multi-source test scenario.

The test correctly validates that two independent source clients with different source IDs can monitor the same workload. The design was clarified in previous reviews—each client's operations are automatically scoped by their respective sourceID, enabling independent lifecycle management.


995-1048: LGTM: Read-only ManifestWork helper is well-structured.

The NewReadonlyWork helper correctly constructs a ManifestWork with read-only update strategy and feedback configuration for deployment monitoring.

docs/resources/cloudevents.status.maestro.json (1)

144-144: LGTM: Documentation updated consistently with PR objective.

The metadata name field now reflects the resource ID (55c61e54-a3f6-563d-9fec-b1fe297bdfdb), aligning with the PR's goal of using resource ID as the work metadata name. This change is consistent with updates in other documentation files.

docs/resources/cloudevents.spec.maestro.json (1)

77-77: LGTM: Documentation consistently updated.

The metadata name field correctly reflects the new resource ID-based naming approach, consistent with the PR objective and other documentation updates.

docs/resources/appliedmanifestwork.json (1)

14-14: LGTM: AppliedManifestWork documentation updated consistently.

Both the metadata name (line 14) and spec.manifestWorkName (line 21) correctly reference the new resource ID (55c61e54-a3f6-563d-9fec-b1fe297bdfdb), demonstrating the consistent naming approach introduced in this PR.

Also applies to: 21-21

cmd/maestro/server/grpc_broker.go (3)

26-26: LGTM: Good addition of codec support.

The new import and source constant enable the codec-based refactoring below. The constant is appropriately scoped at the package level for use within this file.

Also applies to: 32-32


295-296: LGTM: Cleaner decoding with codec.

Refactoring to use codec.Decode(evt) simplifies the status decoding logic and improves maintainability by delegating CloudEvent parsing to the codec layer.


307-308: LGTM: Cleaner encoding with codec.

Refactoring to use codec.Encode(source, eventType, resource) improves maintainability by delegating CloudEvent construction to the codec layer, which now handles metadata name synchronization as per the PR objective.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
test/e2e/pkg/sourceclient_test.go (2)

639-639: Typo: "fist" should be "first".

The typo persists in the current code.

🔎 Proposed fix
-			By("create fist work client", func() {
+			By("create first work client", func() {

935-946: Fragile JSON string matching can cause false positives.

The current check strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) can match unintended values. For example, checking for replicas=1 would also match readyReplicas\":10, readyReplicas\":100, etc.

🔎 Proposed fix using proper JSON unmarshaling
 	for _, manifest := range latestWatchedWork.Status.ResourceStatus.Manifests {
 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
-			if err != nil {
-				return err
-			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
-				return nil
+			for _, value := range manifest.StatusFeedbacks.Values {
+				if value.Name == "readyReplicas" && value.Value.Integer != nil {
+					if *value.Value.Integer == int64(replicas) {
+						return nil
+					}
+				}
 			}
 		}
 	}

Alternatively, if using raw JSON parsing, ensure the match is exact:

-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
+			// Match readyReplicas with exact value (followed by comma, brace, or whitespace)
+			pattern := fmt.Sprintf(`"readyReplicas":%d[,}\s]`, replicas)
+			matched, _ := regexp.MatchString(pattern, string(feedbackJson))
+			if matched {
🧹 Nitpick comments (2)
test/e2e/pkg/sourceclient_test.go (1)

10-10: Unusual dependency: faker's slice package used only for Contains.

The faker library is primarily designed for generating fake test data, not for general utility functions. Using it just for slice.Contains (at lines 721, 725) introduces an unusual dependency.

Consider using slices.Contains from the standard library (slices package in Go 1.21+) or a simple loop:

🔎 Suggested alternatives

Option 1: Use standard library (Go 1.21+)

-	"github.com/bxcodec/faker/v3/support/slice"
+	"slices"

Then at usage sites:

-				if !slice.Contains(appliedWorkNames, firstWorkUID) {
+				if !slices.Contains(appliedWorkNames, firstWorkUID) {

Option 2: Use a simple helper function

func contains(slice []string, item string) bool {
    for _, s := range slice {
        if s == item {
            return true
        }
    }
    return false
}
pkg/client/cloudevents/codec.go (1)

120-135: Implementation is sound with one technically redundant but harmless assignment.

The helper function correctly handles the no-op case, validates types, and sets the metadata name to the resource ID.

The assignment at line 133 is technically redundant: maps in Go are pointers to runtime.hmap structures, so modifying metaObj["name"] already updates the underlying map in res.Payload. The reassignment has no additional effect and can be safely removed.

It may also be worth verifying defensively that res.ID is never empty when this function is called, though an empty ID is unlikely in normal usage since Resources are database entities whose IDs are always populated.
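A standalone illustration of the map-semantics point above (plain Go, not the codec's code):

package main

import "fmt"

func main() {
	payload := map[string]interface{}{
		"metadata": map[string]interface{}{"name": "old-name"},
	}

	// The type assertion yields the same underlying map, so writes through
	// metaObj are immediately visible via payload ...
	metaObj := payload["metadata"].(map[string]interface{})
	metaObj["name"] = "res-id-123"

	// ... which makes storing metaObj back into payload a no-op.
	payload["metadata"] = metaObj // redundant

	fmt.Println(payload["metadata"].(map[string]interface{})["name"]) // res-id-123
}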

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between f8ea7fb and 4ab0153.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • docs/resources/appliedmanifestwork.json (1 hunks)
  • docs/resources/cloudevents.spec.maestro.json (1 hunks)
  • docs/resources/cloudevents.status.maestro.json (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • docs/maestro.md
  • test/e2e/pkg/suite_test.go
  • docs/resources/cloudevents.status.maestro.json
  • cmd/maestro/server/grpc_broker.go
  • docs/resources/appliedmanifestwork.json
🧰 Additional context used
🧬 Code graph analysis (3)
pkg/client/cloudevents/codec_test.go (3)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/errors/errors.go (1)
  • New (102-116)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
test/e2e/pkg/sourceclient_test.go (1)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
pkg/client/cloudevents/codec.go (2)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/db/migrations/202311151850_add_resources.go (1)
  • Resource (11-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: upgrade
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-broadcast-subscription
🔇 Additional comments (5)
docs/resources/cloudevents.spec.maestro.json (1)

77-77: LGTM! Documentation correctly reflects the new behavior.

The metadata.name field now matches the resourceid field (line 79), which correctly demonstrates the PR objective of using the resource ID as the work metadata name. The consistency across metadata.name, metadata.uid, and resourceid is appropriate.

test/e2e/pkg/sourceclient_test.go (2)

680-796: LGTM!

The test logic correctly validates that two independent source clients with different source IDs can operate on resources using the same work name, as each client's operations are internally scoped by their source ID. The verification of AppliedManifestWorks, replica status syncing, and independent deletion behavior is well-structured.


995-1048: LGTM!

The NewReadonlyWork helper is well-structured. It correctly:

  • Uses RawExtension.Object for type-safe manifest definition
  • Configures feedback rules to capture the full resource status via "@" JSONPath
  • Sets UpdateStrategyTypeReadOnly to monitor without modifying the deployment
pkg/client/cloudevents/codec.go (1)

35-39: LGTM! Clean integration of metadata name reset.

The pre-encode step to ensure payload metadata name matches the resource ID is well-implemented with proper error handling. This enforces uniqueness on the agent side as intended.

pkg/client/cloudevents/codec_test.go (1)

1-491: Excellent test coverage!

The test suite is comprehensive and well-structured:

  • ✅ Constructor and data type validation
  • ✅ Encoding scenarios (with/without metadata, deletion cases)
  • ✅ Metadata name reset verification (lines 138-144, 388-403)
  • ✅ Decoding with extensive error path coverage
  • ✅ Direct unit test of the resetPayloadMetadataNameWithResID helper
  • ✅ Full encode/decode roundtrip integration test

All previously identified issues from past reviews have been addressed (error handling, type assertions, validation logic). The tests provide strong confidence in the codec implementation.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
test/e2e/pkg/sourceclient_test.go (1)

922-949: LGTM with a note on string matching.

The AssertReplicas helper correctly searches for the latest watched work and validates the status feedback. The string matching approach on line 942 has been previously flagged as potentially fragile (could match readyReplicas":10 when checking for replicas=1), but the implementation works for the current test scenarios.

🧹 Nitpick comments (2)
test/e2e/pkg/sourceclient_test.go (1)

10-10: Consider using standard library or k8s utilities instead of faker/slice.

The slice.Contains function from the faker library (used on lines 721, 725) could be replaced with sets.Set from k8s.io/apimachinery/pkg/util/sets which is already imported and used elsewhere in the file, or a simple loop. This would reduce external dependencies.

🔎 Proposed refactor using sets.Set

Remove the import:

-	"github.com/bxcodec/faker/v3/support/slice"

Then refactor the checks on lines 721 and 725:

-					if !slice.Contains(appliedWorkNames, firstWorkUID) {
+					if !sets.New(appliedWorkNames...).Has(firstWorkUID) {
 						return fmt.Errorf("the first work %s is not found in %v", firstWorkUID, appliedWorkNames)
 					}

-					if !slice.Contains(appliedWorkNames, secondWorkUID) {
+					if !sets.New(appliedWorkNames...).Has(secondWorkUID) {
 						return fmt.Errorf("the second work %s is not found in %v", secondWorkUID, appliedWorkNames)
 					}
cmd/maestro/server/grpc_broker.go (1)

294-297: Consider using a package-level codec instance.

The function creates a new codec instance on every call. If the codec is stateless and lightweight, this is acceptable. However, if codec initialization has any overhead, consider creating a package-level codec instance to improve performance:

var codec = cloudevents.NewCodec(source)

Then reuse it in both decodeResourceStatus and encodeResourceSpec.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 4ab0153 and b03b4c1.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • docs/resources/appliedmanifestwork.json (1 hunks)
  • docs/resources/cloudevents.spec.maestro.json (1 hunks)
  • docs/resources/cloudevents.status.maestro.json (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • test/e2e/pkg/suite_test.go
  • pkg/client/cloudevents/codec.go
  • docs/maestro.md
  • docs/resources/appliedmanifestwork.json
  • docs/resources/cloudevents.spec.maestro.json
  • pkg/client/cloudevents/codec_test.go
🧰 Additional context used
🧬 Code graph analysis (1)
test/e2e/pkg/sourceclient_test.go (2)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: upgrade
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e
  • GitHub Check: e2e-broadcast-subscription
🔇 Additional comments (5)
docs/resources/cloudevents.status.maestro.json (2)

144-144: LGTM! Metadata name correctly aligned with resource ID.

The metadata.name field now matches the resourceid (line 146) and metadata.uid, which correctly demonstrates the PR objective of using the resource ID as the work metadata name.


148-149: LGTM! New tracking fields added.

The addition of sequenceid and statushash fields enhances the CloudEvent status object with sequence tracking and status integrity verification capabilities.

test/e2e/pkg/sourceclient_test.go (1)

995-1048: LGTM! Well-structured ReadOnly work configuration.

The NewReadonlyWork helper correctly constructs a ManifestWork with:

  • Proper TypeMeta and ObjectMeta for the Deployment
  • ReadOnly update strategy to prevent modifications
  • Feedback rules configured to capture the full resource status via JSONPath "@"
  • Appropriate labels for test filtering

This configuration aligns well with the test objective of monitoring deployment status changes without modifying the deployment through the manifestwork.

cmd/maestro/server/grpc_broker.go (2)

32-32: LGTM: Good use of constant for source identifier.

Centralizing the source string as a package-level constant improves maintainability and consistency.


300-309: Resource ID metadata mapping is properly implemented; the source parameter is not redundant.

The codec's Encode method correctly handles the PR objective. Lines 35-39 call resetPayloadMetadataNameWithResID(), which properly sets payload.metadata.name to the resource ID (line 132 in codec.go).

However, the source parameter usage is not redundant: the sourceID passed to NewCodec() is stored for use in the Decode() method (line 104 validates against codec.sourceID), while the source parameter in Encode() sets the actual event source (line 48). These serve different purposes and both are necessary.

The optional performance consideration about creating a new codec instance on each call remains valid but is low-priority given the codec's lightweight nature (single string field).
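A compressed sketch of those two roles, with deliberately simplified types (not the codec's actual implementation):

package sketch

import "fmt"

// codec keeps the source ID it was constructed with so Decode can reject
// status events that claim a different origin.
type codec struct{ sourceID string }

// event stands in for a CloudEvent; only the source attribute matters here.
type event struct{ source string }

// Encode stamps the outgoing event with the caller-supplied source.
func (c *codec) Encode(source string) event {
	return event{source: source}
}

// Decode validates the incoming event against the codec's own source ID.
func (c *codec) Decode(evt event) error {
	if evt.source != c.sourceID {
		return fmt.Errorf("unexpected source %q, expected %q", evt.source, c.sourceID)
	}
	return nil
}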

Signed-off-by: Wei Liu <liuweixa@redhat.com>