-
Notifications
You must be signed in to change notification settings - Fork 52
OCPCLOUD-3357: Tests for networkpolicies in capi #456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@miyadav: This pull request references OCPCLOUD-3357 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
📝 WalkthroughWalkthroughAdds a new end-to-end Go test suite (Ginkgo/Gomega) that validates Kubernetes NetworkPolicy, service exposure, pod/container ports, and Prometheus metrics accessibility for OpenShift Cluster API components across Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@miyadav: This pull request references OCPCLOUD-3357 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@e2e/network_policy_test.go`:
- Around line 390-397: In the webhook port loop in e2e/network_policy_test.go
(the block that sets hasWebhookPort and iterates over webhookPorts), avoid
directly accessing port.Port.IntVal because it can be zero for named ports;
instead check port.Port.Type (or use port.Port.IntValue()) before comparing to
webhookPort or 443 and keep the same protocol assertions
(Expect(port.Protocol).ToNot(BeNil()) and
Expect(*port.Protocol).To(Equal(corev1.ProtocolTCP))). Update the conditional
that currently uses port.Port.IntVal to first ensure the value is an int (or
call IntValue()) so named ports don’t produce false negatives.
- Around line 273-286: The test currently only logs a message (By(...)) and
never verifies connectivity; update the test in e2e/network_policy_test.go to
actually assert reachability of the metrics endpoint: after obtaining podList,
targetPod, targetPodIP and metricsPort, exec a command from the test pod (or a
dedicated test helper pod) to curl or wget
http://<targetPodIP>:<metricsPort>/metrics and assert a successful HTTP response
(0 exit code and non-empty body) rather than just using By(); refer to the
existing podList, targetPod, targetPodIP and metricsPort symbols and add the
exec call and Expect assertions to fail the test if scraping is blocked.
- Around line 84-93: The test can miss named ports because it only checks
port.Port.IntVal; update the check inside the loop that iterates
metricsControllersPolicy.Spec.Ingress to handle intstr.IntOrString properly
(e.g., call port.Port.IntValue() or first check port.Port.Type == intstr.Int
before using IntVal) and compare that result to metricsPort so hasMetricsPort
correctly becomes true for both numeric and named-port representations.
🧹 Nitpick comments (4)
e2e/network_policy_test.go (4)
170-198: Consider extracting repeated port-checking logic into a helper.The same pattern of iterating over container ports to find a specific port and name is repeated multiple times (lines 171-177, 181-187, 191-197, and again in the migration container section). This could be simplified with a helper function.
Example helper function
// Add this helper function at package level or within the test func hasContainerPort(container *corev1.Container, port int32, name string) bool { for _, p := range container.Ports { if p.ContainerPort == port && p.Name == name { return true } } return false }Then simplify checks:
- By("Verifying capi-controllers container exposes diagnostics port 8443") - hasMetricsPortInContainer := false - for _, port := range capiControllersContainer.Ports { - if port.ContainerPort == metricsPort && port.Name == "diagnostics-o" { - hasMetricsPortInContainer = true - break - } - } - Expect(hasMetricsPortInContainer).To(BeTrue(), "capi-controllers container should expose diagnostics port 8443") + By("Verifying capi-controllers container exposes diagnostics port 8443") + Expect(hasContainerPort(capiControllersContainer, metricsPort, "diagnostics-o")).To(BeTrue(), + "capi-controllers container should expose diagnostics port 8443")
241-244: Consider using a pinned image tag instead of:latest.Using
:latestfor the test pod image (registry.access.redhat.com/ubi9/ubi-minimal:latest) can lead to non-reproducible test results if the image changes. Consider pinning to a specific version or digest for deterministic behavior.
251-254: Simplify error handling for pod creation.The current pattern is functional but could be clearer.
Proposed simplification
err := cl.Create(ctx, testPod) - if err != nil && !apierrors.IsAlreadyExists(err) { - Expect(err).ToNot(HaveOccurred()) - } + Expect(client.IgnoreAlreadyExists(err)).ToNot(HaveOccurred())
client.IgnoreAlreadyExistsfrom controller-runtime returns nil if the error is AlreadyExists, making the intent clearer.
353-360: Brittle assertion on exact number of ingress rules.The assertion
Expect(policy.Spec.Ingress).To(HaveLen(1))will fail if the policy is extended with additional ingress rules, even if the required port is still correctly configured. Consider checking that the expected port exists among the rules rather than asserting an exact count.More resilient approach
- Expect(policy.Spec.Ingress).To(HaveLen(1)) - Expect(policy.Spec.Ingress[0].Ports).ToNot(BeEmpty()) - - port := policy.Spec.Ingress[0].Ports[0] - Expect(port.Protocol).ToNot(BeNil()) - Expect(*port.Protocol).To(Equal(corev1.ProtocolTCP)) - Expect(port.Port).ToNot(BeNil()) - Expect(*port.Port).To(Equal(intstr.FromInt(metricsPort))) + Expect(policy.Spec.Ingress).ToNot(BeEmpty()) + + foundPort := false + for _, ingress := range policy.Spec.Ingress { + for _, port := range ingress.Ports { + if port.Port != nil && *port.Port == intstr.FromInt(metricsPort) { + Expect(port.Protocol).ToNot(BeNil()) + Expect(*port.Protocol).To(Equal(corev1.ProtocolTCP)) + foundPort = true + break + } + } + } + Expect(foundPort).To(BeTrue(), "Policy should have metrics port configured")
| hasMetricsPort := false | ||
| for _, ingress := range metricsControllersPolicy.Spec.Ingress { | ||
| for _, port := range ingress.Ports { | ||
| if port.Port != nil && port.Port.IntVal == metricsPort { | ||
| hasMetricsPort = true | ||
| break | ||
| } | ||
| } | ||
| } | ||
| Expect(hasMetricsPort).To(BeTrue(), "NetworkPolicy should allow ingress on port 8443") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential false negative when port is specified as a string.
intstr.IntOrString can hold either an integer (IntVal) or a string (StrVal). If the NetworkPolicy specifies the port as a named port string, port.Port.IntVal will be 0 even though port.Port != nil. This could cause the test to incorrectly report that the metrics port is not configured.
Proposed fix to handle both int and string port types
hasMetricsPort := false
for _, ingress := range metricsControllersPolicy.Spec.Ingress {
for _, port := range ingress.Ports {
- if port.Port != nil && port.Port.IntVal == metricsPort {
+ if port.Port != nil && port.Port.IntValue() == int(metricsPort) {
hasMetricsPort = true
break
}
}
}Note: IntValue() returns the integer value regardless of whether the underlying type is int or string (returns 0 for non-numeric strings). Alternatively, explicitly check port.Port.Type == intstr.Int before accessing IntVal.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| hasMetricsPort := false | |
| for _, ingress := range metricsControllersPolicy.Spec.Ingress { | |
| for _, port := range ingress.Ports { | |
| if port.Port != nil && port.Port.IntVal == metricsPort { | |
| hasMetricsPort = true | |
| break | |
| } | |
| } | |
| } | |
| Expect(hasMetricsPort).To(BeTrue(), "NetworkPolicy should allow ingress on port 8443") | |
| hasMetricsPort := false | |
| for _, ingress := range metricsControllersPolicy.Spec.Ingress { | |
| for _, port := range ingress.Ports { | |
| if port.Port != nil && port.Port.IntValue() == int(metricsPort) { | |
| hasMetricsPort = true | |
| break | |
| } | |
| } | |
| } | |
| Expect(hasMetricsPort).To(BeTrue(), "NetworkPolicy should allow ingress on port 8443") |
🤖 Prompt for AI Agents
In `@e2e/network_policy_test.go` around lines 84 - 93, The test can miss named
ports because it only checks port.Port.IntVal; update the check inside the loop
that iterates metricsControllersPolicy.Spec.Ingress to handle intstr.IntOrString
properly (e.g., call port.Port.IntValue() or first check port.Port.Type ==
intstr.Int before using IntVal) and compare that result to metricsPort so
hasMetricsPort correctly becomes true for both numeric and named-port
representations.
| By("Verifying network policy allows access to metrics endpoints") | ||
| podList := &corev1.PodList{} | ||
| Expect(cl.List(ctx, podList, client.InNamespace(capiNamespace), client.MatchingLabels{ | ||
| capiControllersLabel: capiControllersValue, | ||
| })).To(Succeed()) | ||
|
|
||
| Expect(podList.Items).ToNot(BeEmpty(), "capi-controllers pods should exist for testing") | ||
|
|
||
| targetPod := podList.Items[0] | ||
| targetPodIP := targetPod.Status.PodIP | ||
| Expect(targetPodIP).ToNot(BeEmpty(), "Target pod should have an IP address") | ||
|
|
||
| By(fmt.Sprintf("Network policy configuration allows metrics scraping from pod %s at %s:%d", targetPod.Name, targetPodIP, metricsPort)) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test does not actually verify metrics endpoint accessibility.
The test creates a pod and retrieves the target pod's IP but never actually verifies that the test pod can reach the metrics endpoint. The final By() statement just logs a message without performing any connectivity check. This makes the test name misleading.
Consider either:
- Actually testing connectivity (e.g., exec into the test pod and curl the metrics endpoint)
- Renaming the test to reflect what it actually validates (policy configuration exists)
- Adding a comment explaining that this is a configuration validation only
🤖 Prompt for AI Agents
In `@e2e/network_policy_test.go` around lines 273 - 286, The test currently only
logs a message (By(...)) and never verifies connectivity; update the test in
e2e/network_policy_test.go to actually assert reachability of the metrics
endpoint: after obtaining podList, targetPod, targetPodIP and metricsPort, exec
a command from the test pod (or a dedicated test helper pod) to curl or wget
http://<targetPodIP>:<metricsPort>/metrics and assert a successful HTTP response
(0 exit code and non-empty body) rather than just using By(); refer to the
existing podList, targetPod, targetPodIP and metricsPort symbols and add the
exec call and Expect assertions to fail the test if scraping is blocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You raise a good point, CodeRabbit. I was wondering this myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mdbooth Thank you for confirming! Since this test could be more robust, I'd be happy to help plan the implementation of an actual connectivity check if the team decides to address it.
The fix would involve:
- Using the Kubernetes client's
CoreV1().Pods().Exec()or a similar mechanism to run a command inside the test pod - Executing a curl/wget command to hit
http://<targetPodIP>:<metricsPort>/metrics - Adding an
Expect()assertion on the exit code and response to ensure the metrics endpoint is actually reachable
Just let me know if you'd like me to help plan the code changes for this! 🐰
| hasWebhookPort := false | ||
| for _, port := range webhookPorts { | ||
| if port.Port != nil && (port.Port.IntVal == webhookPort || port.Port.IntVal == 443) { | ||
| hasWebhookPort = true | ||
| Expect(port.Protocol).ToNot(BeNil()) | ||
| Expect(*port.Protocol).To(Equal(corev1.ProtocolTCP)) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same IntVal access pattern issue.
Similar to the earlier comment, port.Port.IntVal may be 0 if the port is specified as a named string port. Consider using port.Port.IntValue() or checking port.Port.Type first.
🤖 Prompt for AI Agents
In `@e2e/network_policy_test.go` around lines 390 - 397, In the webhook port loop
in e2e/network_policy_test.go (the block that sets hasWebhookPort and iterates
over webhookPorts), avoid directly accessing port.Port.IntVal because it can be
zero for named ports; instead check port.Port.Type (or use port.Port.IntValue())
before comparing to webhookPort or 443 and keep the same protocol assertions
(Expect(port.Protocol).ToNot(BeNil()) and
Expect(*port.Protocol).To(Equal(corev1.ProtocolTCP))). Update the conditional
that currently uses port.Port.IntVal to first ensure the value is an int (or
call IntValue()) so named ports don’t produce false negatives.
|
/test unit |
|
One of the test was getting skipped due to label updated label - |
|
@miyadav: This pull request references OCPCLOUD-3357 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@miyadav: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/assign @mdbooth |
mdbooth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This contains a number of what I call 'diff tests', where the test is essentially just asserting that the code is exactly what it is. Do we normally write these? I have only ever found them counter productive because:
- They don't actually tell if the code works
- But they still trip you up when you modify the code
So... negative value imho 😬
I was hoping to see something that directly tested the functionality. E.g.
- Confirm that connections that the NetworkPolicy intends to allow are permitted
- Confirm that connections that the NetworkPolicy does not allow are not permitted
- Confirm that metrics are showing up in the place that they're supposed to
That last one feels like it might live in the step registry? Not sure about that.
| By("Verifying network policy allows access to metrics endpoints") | ||
| podList := &corev1.PodList{} | ||
| Expect(cl.List(ctx, podList, client.InNamespace(capiNamespace), client.MatchingLabels{ | ||
| capiControllersLabel: capiControllersValue, | ||
| })).To(Succeed()) | ||
|
|
||
| Expect(podList.Items).ToNot(BeEmpty(), "capi-controllers pods should exist for testing") | ||
|
|
||
| targetPod := podList.Items[0] | ||
| targetPodIP := targetPod.Status.PodIP | ||
| Expect(targetPodIP).ToNot(BeEmpty(), "Target pod should have an IP address") | ||
|
|
||
| By(fmt.Sprintf("Network policy configuration allows metrics scraping from pod %s at %s:%d", targetPod.Name, targetPodIP, metricsPort)) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You raise a good point, CodeRabbit. I was wondering this myself.
| }) | ||
|
|
||
| Context("in openshift-cluster-api namespace", func() { | ||
| It("should have network policies with correct labels", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading this test, it looks like it's just checking that the policy exists and has the properties defined in the manifest. That's basically a test of CVO. Do we normally test stuff like this?
My experience of this kind of test is that it just trips you up when renaming something, without ever providing any useful testing signal.
|
|
||
| }) | ||
|
|
||
| It("should have services exposing all metrics ports", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as above, except for the Service. This isn't testing any functionality, just that the Service created matches what's in the manifest. That's CVO's job. I wouldn't test this. It doesn't tell me anything I need to know, but it does create additional work if I need to modify the manifests.
| } | ||
| }) | ||
|
|
||
| It("should allow Prometheus to access metrics endpoints", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As CodeRabbit points out, this appears to be missing the actual test. I was expecting to see us create a pod and then perhaps:
- Confirm that the metrics endpoint is accessible from the same namespace
- Confirm that the metrics endpoint is accessible from another namespace
- Confirm that some other port is not exposed outside the namespace?
I haven't thought deeply about these, btw. You may come up with more relevant tests.
| }) | ||
| }) | ||
|
|
||
| Context("in openshift-cluster-api-operator namespace", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this is just checking the names of things. In this case that a deployed Pod has the properties which were defined in its Deployment. This is testing CVO and kube-controller-manager. I don't think this test lives here.
| Expect(hasHealthPort).To(BeTrue(), "capi-operator container should expose health port 9440") | ||
| }) | ||
|
|
||
| It("should have cluster monitoring enabled", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way we can test this directly instead? e.g. by ensuring that metrics from this namespace actually show up in cluster monitoring?
| }) | ||
|
|
||
| Context("NetworkPolicy port specifications", func() { | ||
| It("should have correct port configurations in network policies", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this is just testing that the network policy was created as defined. It doesn't test any resulting functionality. I would not include this test.
Added tests for networkpolicies for capi namespaces.
@sunzhaohua2 @huali9 PTAL when time permits
cc @mdbooth @damdo
Summary by CodeRabbit