NETOBSERV-2443 fix bug, improve cleanup and writing files #404
Conversation
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #404      +/-   ##
==========================================
- Coverage   13.84%   13.54%   -0.30%
==========================================
  Files          18       18
  Lines        2731     2326     -405
==========================================
- Hits          378      315      -63
+ Misses       2329     1987     -342
  Partials       24       24
Flags with carried forward coverage won't be shown.
/test ?
@memodi: The following commands are available to trigger required jobs.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/test integration-tests
/test integration-tests
Integration tests are failing because, for some reason, the CI cluster is taking too long to pull images. /test integration-tests
/test integration-tests
1 similar comment
/test integration-tests
- Increase waitDaemonset timeout from 50s to 5 minutes (30×10s)
  * CI environments often have slow image pulls
  * Previous timeout was too aggressive for registry operations
- Add comprehensive diagnostic output on pod startup failure:
  * Pod status with node placement (get pods -o wide)
  * Recent events to identify ImagePullBackOff, etc.
  * Pod event details from describe output
  * Daemonset logs if containers started

This helps diagnose ContainerCreating issues in CI where pods fail to start due to image pull problems or resource constraints.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
CI runs showed only 4/6 pods ready with the 5-minute timeout, indicating image pulls need more time. Increasing to 10 minutes (60×10s) to accommodate slower CI registry pulls and pod scheduling.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
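The two commit messages above describe a longer wait plus diagnostics when the daemonset never becomes ready. Below is a minimal Go sketch of that wait-and-diagnose pattern, not the PR's actual script (the real change is in bash); the namespace and daemonset name "netobserv-cli", and the helper names, are assumptions for illustration.

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// kubectl runs a kubectl command and returns its combined output; errors are
// ignored because the output is only used for progress and diagnostics.
func kubectl(args ...string) string {
	out, _ := exec.Command("kubectl", args...).CombinedOutput()
	return string(out)
}

// waitDaemonset polls readiness for up to 60×10s (10 minutes) and dumps
// diagnostics if the daemonset never becomes fully ready.
func waitDaemonset(ns, name string) error {
	for i := 0; i < 60; i++ {
		status := kubectl("get", "daemonset", name, "-n", ns,
			"-o", "jsonpath={.status.numberReady}/{.status.desiredNumberScheduled}")
		fmt.Printf("attempt %d: %s pods ready\n", i+1, status)
		var ready, desired int
		if _, err := fmt.Sscanf(status, "%d/%d", &ready, &desired); err == nil &&
			desired > 0 && ready == desired {
			return nil
		}
		time.Sleep(10 * time.Second)
	}
	// On failure, print pod placement, recent events (ImagePullBackOff, ...)
	// and the daemonset description so CI failures are diagnosable from logs.
	fmt.Println(kubectl("get", "pods", "-n", ns, "-o", "wide"))
	fmt.Println(kubectl("get", "events", "-n", ns, "--sort-by=.lastTimestamp"))
	fmt.Println(kubectl("describe", "daemonset", name, "-n", ns))
	return fmt.Errorf("daemonset %s/%s not ready after timeout", ns, name)
}

func main() {
	// Namespace and name are placeholders, not the repository's actual values.
	if err := waitDaemonset("netobserv-cli", "netobserv-cli"); err != nil {
		fmt.Println(err)
	}
}
```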
In E2E test mode, the bash script's waitDaemonset() could exit with an error after 10 minutes while the Go test's isDaemonsetReady() was still polling. This created a race where:
1. Go test calls StartCommand(), which runs the bash script asynchronously
2. Bash script calls waitDaemonset() and waits 10 minutes
3. Go test calls isDaemonsetReady() and waits 10 minutes
4. If bash times out first, it calls exit 1, killing the process
5. Go test is left polling a dead command

Solution: when isE2E=true, skip the bash-level wait since the Go test framework handles pod readiness checking via isDaemonsetReady(). For manual CLI usage (isE2E=false), the wait still runs as before to provide user feedback.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
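A hedged Go sketch of the resulting single-waiter flow: the test starts the CLI asynchronously and owns the readiness polling, while the script's own wait is skipped when isE2E=true. The binary path, how the flag is passed (an environment variable here), and the helper names are assumptions, not the repository's actual API.

```go
package e2e

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"time"
)

// isDaemonsetReady exists in the test framework per this PR; it is stubbed
// here only to keep the sketch compilable.
func isDaemonsetReady(ctx context.Context) bool { return false }

// runCapture starts the CLI script asynchronously with isE2E=true so its
// bash-level waitDaemonset is skipped, then lets the Go side poll readiness.
func runCapture(ctx context.Context) error {
	cmd := exec.CommandContext(ctx, "./commands/netobserv", "flows", "--background")
	cmd.Env = append(os.Environ(), "isE2E=true") // hypothetical way of passing the flag
	if err := cmd.Start(); err != nil {          // async start, like StartCommand()
		return err
	}
	// Only one waiter remains, so a script-side timeout can no longer
	// exit 1 and kill the process while the test is still polling.
	deadline := time.Now().Add(10 * time.Minute)
	for time.Now().Before(deadline) {
		if isDaemonsetReady(ctx) {
			return nil
		}
		time.Sleep(10 * time.Second)
	}
	return fmt.Errorf("daemonset not ready before deadline")
}
```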
Tests were failing because:
1. Commands ran with --max-time=1m in foreground mode
2. After 1 minute, the capture finished and auto-cleanup ran
3. Cleanup deleted the daemonset
4. isDaemonsetReady() was polling for a deleted daemonset
5. Test failed with "context deadline exceeded"

Using --background mode prevents automatic cleanup when the capture finishes, allowing the test to verify daemonset privilege settings before cleanup runs. Also, check that the CLI is running instead of just the daemonset.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
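A sketch of what such a test could look like under these changes: the capture runs with --background so auto-cleanup does not delete the daemonset mid-test, and the test checks that the CLI itself is running rather than only the daemonset. StartCommand and isDaemonsetReady are named in this PR; StopCommand, isCLIRunning, and the test body are assumptions made for this sketch and stubbed to keep it compilable.

```go
package e2e_test

import (
	"context"
	"testing"
	"time"
)

// Stand-ins for the suite's real helpers (see the note above).
func StartCommand(ctx context.Context, args ...string) error { return nil }
func StopCommand(ctx context.Context) error                  { return nil }
func isDaemonsetReady(ctx context.Context) bool              { return true }
func isCLIRunning(ctx context.Context) bool                  { return true }

func TestCapturePrivileged(t *testing.T) {
	ctx, cancel := context.WithTimeout(context.Background(), 15*time.Minute)
	defer cancel()

	// --background keeps the agents running: in foreground mode with
	// --max-time=1m the capture finishes, auto-cleanup deletes the daemonset,
	// and the test ends up polling a resource that no longer exists.
	if err := StartCommand(ctx, "flows", "--background"); err != nil {
		t.Fatal(err)
	}
	t.Cleanup(func() { _ = StopCommand(ctx) }) // explicit cleanup after assertions

	if !isDaemonsetReady(ctx) {
		t.Fatal("agent daemonset never became ready")
	}
	if !isCLIRunning(ctx) { // verify the CLI itself, not just the daemonset
		t.Fatal("CLI collector is not running")
	}
	// ... verify daemonset privilege settings here ...
}
```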
/test integration-tests
/test integration-tests
/test integration-tests
/needs-review
@jpinsonneau - any idea why e2e tests are failing?
If we expect
working locally: ceb7f6a
LGTM
/lgtm
@kapjain-rh: changing LGTM is restricted to collaborators.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/lgtm
@jpinsonneau - is this okay to merge? Not sure if you had a chance to review.
jpinsonneau left a comment:
That looks good to me! Thanks @memodi!
[APPROVALNOTIFIER] This PR is APPROVED. Approval requirements bypassed by manually added approval. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Description
NETOBSERV-2443 fix bug, improve cleanup and writing files
With the help of Claude, I was able to identify the flakiness coming from pty and made a bunch of improvements, as described above.
After several runs, the CLI tests are now much more stable.
Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.