Conversation

@sinnykumari
Contributor

No description provided.

Also, update related OpenShift deps to latest
@openshift-ci openshift-ci bot requested review from cheesesashimi and jkyros June 7, 2023 21:34
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 7, 2023
@sinnykumari sinnykumari changed the title from "Update kube deps to 1.27.2" to "OCPBUGS-13656: MCO-632: Update kube deps to 1.27.2" Jun 7, 2023
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jun 7, 2023
@openshift-ci-robot
Contributor

@sinnykumari: This pull request references Jira Issue OCPBUGS-13656, which is invalid:

  • expected the bug to target the "4.14.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sinnykumari
Contributor Author

/jira refresh

Putting hold for pre-merge testing by QE
/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 7, 2023
@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jun 7, 2023
@openshift-ci-robot
Contributor

@sinnykumari: This pull request references Jira Issue OCPBUGS-13656, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.14.0) matches configured target version for branch (4.14.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @sergiordlr


@openshift-ci openshift-ci bot requested a review from sergiordlr June 7, 2023 21:50
@sinnykumari
Contributor Author

/retest

@sergiordlr
Contributor

Verified using IPI on AWS

  1. Check that we can get the node-logs in master and worker nodes without the featuregates enabled
$ oc adm node-logs $(oc get nodes -l node-role.kubernetes.io/master -ojsonpath="{.items[0].metadata.name}")  | tail -5
Jun 08 10:06:51.148857 ip-10-0-136-156 kubenswrapper[1368]: I0608 10:06:51.148521    1368 kubelet_getters.go:187] "Pod status updated" pod="openshift-etcd/etcd-ip-10-0-136-156.us-east-2.compute.internal" status=Running
Jun 08 10:07:51.148906 ip-10-0-136-156 kubenswrapper[1368]: I0608 10:07:51.148874    1368 kubelet_getters.go:187] "Pod status updated" pod="openshift-kube-controller-manager/kube-controller-manager-ip-10-0-136-156.us-east-2.compute.internal" status=Running
Jun 08 10:07:51.149382 ip-10-0-136-156 kubenswrapper[1368]: I0608 10:07:51.148921    1368 kubelet_getters.go:187] "Pod status updated" pod="openshift-kube-apiserver/kube-apiserver-ip-10-0-136-156.us-east-2.compute.internal" status=Running
.....


$ oc adm node-logs $(oc get nodes -l node-role.kubernetes.io/worker -ojsonpath="{.items[0].metadata.name}")  | tail -5
Jun 08 10:02:23.417249 ip-10-0-145-69 systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
Jun 08 10:02:30.055508 ip-10-0-145-69 crio[1345]: time="2023-06-08 10:02:30.055387967Z" level=info msg="Checking image status: registry.build05.ci.openshift.org/ci-ln-8yf0tk2/stable@sha256:ab7a0c78359f369ca9a01e867a8a2723756b523703fce8c34295febb8f974c35" id=eecc3960-8349-4dac-b69d-1fce1b76a99b name=/runtime.v1.ImageService/ImageStatus
Jun 08 10:02:30.062988 ip-10-0-145-69 crio[1345]: time="2023-06-08 10:02:30.055560484Z" level=info msg="Image status: &ImageStatusResponse{Image:&Image{Id:205756b08cfa99b14027530741212228160b8a47f22c111fb5bcc454718301df,RepoTags:[],RepoDigests:[registry.build05.ci.openshift.org/ci-ln-8yf0tk2/stable@sha256:ab7a0c78359f369ca9a01e867a8a2723756b523703fce8c34295febb8f974c35],Size_:367211971,Uid:&Int64Value{Value:0,},Username:,Spec:&ImageSpec{Image:,Annotations:map[string]string{},},Pinned:false,},Info:map[string]string{},}" id=eecc3960-8349-4dac-b69d-1fce1b76a99b name=/runtime.v1.ImageService/ImageStatus
.....
  2. Enable TechPreviewNoUpgrade
$ oc patch featuregate cluster --type=merge -p '{"spec":{"featureSet": "TechPreviewNoUpgrade"}}'
featuregate.config.openshift.io/cluster patched

  3. Wait for MCPs to be configured
$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-c296af623760e67df77db133dd82b718   True      False      False      3              3                   3                     0                      51m
worker   rendered-worker-650bf887cac04c8c9dd8d3a024974bfc   True      False      False      3              3                   3                     0                      51m

  4. Check that we can get the node-logs after enabling TechPreviewNoUpgrade
$ oc adm node-logs $(oc get nodes -l node-role.kubernetes.io/worker -ojsonpath="{.items[0].metadata.name}")  | tail -3
Jun 08 10:33:30.123885 ip-10-0-145-69 systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
Jun 08 10:33:35.546166 ip-10-0-145-69 crio[1375]: time="2023-06-08 10:33:35.546069638Z" level=info msg="Checking image status: registry.build05.ci.openshift.org/ci-ln-8yf0tk2/stable@sha256:ab7a0c78359f369ca9a01e867a8a2723756b523703fce8c34295febb8f974c35" id=f47bd51b-918b-438d-a936-3910979111c3 name=/runtime.v1.ImageService/ImageStatus
Jun 08 10:33:35.546562 ip-10-0-145-69 crio[1375]: time="2023-06-08 10:33:35.546256687Z" level=info msg="Image status: &ImageStatusResponse{Image:&Image{Id:205756b08cfa99b14027530741212228160b8a47f22c111fb5bcc454718301df,RepoTags:[],RepoDigests:[registry.build05.ci.openshift.org/ci-ln-8yf0tk2/stable@sha256:ab7a0c78359f369ca9a01e867a8a2723756b523703fce8c34295febb8f974c35],Size_:367211971,Uid:&Int64Value{Value:0,},Username:,Spec:&ImageSpec{Image:,Annotations:map[string]string{},},Pinned:false,},Info:map[string]string{},}" id=f47bd51b-918b-438d-a936-3910979111c3 name=/runtime.v1.ImageService/ImageStatus


$ oc adm node-logs $(oc get nodes -l node-role.kubernetes.io/master -ojsonpath="{.items[0].metadata.name}")  | tail -3
Jun 08 10:34:16.666224 ip-10-0-136-156 kubenswrapper[1399]: I0608 10:34:16.665899    1399 kubelet_getters.go:187] "Pod status updated" pod="openshift-kube-scheduler/openshift-kube-scheduler-ip-10-0-136-156.us-east-2.compute.internal" status=Running
Jun 08 10:34:16.806584 ip-10-0-136-156 crio[1366]: time="2023-06-08 10:34:16.806517805Z" level=info msg="Checking image status: registry.build05.ci.openshift.org/ci-ln-8yf0tk2/stable@sha256:ab7a0c78359f369ca9a01e867a8a2723756b523703fce8c34295febb8f974c35" id=bf0629a5-8011-4d54-97d9-835aa86e002e name=/runtime.v1.ImageService/ImageStatus
Jun 08 10:34:16.806987 ip-10-0-136-156 crio[1366]: time="2023-06-08 10:34:16.806720099Z" level=info msg="Image status: &ImageStatusResponse{Image:&Image{Id:205756b08cfa99b14027530741212228160b8a47f22c111fb5bcc454718301df,RepoTags:[],RepoDigests:[registry.build05.ci.openshift.org/ci-ln-8yf0tk2/stable@sha256:ab7a0c78359f369ca9a01e867a8a2723756b523703fce8c34295febb8f974c35],Size_:367211971,Uid:&Int64Value{Value:0,},Username:,Spec:&ImageSpec{Image:,Annotations:map[string]string{},},Pinned:false,},Info:map[string]string{},}" id=bf0629a5-8011-4d54-97d9-835aa86e002e name=/runtime.v1.ImageService/ImageStatus
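
The "Wait for MCPs to be configured" step above can be scripted rather than checked by eye. The following is only a minimal sketch: `pools_ready` is a hypothetical helper (not part of `oc` or the MCO), and it assumes the column layout printed by `oc get mcp` in this comment, with no spaces inside any field.

```python
def pools_ready(oc_get_mcp_output: str) -> bool:
    """Return True only if every listed pool is updated, not updating, and not degraded."""
    lines = [ln for ln in oc_get_mcp_output.splitlines() if ln.strip()]
    if len(lines) < 2:
        return False  # header only (or nothing at all): no pool to judge
    header = lines[0].split()
    for row in lines[1:]:
        # Pair each column name with the value in the same position.
        fields = dict(zip(header, row.split()))
        state = (fields.get("UPDATED"), fields.get("UPDATING"), fields.get("DEGRADED"))
        if state != ("True", "False", "False"):
            return False
    return True


# Output copied from the `oc get mcp` call above.
SAMPLE = """\
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-c296af623760e67df77db133dd82b718   True      False      False      3              3                   3                     0                      51m
worker   rendered-worker-650bf887cac04c8c9dd8d3a024974bfc   True      False      False      3              3                   3                     0                      51m
"""

print(pools_ready(SAMPLE))
```

In practice the text would come from running `oc get mcp` in a polling loop; the sketch deliberately leaves that part out.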

We add the qe-approved label

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Jun 8, 2023
@sinnykumari
Contributor Author

@JoelSpeed does this look OK to you, mainly commit 1039029?

@aravindhp
Contributor

@sergiordlr Do TechPreviewNoUpgrade clusters have the enableSystemLogQuery feature enabled in the kubelet config?

@sergiordlr
Contributor

sergiordlr commented Jun 8, 2023

@aravindhp Yes, enableSystemLogQuery is enabled in the kubelet.conf

$ oc debug node/ip-10-0-145-69.us-east-2.compute.internal -- chroot /host cat /etc/kubernetes/kubelet.conf |grep enableSystemLogQuery
Starting pod/ip-10-0-145-69us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
  "enableSystemLogQuery": true,

Removing debug pod ...

$ oc get featuregate -o yaml |grep Tech
    featureSet: TechPreviewNoUpgrade
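
The same check can be done without `grep` by parsing the file. This is a sketch under an assumption: the `grep` match above (`"enableSystemLogQuery": true,`) suggests that `/etc/kubernetes/kubelet.conf` is JSON on this node, so the code treats it as such. `system_log_query_enabled` is a hypothetical helper, and the sample text is a made-up fragment, not the real file.

```python
import json


def system_log_query_enabled(kubelet_conf_text: str) -> bool:
    """Return True if the kubelet config sets enableSystemLogQuery to true."""
    config = json.loads(kubelet_conf_text)
    # A missing key or any value other than JSON true counts as "not enabled".
    return config.get("enableSystemLogQuery") is True


# Made-up fragment for illustration; the real file has many more fields.
SAMPLE_CONF = '{"kind": "KubeletConfiguration", "enableSystemLogQuery": true}'

print(system_log_query_enabled(SAMPLE_CONF))
```

Parsing avoids false positives that a plain text match could give, for example if the key appeared with the value `false`.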

@aravindhp
Contributor

🎉 Thanks for confirming @sergiordlr

@aravindhp
Contributor

/retest-required

@sinnykumari
Contributor Author

qe-verified, removing hold
/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 8, 2023
@sinnykumari
Contributor Author

e2e-hypershift is broken at the moment; see the internal Slack thread

@aravindhp
Contributor

Are we waiting for the issue to be fixed, or overriding the job?

Comment on lines 26 to +29
staticPodPath: /etc/kubernetes/manifests
systemCgroups: /system.slice
featureGates:
  AlibabaPlatform: true
Member

This might be a dumb question, but if we're processing feature gates properly, and Joel already added it to the defaults, e.g.

alibabaPlatform, // This is a bug, it should be TechPreviewNoUpgrade. This must be downgraded before 4.14 is shipped.
which is vendored as part of this, what is the intent behind also including it here?

(Is it just "make sure that if we don't have our featuregate ducks in a row that AlibabaPlatform doesn't get shut off anywhere yet" ? )

Contributor Author

@sinnykumari sinnykumari Jun 8, 2023

Good question; @rphillips might know how it is all wired up. I needed to add it because the kubelet TestFeatureGateDrift test was failing after the recent change you mentioned, which was part of the latest PR, openshift/api#1478.

Contributor Author

Reading the original PR #187, where the featureGate field was added, it seems these fields are used during the initial bootstrapping process and help the kubelet decide which features to turn on.

Contributor Author

@jkyros All featureGates mentioned in the kubelet template match what is defined as defaultFeatures in https://github.com/openshift/api/blob/master/config/v1/types_feature.go#L191 . Do you see any concern with adding alibabaPlatform to match it?

Contributor

So the issue with these tests, IIUC, is that they have a statically coded list of features, which needs updating every time the default feature set changes. That is going to happen often! It would be better if we could update the tests/templates so that there is no checked-in default set of features; otherwise this will need constant updates, but I'm not sure how feasible that is.

"make sure that if we don't have our featuregate ducks in a row that AlibabaPlatform doesn't get shut off anywhere yet"

Yes, the intention is to move all code for alibaba behind a feature gate so that we can then switch it off with a single PR and prove that nothing breaks by doing so
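
To make the drift problem discussed in this thread concrete, here is a minimal sketch of the kind of comparison such a test performs. It is not the actual TestFeatureGateDrift implementation (which lives in the Go code base); `feature_gate_drift` and the gate name `ExampleGate` are hypothetical and used only for illustration.

```python
def feature_gate_drift(template_gates: dict[str, bool],
                       default_enabled: set[str]) -> tuple[set[str], set[str]]:
    """Compare gates enabled in a kubelet template with the default-enabled set.

    Returns (missing, extra):
      missing: enabled by default but not enabled in the template
      extra:   enabled in the template but not in the defaults
    """
    enabled_in_template = {name for name, on in template_gates.items() if on}
    missing = default_enabled - enabled_in_template
    extra = enabled_in_template - default_enabled
    return missing, extra


# Hypothetical data: the defaults gained AlibabaPlatform, the template did not.
defaults = {"AlibabaPlatform", "ExampleGate"}
template = {"ExampleGate": True}

missing, extra = feature_gate_drift(template, defaults)
print(sorted(missing), sorted(extra))
```

With a checked-in template, every change to the default set produces a non-empty `missing` or `extra` until someone updates the template, which is the maintenance cost raised above.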

@openshift-ci
Contributor

openshift-ci bot commented Jun 8, 2023

@sinnykumari: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                      Commit    Details   Required   Rerun command
ci/prow/okd-scos-e2e-aws-ovn   1039029   link      false      /test okd-scos-e2e-aws-ovn


@sinnykumari
Contributor Author

@aravindhp we are going with override. We can do that once PR has lgtm.

@sinnykumari
Contributor Author

overriding e2e-hypershift as it is broken, see openshift/hypershift#2664
/override e2e-hypershift

@openshift-ci
Contributor

openshift-ci bot commented Jun 9, 2023

@sinnykumari: /override requires failed status contexts, check run or a prowjob name to operate on.
The following unknown contexts/checkruns were given:

  • e2e-hypershift

Only the following failed contexts/checkruns were expected:

  • ci/prow/bootstrap-unit
  • ci/prow/e2e-aws-ovn
  • ci/prow/e2e-aws-ovn-upgrade
  • ci/prow/e2e-gcp-op
  • ci/prow/e2e-gcp-op-single-node
  • ci/prow/e2e-hypershift
  • ci/prow/e2e-openstack
  • ci/prow/images
  • ci/prow/okd-images
  • ci/prow/okd-scos-e2e-aws-ovn
  • ci/prow/okd-scos-images
  • ci/prow/unit
  • ci/prow/verify
  • pull-ci-openshift-machine-config-operator-master-bootstrap-unit
  • pull-ci-openshift-machine-config-operator-master-e2e-aws-ovn
  • pull-ci-openshift-machine-config-operator-master-e2e-aws-ovn-upgrade
  • pull-ci-openshift-machine-config-operator-master-e2e-gcp-op
  • pull-ci-openshift-machine-config-operator-master-e2e-gcp-op-single-node
  • pull-ci-openshift-machine-config-operator-master-e2e-hypershift
  • pull-ci-openshift-machine-config-operator-master-e2e-openstack
  • pull-ci-openshift-machine-config-operator-master-images
  • pull-ci-openshift-machine-config-operator-master-okd-images
  • pull-ci-openshift-machine-config-operator-master-okd-scos-e2e-aws-ovn
  • pull-ci-openshift-machine-config-operator-master-okd-scos-images
  • pull-ci-openshift-machine-config-operator-master-unit
  • pull-ci-openshift-machine-config-operator-master-verify
  • tide

If you are trying to override a checkrun that has a space in it, you must put a double quote on the context.
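
The rejection above comes down to an exact-name match: `e2e-hypershift` is not in the list, while `ci/prow/e2e-hypershift` is. As a sketch only, a small helper can suggest the full context name for a short job name. `resolve_context` is hypothetical and not part of Prow, which (per the message above) requires the exact context name.

```python
def resolve_context(requested: str, failed_contexts: list[str]) -> list[str]:
    """Suggest full context names for a possibly shortened job name."""
    if requested in failed_contexts:
        return [requested]  # already an exact context name
    # Otherwise match on the part after the last "/" (e.g. "ci/prow/<job>").
    return [c for c in failed_contexts if c.split("/")[-1] == requested]


# A subset of the failed contexts listed in the bot message above.
FAILED = [
    "ci/prow/e2e-gcp-op",
    "ci/prow/e2e-hypershift",
    "ci/prow/okd-images",
    "pull-ci-openshift-machine-config-operator-master-e2e-hypershift",
    "tide",
]

print(resolve_context("e2e-hypershift", FAILED))
```

The suggested name is what the follow-up `/override ci/prow/e2e-hypershift` command uses.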


@sinnykumari
Contributor Author

/override ci/prow/e2e-hypershift

@openshift-ci
Contributor

openshift-ci bot commented Jun 9, 2023

@sinnykumari: Overrode contexts on behalf of sinnykumari: ci/prow/e2e-hypershift


@jkyros
Member

jkyros commented Jun 9, 2023

I don't want to hold this up just so I can understand what that test should be checking. Seeing as alibaba wasn't feature-gated before at all, we should be no worse than we were before this. 😄
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 9, 2023
@openshift-ci
Contributor

openshift-ci bot commented Jun 9, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jkyros, sinnykumari

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 6511a6d into openshift:master Jun 9, 2023
@openshift-ci-robot
Contributor

@sinnykumari: Jira Issue OCPBUGS-13656: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-13656 has been moved to the MODIFIED state.


Labels

  • approved Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
  • lgtm Indicates that a PR is ready to be merged.
  • qe-approved Signifies that QE has signed off on this PR

7 participants