Skip to content

Conversation

@BraeTroutman
Copy link
Contributor

@BraeTroutman BraeTroutman commented Nov 18, 2025

What type of PR is this?

feature

What this PR does / why we need it?

AWS is legally responsible for having limited and accurate vCPU allocation for Microsoft License-Included customers on AWS EC2. Having a misallocation of vCPU would result in a breach of agreement between AWS and Microsoft for License-Included Windows software. This change deploys VAP on HCPs that prevents the overcommit of vCPUs for windows VMs.

This change makes use of a dedicated ManagedCluster label that is only present on clusters with Windows LI enabled. That means all existing customer workloads will not be impacted by this change.

Which Jira/Github issue(s) this PR fixes?

Fixes #CNV-56734 #OCM-20819

Special notes for your reviewer:

Demonstration of local testing/procedure for stage validation here

Pre-checks (if applicable):

  • Tested latest changes against a cluster

  • Included documentation changes with PR

  • If this is a new object that is not intended for the FedRAMP environment (if unsure, please reach out to team FedRAMP), please exclude it with:

    matchExpressions:
    - key: api.openshift.com/fedramp
      operator: NotIn
      values: ["true"]

@openshift-ci openshift-ci bot requested review from abyrne55 and bng0y November 18, 2025 18:34
@abyrne55
Copy link
Contributor

/retest

@abyrne55
Copy link
Contributor

For other reviewers: here's what Claude said when I asked it to compare this PR against #2535 (the previous attempt):

Summary of Differences Between PR 2535 and PR 2588

Based on my analysis, here are the key differences between PR 2588 (open) and PR 2535 (merged then reverted):

Main Functional Change:

1. Additional cluster selector for Windows License-Included clusters

  • PR 2588 adds a new cluster selector requirement: api.openshift.com/win-li: "true"
  • This appears in multiple files:
    • deploy/srep-vap/vcpu-overcommit/config.yaml
    • deploy/acm-policies/50-GENERATED-srep-vap-vcpu-overcommit.Policy.yaml
    • All three hack template files (integration, production, stage)

Minor Changes:

2. Whitespace cleanup

  • PR 2588 removes a trailing blank line in 101-windows-11-vcpu-restrict.yaml (69 lines → 68 lines)

3. Different approach in scripts/generate-policy-config.py

  • PR 2535: Simply added "srep-vap/vcpu-overcommit" to the existing directories list
  • PR 2588: Commented out ALL existing directories and only includes "srep-vap/vcpu-overcommit"
    • This is a significant testing/development change that likely shouldn't be in production

Impact Analysis:

The most important difference is the addition of the api.openshift.com/win-li label selector. This means:

  • PR 2535: Would deploy to all HCP clusters (except 4.14-4.18)
  • PR 2588: Will only deploy to HCP clusters that also have the Windows License-Included label

The scripts/generate-policy-config.py change in PR 2588 appears to be an unintended development/testing change where all other directories were commented out. This should likely be corrected before merging.

@BraeTroutman can you address the scripts/generate-policy-config.py issue? Can you also provide steps on how to test this PR in stage, so we can confirm that this won't cause an issue with existing Windows VMs in production?

@BraeTroutman BraeTroutman force-pushed the OCM-20819/vcpu-overcommit-vap branch 3 times, most recently from 492caff to c92f4e1 Compare November 21, 2025 20:15
@BraeTroutman BraeTroutman force-pushed the OCM-20819/vcpu-overcommit-vap branch 2 times, most recently from fc88a88 to ddecd42 Compare December 15, 2025 20:48
AWS is legally responsible for having limited and accurate vCPU
allocation for Microsoft License-Included customers on AWS EC2. Having a
misallocation of vCPU would result in a breach of agreement between AWS
and Microsoft for License-Included Windows software. This change deploys
VAP on HCPs that prevents the overcommit of vCPUs for windows VMs.
@BraeTroutman BraeTroutman force-pushed the OCM-20819/vcpu-overcommit-vap branch from ddecd42 to 23b1cea Compare December 16, 2025 18:07
@feichashao
Copy link
Contributor

LGTM for the VAP and VAP bindings from IAM perspective. It won't affect customer without WinLI label, and the VAP test cases make sense.

We would better have another eye from virtualization experts regarding the VirtualMachineClusterPreference files, then I think we can merge.

@BraeTroutman for the VirtualMachineClusterPreference files, usually we put those resources files in a dedicated folder under /deploy (maybe you can create one for virt), if they are not hard dependencies for the VAP. The intention is to keep this VAP folder focus on VAPs. But it is OK if it is temporary or highly depended by VAP.

Copy link
Contributor

@jcanocan jcanocan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking care of this yourself! Excelent job!

@BraeTroutman
Copy link
Contributor Author

LGTM for the VAP and VAP bindings from IAM perspective. It won't affect customer without WinLI label, and the VAP test cases make sense.

We would better have another eye from virtualization experts regarding the VirtualMachineClusterPreference files, then I think we can merge.

@BraeTroutman for the VirtualMachineClusterPreference files, usually we put those resources files in a dedicated folder under /deploy (maybe you can create one for virt), if they are not hard dependencies for the VAP. The intention is to keep this VAP folder focus on VAPs. But it is OK if it is temporary or highly depended by VAP.

@feichashao At the moment these preferences are tightly coupled to the validating admission policies, but are temporary and will be removed once the preferences are shipped by default with CNV's next release. If it's not too much trouble, it's ideal that the VM Preference definitions remain with the VAP 🙏

@BraeTroutman BraeTroutman force-pushed the OCM-20819/vcpu-overcommit-vap branch 3 times, most recently from 87a73f5 to a6b6cd2 Compare December 17, 2025 15:22
@feichashao
Copy link
Contributor

LGTM from IAM domain perspective.

Need approval from TL/FL.

@jcanocan
Copy link
Contributor

LGTM

@BraeTroutman BraeTroutman force-pushed the OCM-20819/vcpu-overcommit-vap branch 2 times, most recently from 58cabe8 to 18899a4 Compare December 19, 2025 21:24
@abyrne55
Copy link
Contributor

This looks great @BraeTroutman. Please just update the README so there's no confusion about the test cases (here's a Claude-generated fix that you're welcome to use: README.md)

We're going to hold this during the change freeze, but I'm +1 on merging afterwards.

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 22, 2025
@abyrne55
Copy link
Contributor

/approve

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 22, 2025
Co-authored-by: Javier Cano Cano <jcanocan@redhat.com>
@BraeTroutman BraeTroutman force-pushed the OCM-20819/vcpu-overcommit-vap branch from 18899a4 to ed89fbf Compare December 22, 2025 20:00
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 22, 2025

@BraeTroutman: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@abyrne55
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 22, 2025
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 22, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abyrne55, BraeTroutman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants