Cache KubeVirt Boot Image #1918
Conversation
Skipping CI for Draft Pull Request.
davidvossel left a comment:
I've been debating whether this caching functionality belongs in the nodepool operator, or whether it fits better into capk itself.
I'm leaning towards thinking the current NodePool approach is a better fit. We're essentially doing namespace-scoped golden-image caching that's local to each guest cluster. The context around what image to cache and how long to cache it for is hypershift specific, which I think is generally outside the scope of capk.
```go
func CacheImage(ctx context.Context, cl client.Client, bootImage string, nodePool *hyperv1.NodePool) error {
	kvPlatform := nodePool.Spec.Platform.Kubevirt
	clusterName := nodePool.Spec.ClusterName
	namespace := nodePool.GetNamespace()
```
We need to use the namespace that the VMs will be placed in, not the NodePool namespace.
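For illustration, a minimal sketch of that lookup. The `vmNamespaceFor` helper and the `<nodepool-namespace>-<cluster-name>` convention for the hosted control plane namespace are assumptions; the remote-namespace field is the one read later in this diff.

```go
// vmNamespaceFor is a hypothetical helper: pick the namespace the VMs will be
// created in rather than the NodePool's own namespace.
func vmNamespaceFor(nodePoolNamespace, clusterName, remoteNamespace string) string {
	if remoteNamespace != "" {
		// External infra: VMs are created on the infra cluster in this namespace
		// (surfaced in this PR as nodePool.Status.Platform.KubeVirt.RemoteNamespace).
		return remoteNamespace
	}
	// Default: VMs live in the hosted control plane namespace, which HyperShift
	// derives from the NodePool namespace and the cluster name (assumed format).
	return nodePoolNamespace + "-" + clusterName
}
```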
```go
func getCachedPVCName(clusterName string) string {
	return bootImagePVCPrefix + clusterName
}
```
We'll likely have multiple images associated with a single cluster. Can you think of a way to name the PVCs differently so that that 1:many relationship can be observed?
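One possible shape for that, sketched under the assumption that the boot image hash discussed further down is available when the PVC name is built; the name format is illustrative, not the PR's final scheme.

```go
// getCachedPVCName (hypothetical variant) folds a short prefix of the image hash
// into the name so several cached images can coexist per cluster, e.g.
// "<bootImagePVCPrefix><cluster>-<hash[:8]>".
func getCachedPVCName(clusterName, imageHash string) string {
	shortHash := imageHash
	if len(shortHash) > 8 {
		shortHash = shortHash[:8]
	}
	return bootImagePVCPrefix + clusterName + "-" + shortHash
}
```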
```go
lbls := map[string]string{
	bootImagePVCLabelApp:     bootImagePVCLabelAppValue,
	bootImagePVCLabelType:    bootImagePVCLabelTypeValue,
	bootImagePVCLabelCluster: clusterName,
}
```
We need either a label or an annotation that stores the boot image hash. This hash is how we'll determine whether a new cached DV needs to be created, or whether we can reuse a previously cached image.
This [1] is where that boot image is detected today. It's not immediately obvious, but there's a hash in there as well that we can retrieve: we return the disk.Location URL today, but we also have access to disk.sha256 (example struct [2]).
That sha256 is our key when caching, not the location URL.
[1] `artifact, exists := openStack.Formats["qcow2.gz"]`
[2] `type CoreOSFormat struct { ... }`
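For illustration, one way the PVC metadata could carry that hash. The annotation key is hypothetical, `imageSHA256` stands in for the value read from disk.sha256, and an annotation is used rather than a label because a full sha256 (64 hex characters) exceeds the 63-character limit on label values.

```go
// Hypothetical key; the label keys below come from this PR's diff.
const bootImageHashAnnotation = "hypershift.openshift.io/boot-image-hash"

pvc.Labels = map[string]string{
	bootImagePVCLabelApp:     bootImagePVCLabelAppValue,
	bootImagePVCLabelType:    bootImagePVCLabelTypeValue,
	bootImagePVCLabelCluster: clusterName,
}
pvc.Annotations = map[string]string{
	// Compared against the sha256 of the desired boot image on each reconcile to
	// decide whether the cached DV/PVC can be reused or must be recreated.
	bootImageHashAnnotation: imageSHA256,
}
```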
/test kubevirt-e2e-kubevirt-gcp-ovn
Force-pushed 13d9f7f to a94f5a0.
/retest
Force-pushed a94f5a0 to 5612211.
/hold
Force-pushed 5612211 to 4b893cb.
/retest
```go
imageName := *nodePool.Spec.Platform.Kubevirt.RootVolume.Image.ContainerDiskImage
imageHash, err := getImageDigest(ctx, imageName)
if err != nil {
	return "", "", fmt.Errorf("failed to get the image hash; %w", err)
}

return containerImagePrefix + imageName, imageHash, nil
```
For now, I'd suggest only caching images provided by the release payload, where the SHA is known beforehand.
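A sketch of that guard, assuming the field read in the snippet above; the helper name is illustrative.

```go
// shouldCacheBootImage: cache only when the boot image comes from the release
// payload (a QCOW URL with a published sha256). A user-supplied container disk
// has no digest known ahead of time to use as a cache key, so skip it for now.
// containerDiskImage is the value read above as
// nodePool.Spec.Platform.Kubevirt.RootVolume.Image.ContainerDiskImage.
func shouldCacheBootImage(containerDiskImage *string) bool {
	return containerDiskImage == nil || *containerDiskImage == ""
}
```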
```go
}

// replace the PVC with a new one
err := cl.Delete(ctx, pvc)
```
Does the DV potentially need to be cleaned up as well?
Also, shouldn't we only delete if pvc.DeletionTimestamp == nil?
Reply: From 4.12, the DV is auto-deleted once the PVC is ready.
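A minimal sketch of the guard being asked about, wrapped around the delete shown above (apierrors is k8s.io/apimachinery/pkg/api/errors).

```go
// Only issue the delete once: if the PVC already carries a deletion timestamp,
// the previous request is still being finalized and deleting again adds nothing.
if pvc.DeletionTimestamp == nil {
	if err := cl.Delete(ctx, pvc); err != nil && !apierrors.IsNotFound(err) {
		return fmt.Errorf("failed to delete the boot image cache PVC; %w", err)
	}
}
```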
```go
	return fmt.Errorf("failed to delete the boot image cache PVC; %w", err)
}

return createDVForCache(ctx, cl, pvc.Namespace, bootImage, imageHash, kvPlatform, pvc.Name)
```
We can't assume that just because a PVC was marked for deletion, it has actually been deleted.
I think it would be wise to give every cached PVC a unique name so we aren't colliding during garbage collection like this.
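A sketch of the unique-name idea, assuming the chosen name is then recorded (for example in the NodePool status) rather than recomputed; utilrand is k8s.io/apimachinery/pkg/util/rand, corev1 and metav1 are the usual core/meta API packages.

```go
// Append a random suffix so a new cache PVC never collides with a previous one
// that is still terminating; the reconciler would find the PVC by its labels
// (or by the name stored in status) instead of rebuilding the name from scratch.
pvc := &corev1.PersistentVolumeClaim{
	ObjectMeta: metav1.ObjectMeta{
		Name:      fmt.Sprintf("%s%s-%s", bootImagePVCPrefix, clusterName, utilrand.String(5)),
		Namespace: namespace,
		Labels:    lbls,
	},
}
```

Alternatively, metav1.ObjectMeta's GenerateName field would let the API server pick the suffix, at the cost of having to discover the resulting name afterwards.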
go.mod (outdated)
```
github.com/aws/aws-sdk-go v1.44.84
github.com/blang/semver v3.5.1+incompatible
github.com/clarketm/json v1.14.1
github.com/containers/image/v5 v5.23.1
```
What's causing all these go.mod changes, and can we avoid them? Ideally I'd like this PR to not involve any changes to the vendor directory.
Reply: Added to read the container image digest, to be used as the image hash. Removed, since we don't cache container images for now.
Force-pushed 4b893cb to ef44592.
Force-pushed ef44592 to fd2bebd.
/test kubevirt-e2e-kubevirt-aws-ovn
Force-pushed fd2bebd to 31f064a.
/test kubevirt-e2e-kubevirt-aws-ovn
Force-pushed 31f064a to 0082542.
/test kubevirt-e2e-kubevirt-aws-ovn
Force-pushed 0082542 to b69ae02.
rebased
/test e2e-kubevirt-aws-ovn
davidvossel left a comment:
/lgtm
I made two comments that need to be addressed at some point, but don't have to be fixed right away.
- the nodepool cache-strategy CLI arg needs to match the cluster create one
- we need to close the small possibility of the cache not being deleted during NodePool deletion
I know you're looking at transitioning us to the KubeVirt RHCOS image soon, which will touch all these areas again, so we can fix these issues there. This PR has grown so large that I'd like to avoid delaying it any further if possible. As long as the local and external e2e tests pass, I think this is good to go.
cmd/nodepool/kubevirt/create.go (outdated)
```go
cmd.Flags().Uint32Var(&platformOpts.RootVolumeSize, "root-volume-size", platformOpts.RootVolumeSize, "The size of the root volume for machines in the NodePool in Gi")
cmd.Flags().StringVar(&platformOpts.RootVolumeAccessModes, "root-volume-access-modes", platformOpts.RootVolumeAccessModes, "The access modes of the root volume to use for machines in the NodePool (comma-delimited list)")
cmd.Flags().StringVar(&platformOpts.ContainerDiskImage, "containerdisk", platformOpts.ContainerDiskImage, "A reference to docker image with the embedded disk to be used to create the machines")
cmd.Flags().StringVar(&platformOpts.CacheStrategyType, "cache-strategy-type", platformOpts.CacheStrategyType, "Set the boot image caching strategy; Supported values:\n- \"None\": no caching (default).\n- \"PVC\": Cache into a PVC; only for QCOW image; ignored for container images")
```
This CLI arg doesn't match the one for cluster create. I think they should be the same for consistency; for example, the root-volume-size arg is the same between nodepool and cluster create.
s/cache-strategy-type/root-volume-cache-strategy
Reply: Sure, I guess I renamed one and forgot the other. Fixing.
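For reference, the renamed registration would look roughly like this (same flag variable and help text as above, only the name changes to match cluster create):

```go
cmd.Flags().StringVar(&platformOpts.CacheStrategyType, "root-volume-cache-strategy", platformOpts.CacheStrategyType,
	"Set the boot image caching strategy; Supported values:\n- \"None\": no caching (default).\n- \"PVC\": Cache into a PVC; only for QCOW image; ignored for container images")
```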
```go
	ns = nodePool.Status.Platform.KubeVirt.RemoteNamespace
}

if cl := r.KubevirtInfraClients.GetClient(string(nodePool.GetUID())); cl != nil {
```
There's a chance that the cache won't get cleaned up here if we get unlucky and the hypershift-operator either restarts or is updated while a nodepool is being deleted.
What can happen is that the kubevirt infra client cache map might not have an entry for this nodepool if the current invocation of the hypershift-operator was unable to reconcile it before the nodepool was deleted. This is because the kubevirt infra client is only cached during the normal reconcile, not during deletion.
This could be solved by storing a reference to the kubevirt infra secret on the nodepool status, similar to how the cache name is stored.
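A rough sketch of that idea; the type and field names below are hypothetical, and the actual API change would need its own review.

```go
// Hypothetical addition to the KubeVirt section of the NodePool status: remember
// which infra-cluster kubeconfig secret was used, so the deletion path can
// rebuild the infra client even after a hypershift-operator restart, the same
// way the cache name and remote namespace are already recorded there.
type KubevirtNodePoolStatus struct {
	// ...existing fields such as CacheName and RemoteNamespace...

	// CredentialsSecretName references the infra-cluster kubeconfig secret that
	// was used when the cached boot image was created.
	CredentialsSecretName string `json:"credentialsSecretName,omitempty"`
}
```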
@davidvossel - fixed a bug in the new e2e nodepool test + the create nodepool CLI flag name.
davidvossel left a comment:
/lgtm
/test e2e-kubevirt-aws-ovn
1 similar comment
/test e2e-kubevirt-aws-ovn
Create a new PVC (if missing) to pull the boot image, then modify the KubeVirt NodePool template to clone from the new PVC instead of pulling the image on each node.
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
Force-pushed 2ce12df to 3a2d4b0.
/test e2e-kubevirt-aws-ovn
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
@davidvossel, I think it is ready now.
davidvossel left a comment:
/lgtm
/approve
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: davidvossel, nunnatsa.
What this PR does / why we need it:
Create a new PVC (if missing) to pull the boot image, then modify the KubeVirt NodePool template to clone from the new PVC instead of pulling the image on each node.
Signed-off-by: Nahshon Unna-Tsameret nunnatsa@redhat.com