
nydus: support host-sharing#131

Merged
csegarragonz merged 34 commits into main from
nydus-host
Feb 11, 2025

@csegarragonz csegarragonz commented Jan 20, 2025

The main issue to support host-sharing is the co-existence of guest-pulling and host-sharing versions of the nydus snapshotter, as well as other non-nydus snapshotters.

Before starting a pod sandbox, containerd checks whether the pause image is present on the machine. If it is not, it will "pull" it with the help of the snapshotter. This "pull" is crucial, as it is the point at which the nydus snapshotter exports the right Kata virtual volume to handle the image:

  • For guest-pull it will generate a GuestImagePull virtual volume that will make the Kata Agent try to pull_image inside the guest (for the pause image this is just unpacking the bundle in the initrd).
  • For host-share it will convert the OCI layers to tarfs layers, and generate virtual volumes that indicate the blobs to mount into the guest.

Consequently, if we skip the "pull" that happens during ensureImageExists, we never generate these virtual volumes, and execution fails. To make sure we pull when we need to, containerd needs to keep a per-snapshotter map of the images it has already pulled. Here's the catch: guest-pull and host-share are technically the same snapshotter. (This also applies to the variations of the host-share mode: image_block, layer_block, and each one with _verity.)

The solution we will adopt is to install host-sharing as a "different" snapshotter (which we implement here) together with a patch in containerd that keeps track of what images have been pulled on a per-snapshotter basis (not on a global basis).
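To illustrate the difference between per-snapshotter and global tracking, here is a minimal Python sketch. It is not the containerd patch (containerd is written in Go); the class, the method names, the snapshotter names and the image reference are all hypothetical and only show the bookkeeping idea.

```python
from collections import defaultdict


class PulledImageTracker:
    """Hypothetical bookkeeping of which images each snapshotter has pulled."""

    def __init__(self):
        # snapshotter name -> set of image references pulled through it
        self._pulled = defaultdict(set)

    def mark_pulled(self, snapshotter, image):
        """Record that `image` was pulled through `snapshotter`."""
        self._pulled[snapshotter].add(image)

    def needs_pull(self, snapshotter, image):
        """Return True unless this exact snapshotter already pulled `image`.

        A pull made through a different snapshotter does not count, because
        that pull generated virtual volumes for the other mode only.
        """
        return image not in self._pulled.get(snapshotter, set())


# Snapshotter names and image reference below are placeholders.
tracker = PulledImageTracker()
tracker.mark_pulled("nydus-guest-pull", "example.com/pause:latest")
print(tracker.needs_pull("nydus-guest-pull", "example.com/pause:latest"))
print(tracker.needs_pull("nydus-host-share", "example.com/pause:latest"))
```

With a single global set, the second query would report the image as already present, and the host-share virtual volumes would never be generated.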

On top of that, we need to do some work on Kata, but most of it is already here: kata-containers/kata-containers#7837. The only notable additions are to manually start the udev daemon, as we are still using the Kata Agent as /init, and to check that the (host-)mounted dm-verity hashes actually correspond to the layer digests. The latter will wait until we re-introduce image signature validation and attestation, as we need to get the ground truth from somewhere.

Another issue we ran into while testing is that, if two layers have the same digest, the tarfs module in the nydus-snapshotter will sometimes trigger an error due to a race condition.

Once this PR is merged, it should be possible to switch between both snapshotters, without restarting them, by using:

inv nydus-snapshotter.set-mode [guest-pull|host-share]

The host-share mode uses layer_block_with_verity. With guest-pull mode, however, there is an open issue in the upstream repo regarding snapshotter restarts: containerd/nydus-snapshotter#631. This means that, in general, when changing the snapshotter mode it is safer to purge all snapshots:

inv nydus-snapshotter.purge

Purging also proved tricky, as it is not enough to remove the contents of /var/lib/containerd-nydus*. containerd keeps track of snapshot metadata in its metadata DB in /var/lib/containerd/io.containerd.metadata.v1.bolt/meta.db. We cannot easily delete elements from that DB, so after removing the snapshots manually, we wait for the GC to remove the elements from the DB.
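The manual removal step could look roughly like the Python sketch below. This is not the code behind `inv nydus-snapshotter.purge`: the function name and its parameters are hypothetical, and it assumes nothing under the matched directories is still mounted or in use. It leaves containerd's meta.db untouched, so the stale metadata entries still have to be collected by the GC.

```python
import shutil
from pathlib import Path


def purge_snapshot_dirs(parent="/var/lib", pattern="containerd-nydus*"):
    """Empty every directory under `parent` whose name matches `pattern`.

    Hypothetical helper: returns the list of removed paths and keeps the
    matched directories themselves in place.
    """
    removed = []
    for snap_dir in sorted(Path(parent).glob(pattern)):
        if not snap_dir.is_dir():
            continue
        for child in sorted(snap_dir.iterdir()):
            if child.is_dir() and not child.is_symlink():
                # Real directory: remove it and everything below it
                shutil.rmtree(child)
            else:
                # Regular file or symlink: remove just the entry
                child.unlink()
            removed.append(str(child))
    return removed
```

Calling `purge_snapshot_dirs()` with the defaults targets /var/lib/containerd-nydus* and would need root privileges; passing a different `parent` makes it possible to try the function on a scratch directory first.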

@csegarragonz csegarragonz marked this pull request as ready for review February 3, 2025 12:25
@csegarragonz csegarragonz merged commit 1bca71a into main Feb 11, 2025
3 checks passed
@csegarragonz csegarragonz deleted the nydus-host branch February 11, 2025 16:42