This is a utility for quickly fetching OCI images onto Kubernetes cluster nodes.
It talks directly to the Container Runtime Interface (CRI) API to:
- fetch all images on all nodes in parallel,
- retry pulls with increasingly longer timeouts. This prevents getting stuck on stalled connections to the image registry.
It also optionally collects each pull attempt's duration and result.
- main binary,
- shipped as an OCI image,
- provides three subcommands:
  - `fetch`: runs the actual image pulls via CRI, meant to run as an init container of DaemonSet pods. Requires access to the CRI UNIX domain socket from the host.
  - `sleep`: just sleeps forever, meant to run as the main container of DaemonSet pods.
  - `aggregate-metrics`: runs a gRPC server which collects data points pushed by the `fetch` pods, and makes the data available for download over HTTP. Meant to run as a standalone pod.
- a helper command-line utility for generating `image-prefetcher` manifests,
- separate Go module, with no dependencies outside the Go standard library.
- First, run the `deploy` binary to generate a manifest for an instance of `image-prefetcher`. You can run many instances independently.

  It requires a single positional argument for the name of the instance. This also determines the name of a `ConfigMap` supplying names of images to fetch.

  It also accepts a few optional flags:
  - `--version`: `image-prefetcher` OCI image tag. See list of existing tags. Additionally, a version in the format `vX.Y.Z-N.NNNN-HEX` will be transformed to `sha-HEX`, which makes it easier to test pre-release images based on a version generated by `go mod tidy`.
  - `--namespace`: namespace where the image prefetcher will be deployed (default: `default`). Used for ClusterRoleBinding. Must be specified unless deploying to the `default` namespace.
  - `--k8s-flavor`: depending on the cluster. Currently one of:
    - `vanilla`: a generic Kubernetes distribution without additional restrictions.
    - `ocp`: OpenShift, which requires explicitly granting special privileges.
  - `--secret`: image pull `Secret` name. Required if the images are not pullable anonymously. This image pull secret should be usable for all images fetched by the given instance. If provided, it must be of type `kubernetes.io/dockerconfigjson` and exist in the same namespace.
  - `--collect-metrics`: if the image pull metrics should be collected.

  Example:

      go run github.com/stackrox/image-prefetcher/deploy@v0.3.0 --version v0.3.0 --namespace prefetch-images my-images > manifest.yaml
- Prepare an image list. This should be a plain text file with one image name per line. Lines starting with `#` and blank ones are ignored.

      echo debian:latest >> image-list.txt
      echo quay.io/strimzi/kafka:latest-kafka-3.7.0 >> image-list.txt
- Deploy:

      kubectl create namespace prefetch-images
      kubectl create -n prefetch-images configmap my-images --from-file="images.txt=image-list.txt"
      kubectl apply -f manifest.yaml
- Wait for the pull to complete, with a timeout:

      kubectl rollout -n prefetch-images status daemonset my-images --timeout 5m
- If something goes wrong, look at logs:

      kubectl logs -n prefetch-images daemonset/my-images -c prefetch
- If metrics collection was requested, wait for the endpoint to appear, and fetch them:

      # Namespace used in the steps above.
      ns="prefetch-images"
      attempt=0
      service="service/my-images-metrics"
      # Poll until the service has a load balancer ingress, for up to 60 attempts.
      while [[ -z $(kubectl -n "${ns}" get "${service}" -o jsonpath="{.status.loadBalancer.ingress}" 2>/dev/null) ]]; do
          if [ "$attempt" -lt "60" ]; then
              echo "Waiting for ${service} to obtain endpoint ..."
              ((attempt++))
              sleep 10
          else
              echo "Timeout waiting for ${service} to obtain endpoint!"
              exit 1
          fi
      done
      endpoint="$(kubectl -n "${ns}" get "${service}" -o json | jq -r '.status.loadBalancer.ingress[] | .ip')"
      curl "http://${endpoint}:8080/metrics" | jq

  See the Result message definition for a list of fields.
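Two input rules from the steps above, the image list format and the `--version` transformation, can be sketched in Go as follows. This is an illustration only: `parseImageList`, `imageTag` and the regular expression are hypothetical, the exact pattern that `deploy` accepts for `vX.Y.Z-N.NNNN-HEX` is an assumption, and trimming surrounding whitespace is a choice made here rather than documented behaviour.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// parseImageList returns the image names in text, one per line, skipping
// blank lines and lines starting with "#". Surrounding whitespace is trimmed
// (an assumption of this sketch).
func parseImageList(text string) []string {
	var images []string
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		images = append(images, line)
	}
	return images
}

// preRelease is an assumed pattern for the vX.Y.Z-N.NNNN-HEX form.
var preRelease = regexp.MustCompile(`^v\d+\.\d+\.\d+-\d+\.\d+-([0-9a-f]+)$`)

// imageTag maps a pre-release version to "sha-HEX" and returns any other
// version unchanged.
func imageTag(version string) string {
	if m := preRelease.FindStringSubmatch(version); m != nil {
		return "sha-" + m[1]
	}
	return version
}

func main() {
	list := "# base images\ndebian:latest\n\nquay.io/strimzi/kafka:latest-kafka-3.7.0\n"
	fmt.Println(parseImageList(list))
	fmt.Println(imageTag("v0.3.0"))
	fmt.Println(imageTag("v0.3.1-0.20240102030405-0123456789ab"))
}
```

The second `imageTag` call uses a made-up version string purely to exercise the pattern; it does not refer to a real release.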
The image prefetcher automatically labels nodes to indicate whether all images were successfully prefetched. This allows using label selectors to schedule pods only on nodes where images are available.
For detailed information about label format, usage examples, and RBAC requirements, see docs/labels.md.
You can tweak certain parameters such as timeouts by editing args in the above manifest.
See the fetch command for accepted flags.
This utility was designed for small, ephemeral test clusters, in order to improve reliability and speed of end-to-end tests.
If deployed on larger clusters, it may have a "thundering herd" effect on the OCI registries it pulls from. This is because all images are pulled from all nodes in parallel.
- Pick a tag name, use the usual semver rules. We'll refer to it as `vx.y.z` below
- Draft a new release
  - Enter `vx.y.z` as the name of a new tag to create
  - Click "Create new tag on publish"
  - Keep `master` as target
  - Keep `auto` as previous tag
  - Click "Generate release notes"
  - Optional: edit the release notes as you see fit
- Publish the release
- Make sure the build GitHub Action that gets triggered by the tag runs successfully and pushes images.
  - It is also a good idea to wait for the e2e job to pass before proceeding.
- Create a tag for the `deploy` module
  - This is the tag that `go run github.com/stackrox/image-prefetcher/deploy@vx.y.z` looks for (since its `go.mod` is not in the repository root)
  - Currently, this needs to be done manually since GitHub UI does not seem to allow creation of tags without an associated release. TODO: automate this
  - Check out the tagged commit in your clone
  - `git tag deploy/vx.y.z`
  - `git push --tags`