111 changes: 111 additions & 0 deletions enhancements/sparse-manifest-list.md
@@ -0,0 +1,111 @@
---
title: Sparse Manifest List Support
authors:
- "@jonathankingfc"
reviewers:
- "@dmage"
- "@bcaton"
approvers:
- TBD
- "@oscardoe"
creation-date: 2023-12-21
last-updated: 2024-01-04
status: provisional
---

# Quay Container Registry Sparse Manifest List Support

This enhancement proposes the addition of sparse manifest list support to the Quay container registry. The goal is to enable more efficient storage and retrieval of container images, especially in environments where network bandwidth or storage efficiency is a concern. By implementing sparse manifest lists, Quay will be able to more effectively manage container images composed of multiple layers, optimizing storage and reducing data transfer requirements.

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA

## Open Questions [optional]

> 1. What are the potential impacts on Quay's existing storage backend when implementing sparse manifest lists?
> 2. How will this enhancement affect existing Quay users, and what migration steps are necessary?
> 3. What changes will be required in client software to handle sparse manifest lists? This includes Podman and OC Mirror.

## Summary

The introduction of sparse manifest list support in Quay container registry aims to increase the efficiency of how container images are stored and managed. This enhancement focuses on optimizing the use of storage space and minimizing the data transfer when pulling images from the registry. The feature will be particularly beneficial for users with large image repositories or those operating in bandwidth-constrained environments.

## Motivation

With the growing size of container images and the increasing need for efficiency in container registry operations, there's a clear need for more advanced storage solutions. Sparse manifest lists offer a compelling approach to address these challenges. This enhancement seeks to leverage these benefits to improve Quay's functionality and user experience. It will also help reduce the cost of supporting OCP installations on various architectures in OpenShift.

### Goals

- Implement sparse manifest list support in Quay container registry.
- Optimize storage utilization and data transfer efficiency.
- Ensure backward compatibility and minimal disruption for current users.
- Ensure that existing clients can pull these images, and that clients that do not support this feature are not broken.

### Non-Goals

- Overhaul of Quay's existing storage architecture.
- Disruption of existing workflows for current Quay users.

## Proposal

The proposal includes developing and integrating sparse manifest list support into the Quay container registry. This will involve changes in how Quay stores and handles container images, focusing on layer deduplication and efficient data management.

### User Stories [optional]

#### Story 1

A user with a large repository of container images can significantly reduce their storage footprint by using sparse manifest lists, as common layers across different images are stored only once.

@mtrmac mtrmac Feb 22, 2024

(I know ~nothing about Quay)

I can’t see how sparse manifest lists change anything about this. Removing a per-platform image instance can never add a new layer sharing opportunity.

Is this saying that, independently from accepting sparse manifests lists, the scope of layer sharing is going to be increased?


#### Story 2

In a bandwidth-constrained environment, a user can pull images from Quay more efficiently, as the sparse manifest list allows downloading only the necessary layers, reducing the data transfer volume.

Clients can pull a portion of an image already, e.g. see this comment in openshift/oc#1334 adding sparse manifest support to oc image mirror .... And I agree with Miloslav's comment about "common layers" seeming orthogonal. My understanding is that the user-story for sparse manifest lists is more like:

Many image authors publish manifest-list images with many architecture-specific children, to support their workload on all of those architectures. Some image consumers only run a subset of those architectures locally. Sparse manifests will allow users mirroring a manifest-list image into their local Quay to only push the architectures they need, while retaining the top-level manifest list. This saves the network bandwidth and local-Quay storage costs of mirroring architectures that are not needed locally. And it preserves the digest and signatures on the original manifest-list.

So for:

```console
$ curl -s https://quay.io/v2/openshift-release-dev/ocp-release/manifests/sha256:39aa3985a4ab715f3ea8d983b72745947249322e4fb4dbcf59b4cc749f4e9ae7 | jq -r '.manifests[] | .digest + " " + (.platform | tostring)'
sha256:49821163426f2f2cb5a2b7cb446c35440d6a5c3905397b48b795dd4bc3b5eaf6 {"architecture":"amd64","os":"linux"}
sha256:f00ca1a7bef6176803cd54ad8ae878dd48fa86215dd002b834840f01039de045 {"architecture":"ppc64le","os":"linux"}
sha256:99696da77b6982057442bdba3854ddd574e5aeba6bd1710e138b8b398b22f883 {"architecture":"s390x","os":"linux"}
sha256:a6352c78572180f0e88cbf62f80f7b45074a157d4e3d8ad172e7d77042f06724 {"architecture":"arm64","os":"linux"}
```

The sha256:39aa398... manifest-list would be pushed into the local Quay, along with sha256:4982116.... amd64 and sha256:a6352c78572... arm64. But sha256:f00ca1a7bef61... ppc64le and sha256:99696da77b69... s390x would not be pushed in. As far as Quay-side changes go, that's almost entirely on the what-can-we-push-into-Quay? side and not on the what-can-we-pull-from-Quay? side.
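
To make the push side concrete, here is a minimal Python sketch of how a mirroring tool might split the children of that manifest list into "push" and "skip" sets. The function name `plan_sparse_mirror` and the `wanted_archs` parameter are hypothetical and only for illustration; this is not how `oc image mirror` or Quay actually implement it. The digests and platforms are copied from the `jq` output above.

```python
from typing import Iterable


def plan_sparse_mirror(
    manifest_list: dict, wanted_archs: Iterable[str]
) -> tuple[list[str], list[str]]:
    """Split child digests into (to_push, to_skip) by architecture.

    Hypothetical helper: the top-level manifest list itself is always
    pushed unchanged, so its digest and signatures stay valid.
    """
    wanted = set(wanted_archs)
    to_push: list[str] = []
    to_skip: list[str] = []
    for entry in manifest_list.get("manifests", []):
        arch = entry.get("platform", {}).get("architecture")
        if arch in wanted:
            to_push.append(entry["digest"])
        else:
            to_skip.append(entry["digest"])
    return to_push, to_skip


# Child entries taken from the ocp-release manifest list shown above.
ocp_release = {
    "manifests": [
        {"digest": "sha256:49821163426f2f2cb5a2b7cb446c35440d6a5c3905397b48b795dd4bc3b5eaf6",
         "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:f00ca1a7bef6176803cd54ad8ae878dd48fa86215dd002b834840f01039de045",
         "platform": {"architecture": "ppc64le", "os": "linux"}},
        {"digest": "sha256:99696da77b6982057442bdba3854ddd574e5aeba6bd1710e138b8b398b22f883",
         "platform": {"architecture": "s390x", "os": "linux"}},
        {"digest": "sha256:a6352c78572180f0e88cbf62f80f7b45074a157d4e3d8ad172e7d77042f06724",
         "platform": {"architecture": "arm64", "os": "linux"}},
    ]
}

push, skip = plan_sparse_mirror(ocp_release, ["amd64", "arm64"])
```

With `["amd64", "arm64"]` as the wanted set, `push` holds the amd64 and arm64 digests and `skip` holds the ppc64le and s390x digests, matching the split described above.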

As Miloslav points out, clients who are pulling from Quay and expecting a full image but receiving a sparse manifest will fail to pull the layers ("hey, this manifest list references sha256:f00ca1a7bef61..., but that is 404ing!"). They'd have to sort that out with some kind of knob. And as Miloslav points out, Quay-to-Quay mirroring would also have to handle the source-manifest-list-is-sparse case.
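
The pull-side failure mode can be sketched as follows. This is an illustrative Python sketch only, not code from any real client: `resolve_platform`, `ChildManifestMissing`, and the `child_exists` callback (standing in for an existence check against the registry) are all hypothetical names, and the short digests are placeholders. The point is that a client needs to tell "this platform is not in the list" apart from "this platform is listed, but the registry does not hold it".

```python
from typing import Callable


class ChildManifestMissing(Exception):
    """The manifest list names this child, but the registry lacks it."""


def resolve_platform(
    manifest_list: dict,
    architecture: str,
    os_name: str,
    child_exists: Callable[[str], bool],
) -> str:
    """Return the child digest for a platform, or raise a specific error.

    `child_exists` is a hypothetical callback; a real client would ask
    the registry whether the child manifest is present.
    """
    for entry in manifest_list.get("manifests", []):
        platform = entry.get("platform", {})
        if (platform.get("architecture") == architecture
                and platform.get("os") == os_name):
            digest = entry["digest"]
            if not child_exists(digest):
                # Sparse case: listed, but never mirrored into this registry.
                raise ChildManifestMissing(digest)
            return digest
    raise LookupError(f"no entry for {os_name}/{architecture}")


# Placeholder digests, for illustration only.
sparse_list = {
    "manifests": [
        {"digest": "sha256:aaaa", "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:bbbb", "platform": {"architecture": "s390x", "os": "linux"}},
    ]
}
mirrored = {"sha256:aaaa"}  # only amd64 was pushed into the local registry

resolved = resolve_platform(sparse_list, "amd64", "linux", mirrored.__contains__)
```

A single-platform pull only needs its own child to be present, so the amd64 lookup succeeds here. A copy-everything operation would hit `ChildManifestMissing` for s390x and would need an explicit option to tolerate it.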


### Implementation Details/Notes/Constraints [optional]

Key considerations for the implementation include ensuring that the new feature integrates smoothly with Quay's existing architecture and that it maintains compatibility with existing OCI image specifications.

### Risks and Mitigations

The primary risks involve potential incompatibilities with existing storage backends and client tools. Mitigation strategies include thorough testing, backward compatibility checks, and clear documentation for users about any new requirements or changes in behavior.

## Design Details

### Test Plan

- Unit and integration tests for new features.
- E2E tests simulating real-world usage scenarios.
- Performance benchmarking to measure improvements in storage efficiency and data transfer.

### Graduation Criteria

- Successful implementation and testing in a dev preview environment.
- Positive feedback from initial user testing.
- Documentation and tutorials for users to leverage the new feature.

### API Design

The API endpoints will remain largely unchanged, with modifications primarily in the internal handling of image manifests and layers.
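
As one possible reading of "internal handling", the following Python sketch shows a push-time check on manifest lists with an opt-in for sparse uploads. It assumes a strict registry that otherwise rejects a manifest list whose children are not already stored. The function name `check_manifest_list_push` and the `allow_sparse` flag are hypothetical and do not describe Quay's actual code or configuration.

```python
def check_manifest_list_push(
    manifest_list: dict,
    stored_digests: set[str],
    allow_sparse: bool = False,
) -> list[str]:
    """Return the child digests that are absent from the registry.

    Hypothetical validation step: without `allow_sparse`, any missing
    child causes the push to be rejected; with it, the push is accepted
    and the missing digests are merely reported.
    """
    missing = [
        entry["digest"]
        for entry in manifest_list.get("manifests", [])
        if entry["digest"] not in stored_digests
    ]
    if missing and not allow_sparse:
        raise ValueError(f"manifest list references unknown manifests: {missing}")
    return missing


# Placeholder digests, for illustration only.
incoming = {"manifests": [{"digest": "sha256:aaaa"}, {"digest": "sha256:bbbb"}]}
absent = check_manifest_list_push(incoming, {"sha256:aaaa"}, allow_sparse=True)
```

Keeping the default strict would leave existing push behavior unchanged unless sparse uploads are explicitly enabled, which fits the backward-compatibility goal stated earlier.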

### Upgrade / Downgrade Strategy

A clear migration path will be provided for users to upgrade to the new version of Quay with sparse manifest list support. For downgrade, users will be able to revert to their previous version with standard rollback procedures.

### Version Skew Strategy

Compatibility with different versions of Quay and client tools will be maintained. Any potential issues will be identified and addressed during the testing phase.

## Implementation History

- 2024-01-10: Proposal for adding sparse manifest list support to Quay.

## Drawbacks

The complexity of implementing sparse manifest lists might pose challenges, especially in ensuring compatibility with existing setups.

## Alternatives

An alternative could be to enhance existing image layer compression methods, although this would not