enhancement: Sparse Manifest Lists #29

jonathankingfc · 2024-01-30T18:00:46Z

Enhancement proposal for sparse manifest lists

enhancements/sparse-manifest-list.md

sherine-k

Hi @jonathankingfc
This enhancement is pretty interesting, as it is also one of the recurring requests from clients on disconnected clusters (who use oc-mirror). So thank you!

So far, I found only this OCI spec change that relates to sparse manifest support.

Can you share any specifics about what a push or pull of a sparse manifest will look like (at the HTTP level, for example)?

I'm not sure how all the clients implement this, but I'm interested to know whether what this enhancement proposes would be in line with what skopeo intends to do for sparse manifests: allowing the client to copy (push) only the index of the image, without the underlying manifests.

Although this is far from fulfilling all the needs of disconnected-cluster users, it means that with containers/image (the base of oc-mirror and skopeo), one can pull/push any of:

1. the underlying manifest that corresponds to the current arch/os
2. the whole index
3. only the index, which, combined with the first option, makes a sparse manifest of a single arch

cc @mtrmac

mtrmac · 2024-02-22T17:46:33Z

enhancements/sparse-manifest-list.md

+
+#### Story 1
+
+A user with a large repository of container images can significantly reduce their storage footprint by using sparse manifest lists, as common layers across different images are stored only once.


(I know ~nothing about Quay)

I can’t see how sparse manifest lists change anything about this. Removing a per-platform image instance can never add a new layer sharing opportunity.

Is this saying that, independently of accepting sparse manifest lists, the scope of layer sharing is going to be increased?

mtrmac · 2024-02-22T17:58:43Z

Current c/image status:

1. When pulling an image to local storage, only the one chosen platform’s image instance must be present.
2. When writing a multi-platform image, the caller can choose which per-platform image instances to skip (without removing the existing manifest entries); that creates a sparse image.
3. A feature to strip the per-platform image instances also from the manifest (creating a non-sparse image with fewer platforms) is desired but does not yet exist.
4. Perhaps relevant to Quay, reading a multi-platform image when trying to make a copy (skopeo copy --all) will fail on sparse images, unless the caller specifically and manually uses the “skip some per-platform instances” option mentioned in item 2 above. IIRC skopeo copy is used for Quay’s mirroring functionality, so that might need changing (while still requiring an opt-in flag?), to allow exact mirroring of sparse multi-platform images.
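To make items 2 and 4 concrete, here is a toy model of the behavior described above. It is not c/image code: `copy_all_instances`, `MissingInstanceError`, and the `skip_digests` parameter are hypothetical names, and the source registry is modeled as a plain set of digests that are actually present.

```python
class MissingInstanceError(Exception):
    """Raised when a manifest list references an instance the source lacks."""


def copy_all_instances(index, source_store, skip_digests=frozenset()):
    """Return the digests that would be copied for a copy-everything request.

    index         -- parsed manifest list (dict with a "manifests" array)
    source_store  -- set of digests actually present at the source (toy model)
    skip_digests  -- instances the caller explicitly opted out of copying
    """
    copied = []
    for entry in index["manifests"]:
        digest = entry["digest"]
        if digest in skip_digests:
            # The manifest-list entry stays in place; only the instance is
            # skipped, so the destination ends up sparse.
            continue
        if digest not in source_store:
            # Without an opt-in, a sparse source is treated as an error.
            raise MissingInstanceError(digest)
        copied.append(digest)
    return copied
```

In this model, copying a sparse source succeeds only when the absent digests are passed in `skip_digests`, which mirrors the opt-in requirement discussed in item 4.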

wking · 2024-04-22T23:11:41Z

enhancements/sparse-manifest-list.md

+
+#### Story 2
+
+In a bandwidth-constrained environment, a user can pull images from Quay more efficiently, as the sparse manifest list allows downloading only the necessary layers, reducing the data transfer volume.


Clients can pull a portion of an image already, e.g. see this comment in openshift/oc#1334 adding sparse manifest support to oc image mirror .... And I agree with Miloslav's comment about "common layers" seeming orthogonal. My understanding is that the user-story for sparse manifest lists is more like:

Many image authors publish manifest-list images with many architecture-specific children, to support their workload on all of those architectures. Some image consumers only run a subset of those architectures locally. Sparse manifests will allow users mirroring a manifest-list image into their local Quay to only push the architectures they need, while retaining the top-level manifest list. This saves the network bandwidth and local-Quay storage costs of mirroring architectures that are not needed locally. And it preserves the digest and signatures on the original manifest-list.

So for:

$ curl -s https://quay.io/v2/openshift-release-dev/ocp-release/manifests/sha256:39aa3985a4ab715f3ea8d983b72745947249322e4fb4dbcf59b4cc749f4e9ae7 | jq -r '.manifests[] | .digest + " " + (.platform | tostring)'
sha256:49821163426f2f2cb5a2b7cb446c35440d6a5c3905397b48b795dd4bc3b5eaf6 {"architecture":"amd64","os":"linux"}
sha256:f00ca1a7bef6176803cd54ad8ae878dd48fa86215dd002b834840f01039de045 {"architecture":"ppc64le","os":"linux"}
sha256:99696da77b6982057442bdba3854ddd574e5aeba6bd1710e138b8b398b22f883 {"architecture":"s390x","os":"linux"}
sha256:a6352c78572180f0e88cbf62f80f7b45074a157d4e3d8ad172e7d77042f06724 {"architecture":"arm64","os":"linux"}

The sha256:39aa398... manifest-list would be pushed into the local Quay, along with sha256:4982116... (amd64) and sha256:a6352c78572... (arm64). But sha256:f00ca1a7bef61... (ppc64le) and sha256:99696da77b69... (s390x) would not be pushed in. As far as Quay-side changes go, that's almost entirely on the what-can-we-push-into-Quay? side and not on the what-can-we-pull-from-Quay? side.
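The split described above can be sketched as a small planning step. This is only an illustration: `plan_sparse_push` is a hypothetical helper, not part of any existing client, and the entries are the four children listed in the jq output above.

```python
# Children of the sha256:39aa398... manifest list, as shown in the jq output.
ENTRIES = [
    {"digest": "sha256:49821163426f2f2cb5a2b7cb446c35440d6a5c3905397b48b795dd4bc3b5eaf6",
     "platform": {"architecture": "amd64", "os": "linux"}},
    {"digest": "sha256:f00ca1a7bef6176803cd54ad8ae878dd48fa86215dd002b834840f01039de045",
     "platform": {"architecture": "ppc64le", "os": "linux"}},
    {"digest": "sha256:99696da77b6982057442bdba3854ddd574e5aeba6bd1710e138b8b398b22f883",
     "platform": {"architecture": "s390x", "os": "linux"}},
    {"digest": "sha256:a6352c78572180f0e88cbf62f80f7b45074a157d4e3d8ad172e7d77042f06724",
     "platform": {"architecture": "arm64", "os": "linux"}},
]


def plan_sparse_push(entries, wanted_architectures):
    """Split manifest-list children into (digests to push, digests to skip).

    The manifest list itself is always pushed unmodified, so its digest and
    signatures are preserved; only the children are filtered.
    """
    push, skip = [], []
    for entry in entries:
        if entry["platform"]["architecture"] in wanted_architectures:
            push.append(entry["digest"])
        else:
            skip.append(entry["digest"])
    return push, skip


# A mirror that only runs amd64 and arm64 nodes locally.
to_push, to_skip = plan_sparse_push(ENTRIES, {"amd64", "arm64"})
```

Here `to_push` holds the amd64 and arm64 children and `to_skip` holds ppc64le and s390x, which the local Quay would have to tolerate as dangling references.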

As Miloslav points out, clients who are pulling from Quay and expecting a full image but receiving a sparse manifest list will fail to pull the missing children ("hey, this manifest list references sha256:f00ca1a7bef61..., but that is 404ing!"). They'd have to sort that out with some kind of knob. And as Miloslav also points out, Quay-to-Quay mirroring would have to handle the source-manifest-list-is-sparse case.
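One way a client could notice the sparse case up front is to probe each child before copying. The sketch below assumes the standard distribution-API manifest endpoint (`HEAD /v2/<name>/manifests/<digest>`) and anonymous access; `manifest_exists` and `missing_children` are hypothetical helper names, and token authentication is deliberately left out.

```python
import urllib.error
import urllib.request

ACCEPT = ", ".join([
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.docker.distribution.manifest.v2+json",
])


def manifest_exists(registry, repository, digest):
    """HEAD a manifest by digest: True on 200, False on 404 (no auth handling)."""
    url = f"https://{registry}/v2/{repository}/manifests/{digest}"
    request = urllib.request.Request(url, method="HEAD", headers={"Accept": ACCEPT})
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # other failures (401, 5xx, ...) are not evidence of sparseness


def missing_children(index, exists):
    """Return digests referenced by the manifest list that `exists` reports absent."""
    return [e["digest"] for e in index["manifests"] if not exists(e["digest"])]
```

A caller could pass `lambda d: manifest_exists("quay.io", "openshift-release-dev/ocp-release", d)` as `exists`; a non-empty result means the list is sparse at that registry, and the client can then decide whether to fail or to skip those instances.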

jonathankingfc changed the title ~~enhancement: Sparse manifest lists~~ enhancement: Sparse Manifest Jan 30, 2024

jonathankingfc changed the title ~~enhancement: Sparse Manifest~~ enhancement: Sparse Manifest Lists Jan 30, 2024

syed reviewed Jan 30, 2024

prb112 reviewed Feb 21, 2024

sherine-k reviewed Feb 21, 2024

enhancement: Sparse manifest lists

20ac519

jonathankingfc force-pushed the sparse_manifest_list branch from 3b23588 to 20ac519 on February 22, 2024 17:17

mtrmac reviewed Feb 22, 2024

wking reviewed Apr 22, 2024
