Support cross-repository mounting in Copy #580
Comments
We may also want a callback function on mounting. |
I just tried to implement mounting outside of ORAS. The idea is to use the PreCopy callback to try to mount the blob first. It turns out the mounting worked great, but in ORAS
options := oras.ExtendedCopyGraphOptions{}
// try mounting before we copy.
if mounter, ok := dest.(registry.Mounter); ok {
	options.PreCopy = func(ctx context.Context, desc ocispec.Descriptor) error {
		if srcReference.Registry != destRegistry {
			return nil
		}
		if IsManifest(desc.MediaType) {
			// we do not want to try mounting manifests of any type
			return nil
		}
		// Trying mount
		err := mounter.Mount(ctx, desc, srcReference.Repository, nil)
		if err != nil {
			// ignoring mount error; the blob is copied as usual
			return nil
		}
		// Mount succeeded
		return nil
	}
}
oras.ExtendedCopyGraph(ctx, src, dest, root, options)
I have this hack implemented at https://github.com/ktarplee/oras-go/tree/580-mount |
@Wwwsylvia Is there anything I can do to help on this? |
Sure, we can discuss the overall design first and assign this to you if you are interested in contributing. |
Do you mean that users can implement mounting outside of
As for the implementation, my rough thinking is that we can have something like below:
func mountOrCopyNode(ctx context.Context, src content.ReadOnlyStorage, dst content.Storage, desc ocispec.Descriptor, opts CopyGraphOptions) error {
	mounted := tryMount(ctx, src, dst, desc, opts)
	if !mounted {
		// fallback to copy if unable to mount
		return copyNode(ctx, src, dst, desc, opts)
	}
	if opts.OnMounted != nil {
		return opts.OnMounted(ctx, desc)
	}
	return nil
}
where Lines 102 to 104 in 47d028a
Note: docker CLI has a special output like this for mounted blobs: |
@Wwwsylvia since
Mount(ctx context.Context,
	desc ocispec.Descriptor,
	fromRepo string,
	getContent func() (io.ReadCloser, error),
) error
I have a question. How are you proposing getting
Another option is to introduce a new interface that exposes a target's reference (registry and repository are what you actually need). Then type assert within
type MounterTo interface {
	// MountTo will mount desc using the provided mounter of the destination
	MountTo(ctx context.Context, mounter content.Mounter, desc ocispec.Descriptor) error
}
The registry.Repository would implement
As an aside, the functionality of |
My gut feeling is that we should add a new function like |
One question related to ORAS CLI's user experience does depend on oras-go implementation: should we show
E.g. suppose we have below artifact stored in the source repository
graph TD;
    M["Manifest (sha256:d9c5ac6a727e)"] --> C("Config (sha256:44136fa355b3)")
    M --> B("Blob (sha256:181210f8f9c7)")
What should be the proper output if we use
Option 1
Copying 181210f8f9c7 blob
Copying 44136fa355b3 application/vnd.oci.empty.v1+json
Mounted 181210f8f9c7 blob
Mounted 44136fa355b3 application/vnd.oci.empty.v1+json
Copying d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied [registry] localhost:5000/src/repo:test => [registry] localhost:5000/dest/repo:test
Digest: sha256:d9c5ac6a727e800e3d1403ef19fa28bb78b4aae059da4595387077fd7655bf32
Option 2
Mounted 181210f8f9c7 blob
Mounted 44136fa355b3 application/vnd.oci.empty.v1+json
Copying d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied [registry] localhost:5000/src/repo:test => [registry] localhost:5000/dest/repo:test
Digest: sha256:d9c5ac6a727e800e3d1403ef19fa28bb78b4aae059da4595387077fd7655bf32
If the latter is preferred, then the implementation of oras-go can skip calling |
1.
I'm personally leaning towards this option that introduces a new interface to expose the Repository's reference. The interface can be used together with
oras-go/registry/remote/auth/scope.go Lines 74 to 80 in 6dfbe52
The problem is, how do we name the function and the interface? For now, I can only think of something like this (🫤):
package registry
type Namer interface {
	Name() Reference
}
2.
Regarding this option, my concern is that having both
3.
I don't think it's necessary to check the node existence after
@shizhMSFT What do you think? |
The option 2 looks concise to me. |
@Wwwsylvia It seems that we are overcomplicating things. Generally, the client should attempt mounting only if it knows that the layers exist in other repositories of the same registry instance, where this prerequisite is perfect for
@qweeah Option 2 LGTM. |
To add to @shizhMSFT's good point: Skopeo/podman keep a "blob info cache" (boltdb on disk, I believe) that is updated by every operation they perform. They consult the blob info cache every time they try to push a blob to see if they know of a repository on that registry where the blob is already present. It does not need to be the source repository. You can be copying from
type CopyGraphOptions struct {
	// existing fields
	// AttemptMount attempts to mount the current descriptor. Can be used to avoid an expensive copy.
	AttemptMount func(ctx context.Context, desc ocispec.Descriptor) error
}
Then maybe oras-go can provide a simple implementation, something like:
package registry
func AttemptMountFromSource(src *registry.Repository, dst *registry.Repository) func(ctx context.Context, desc ocispec.Descriptor) error {
	return func(ctx context.Context, desc ocispec.Descriptor) error {
		if src.Reference.Registry != dst.Reference.Registry {
			return nil
		}
		if IsManifest(desc.MediaType) {
			// we do not want to try mounting manifests of any type
			return nil
		}
		// Trying mount
		err := dst.Mount(ctx, desc, src.Reference.Repository, nil)
		if err != nil {
			// ignoring mount error
			return nil
		}
		// Mount succeeded
		return nil
	}
}
Then applications can do more extensive blob tracking and hopefully be able to mount more often. I think |
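The "blob info cache" idea above can be sketched as a small in-memory index from (registry, digest) to the repositories known to hold that blob. This is only an illustration under stated assumptions: `blobCache`, `Record`, and `Candidates` are hypothetical names, not part of oras-go, Skopeo, or podman, and a real cache would be persisted on disk.

```go
package main

import "fmt"

// blobCache is a hypothetical in-memory stand-in for a "blob info cache".
// It maps registry -> digest -> repositories known to contain the blob.
type blobCache struct {
	seen map[string]map[string][]string
}

func newBlobCache() *blobCache {
	return &blobCache{seen: make(map[string]map[string][]string)}
}

// Record notes that repo on registry holds the blob with the given digest.
func (c *blobCache) Record(registry, repo, digest string) {
	byDigest, ok := c.seen[registry]
	if !ok {
		byDigest = make(map[string][]string)
		c.seen[registry] = byDigest
	}
	for _, known := range byDigest[digest] {
		if known == repo {
			return // already recorded
		}
	}
	byDigest[digest] = append(byDigest[digest], repo)
}

// Candidates returns repositories on the same registry that could serve as
// mount sources, skipping the destination repository itself.
func (c *blobCache) Candidates(registry, digest, dstRepo string) []string {
	var out []string
	for _, repo := range c.seen[registry][digest] {
		if repo != dstRepo {
			out = append(out, repo)
		}
	}
	return out
}

func main() {
	cache := newBlobCache()
	// Digests are shortened for readability; they are not real digests.
	cache.Record("localhost:5000", "team/base", "sha256:181210f8f9c7")
	cache.Record("localhost:5000", "dest/repo", "sha256:181210f8f9c7")
	cache.Record("other.example", "team/base", "sha256:181210f8f9c7")

	fmt.Println(cache.Candidates("localhost:5000", "sha256:181210f8f9c7", "dest/repo"))
	fmt.Println(cache.Candidates("localhost:5000", "sha256:44136fa355b3", "dest/repo"))
}
```

A callback such as the proposed `AttemptMount` could consult an index like this to pick a mount source that is not the copy source.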
Oh I forgot that |
@Wwwsylvia I suspect you are not interested in changing the Mounter interface to be One approach would be to allow |
@ktarplee Sorry for replying late, I was busy with other stuff. Requests involved in "mount or copy" approach:
Requests involved in "try mount or fail" plus "copy" approach:
Our design should take the following requirements into account:
@shizhMSFT and I just discussed offline, and here is a POC we currently have:
package oras
type CopyGraphOptions struct {
	// MountPoint returns the repository name of the mounting source.
	// If nil, no mounting will be attempted.
	MountPoint func(ctx context.Context, desc ocispec.Descriptor) (string, error)
	// OnMounted will be invoked when desc is mounted.
	OnMounted func(ctx context.Context, desc ocispec.Descriptor) error
}
func mountOrCopyNode(ctx context.Context, src content.ReadOnlyStorage, dst content.Storage, desc ocispec.Descriptor, opts CopyGraphOptions) error {
	// copy if mount is not applicable
	if descriptor.IsManifest(desc) {
		return copyNode(ctx, src, dst, desc, opts)
	}
	if opts.MountPoint == nil {
		return copyNode(ctx, src, dst, desc, opts)
	}
	mounter, ok := dst.(registry.Mounter)
	if !ok {
		return copyNode(ctx, src, dst, desc, opts)
	}
	// try mount
	fromRepo, err := opts.MountPoint(ctx, desc)
	if err != nil {
		return err
	}
	var mountFailed bool
	getContent := func() (io.ReadCloser, error) {
		if opts.PreCopy != nil {
			if err := opts.PreCopy(ctx, desc); err != nil {
				return nil, err
			}
		}
		// the invocation of getContent indicates that mounting is failed
		mountFailed = true
		return src.Fetch(ctx, desc)
	}
	if err := mounter.Mount(ctx, desc, fromRepo, getContent); err != nil {
		// ignore mounting error and fallback to copy
		if err := doCopyNode(ctx, src, dst, desc); err != nil {
			return err
		}
	}
	if !mountFailed {
		// successfully mounted
		if opts.OnMounted != nil {
			return opts.OnMounted(ctx, desc)
		}
		return nil
	}
	// the node is copied instead of mounted
	if opts.PostCopy != nil {
		return opts.PostCopy(ctx, desc)
	}
	return nil
}
The idea is that we can leave it to the caller to determine whether to mount and where to mount from.
func Example() {
	src := NewLocalTarget()
	dst := NewRemoteTarget(ref)
	db := LoadLocalMetadataDB()
	opts := CopyGraphOptions{
		MountPoint: func(ctx context.Context, desc ocispec.Descriptor) (string, error) {
			refs := db.Query(desc)
			refs = refs.Filter(ref.Registry)
			if len(refs) == 0 {
				return "", errdef.ErrNotFound
			}
			return refs[0], nil
		},
	}
	oras.CopyGraph(ctx, src, dst, root, opts)
}
And regular workflow that simply mounts from src to dst can be done with:
func Example() {
	src := NewRemoteTarget(srcRef)
	dst := NewRemoteTarget(dstRef)
	opts := CopyGraphOptions{
		MountPoint: func(ctx context.Context, desc ocispec.Descriptor) (string, error) {
			return srcRef.Repository, nil
		},
	}
	oras.CopyGraph(ctx, src, dst, root, opts)
}
|
@Wwwsylvia I think what you are proposing does meet your criteria, and if I have time I will work on implementing it. However, it does make one assumption that I want to call out (not that we should change anything). That is, it is tightly coupled to the OCI distribution mount specification, and that seems overly restrictive for a generic
It is easy to conceive of ways to do a more generalized mount from a registry implementation perspective, saving upload bandwidth of the server or client. For example, a registry might provide a non-standard API to allow clients to provide an IPFS address to pull content from. Or if the registry is already backed by IPFS, then the operation is a no-op (if the registry trusts that the blob will stay around in IPFS). If the blob is stored in object storage (shared by the registry), then there might be a way to "mount" that via the storage provider's API. A slightly less far-fetched idea is an infrastructure provider that provides many registries all backed by the same storage. Then mounting between those registries is a zero-copy operation and would benefit from a generalized approach specific to that registry.
I realize these are a little far-fetched; however, this design does not allow the necessary flexibility for someone to tie these into ORAS. In comparison, the hack I proposed at the beginning of this issue does have this property since mounting can be implemented in the |
@ktarplee Thanks for the insights!
Your concern makes sense. Indeed
As long as we have a use case, exporting cc @shizhMSFT for comments. |
Hi @ktarplee, if there is no other concern/option, would you like to implement the |
I have already implemented ErrSkipDesc. I am working on a method for |
@Wwwsylvia I think #631 should be merged regardless of which direction we decide to go for more robust support of mounting. At least users of ORAS can do what they want w.r.t. supporting mounting. Without #631 users cannot support mounting with oras.Copy because even if the mount succeeds the copy code still copies the blob needlessly. |
Export `ErrSkipDesc`, which allows users of the library to implement `CopyGraphOptions.PreCopy` to call `Mount()` and then return `ErrSkipDesc` from `PreCopy`, which bypasses the downstream copy operation. Closes #580 Signed-off-by: Kyle M. Tarplee <[email protected]>
Another optimization that has not been considered yet (nor implemented in #632) is... |
@shizhMSFT correctly pointed out that the same number of requests is made in either case (mount supported or mount unsupported), so we do not need to worry about storing the registry's ability to mount. We can safely and efficiently try a mount in every case (since the fallback is not an extra HTTP request but rather one that is needed anyway). |
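The reason no extra request is needed follows from how the OCI distribution spec defines cross-repository mounting: the mount is a `POST` to the blob upload endpoint with `mount` and `from` query parameters, and a registry that cannot mount answers `202 Accepted` with a regular upload session, which is the same request a plain upload would have started with. A small sketch of building that request URL and reading the status follows; `mountURL` and `interpret` are hypothetical helper names, and the host, repositories, and shortened digest are illustrative only.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// mountURL builds the cross-repository mount endpoint:
// POST /v2/<name>/blobs/uploads/?mount=<digest>&from=<other_name>
func mountURL(registry, dstRepo, digest, fromRepo string) string {
	q := url.Values{}
	q.Set("mount", digest)
	q.Set("from", fromRepo)
	u := url.URL{
		Scheme:   "https",
		Host:     registry,
		Path:     "/v2/" + dstRepo + "/blobs/uploads/",
		RawQuery: q.Encode(),
	}
	return u.String()
}

// interpret maps the response status of the mount request to the next step.
func interpret(status int) string {
	switch status {
	case http.StatusCreated:
		return "mounted"
	case http.StatusAccepted:
		// The registry opened an upload session instead of mounting;
		// the client continues with a normal upload.
		return "upload session opened"
	default:
		return "error"
	}
}

func main() {
	fmt.Println(mountURL("localhost:5000", "dest/repo", "sha256:181210f8f9c7", "src/repo"))
	fmt.Println(interpret(http.StatusCreated))
	fmt.Println(interpret(http.StatusAccepted))
}
```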
Since we have
Obviously in all cases, if the mount(s) fail then we need to copy. |
Adds MountFrom and OnMounted to CopyGraphOptions. Allows for trying to mount from multiple repositories. Closes #580 I think this is a better approach than my other PR #632 Signed-off-by: Kyle M. Tarplee <[email protected]>
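Trying several candidate repositories while uploading at most once can be sketched as follows. This is an assumption-laden illustration, not the PR's code: `mountFunc`, `mountFromAny`, and `errSkipSource` are hypothetical names, and the content callback is simplified to return only an error. The idea is that the content callback refuses to supply content for every candidate except the last, so only the final attempt can fall back to a real upload.

```go
package main

import (
	"errors"
	"fmt"
)

// errSkipSource tells the caller that this candidate could not be mounted
// and that the next candidate should be tried.
var errSkipSource = errors.New("skip source")

// mountFunc returns nil when the blob was mounted from fromRepo; otherwise
// it calls getContent (the upload fallback) and returns its error.
type mountFunc func(fromRepo string, getContent func() error) error

// mountFromAny tries each repository in order and uploads at most once.
func mountFromAny(mount mountFunc, repos []string, upload func() error) (string, error) {
	for i, repo := range repos {
		last := i == len(repos)-1
		var mountFailed bool
		getContent := func() error {
			mountFailed = true
			if !last {
				return errSkipSource // try the next candidate instead
			}
			return upload()
		}
		if err := mount(repo, getContent); err != nil && !errors.Is(err, errSkipSource) {
			return "", err
		}
		if !mountFailed {
			return "mounted from " + repo, nil
		}
	}
	if len(repos) == 0 {
		// No candidates at all: plain copy.
		if err := upload(); err != nil {
			return "", err
		}
	}
	return "copied", nil
}

func main() {
	// A fake registry that can only mount from "team/base".
	mount := func(repo string, getContent func() error) error {
		if repo == "team/base" {
			return nil
		}
		return getContent()
	}
	upload := func() error { return nil }

	fmt.Println(mountFromAny(mount, []string{"a/x", "team/base"}, upload))
	fmt.Println(mountFromAny(mount, []string{"a/x", "b/y"}, upload))
}
```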
We already have Mounter implemented for Repository, so we can consider leveraging it in Copy and CopyGraph for better performance.