Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support cross-repository mounting in Copy #580

Closed
Wwwsylvia opened this issue Aug 30, 2023 · 23 comments · Fixed by #631 or #665
Closed

Support cross-repository mounting in Copy #580

Wwwsylvia opened this issue Aug 30, 2023 · 23 comments · Fixed by #631 or #665
Assignees
Labels
enhancement New feature or request perf Performance related issues
Milestone

Comments

@Wwwsylvia
Copy link
Member

We have already have Mounter implemented for Repository, we can consider leveraging it in Copy and CopyGraph for better performance.

@Wwwsylvia Wwwsylvia added the enhancement New feature or request label Aug 30, 2023
@Wwwsylvia Wwwsylvia added this to the v2.4.0 milestone Aug 30, 2023
@Wwwsylvia
Copy link
Member Author

Wwwsylvia commented Aug 31, 2023

We may also want a callback function on mounting.

@yizha1 yizha1 added the perf Performance related issues label Sep 5, 2023
@ktarplee
Copy link
Contributor

ktarplee commented Oct 5, 2023

I just tried to implement mounting outside of ORAS. The idea is to use the PreCopy callback to try and mount the blob first. Turns out the mounting worked great but in ORAS copyNode() calls doCopyNode() which calls Fetch() without calling Exists() again to see if anything changed in the storage. One simple fix would be if copyNode() called Exists() one more time, right after PreCopy() and before doCopyNode(). That is an extra call to Exists() (inefficient) but it gets rid of the ugliness related to Mount logic (e.g., where do you get the source repo from, how do you get the destination registry to check if mounting even makes sense).

options := oras.ExtendedCopyGraphOptions{}

// try mounting before we copy.
if mounter, ok := dest.(registry.Mounter); ok {
	options.PreCopy = func(ctx context.Context, desc ocispec.Descriptor) error {
	if srcReference.Registry != destRegistry {
		 return nil
	}

	if IsManifest(desc.MediaType) {
		// we do not want to try mounting manifests of any type
		return nil
	}

	// Trying mount
	err := mounter.Mount(ctx, desc, srcReference.Repository, nil)
	if err != nil {
		// ignoring mount error
	}
	// Mount succeeded
	return nil
}

oras.ExtendedCopyGraph(ctx, src, dest, root, options)

I have this hack implemented https://github.com/ktarplee/oras-go/tree/580-mount

@ktarplee
Copy link
Contributor

ktarplee commented Oct 8, 2023

@Wwwsylvia Is there anything I can do to help on this?

@Wwwsylvia
Copy link
Member Author

@Wwwsylvia Is there anything I can do to help on this?

Sure, we can discuss the overall design first and assign this to you if you are interested in contributing.

@Wwwsylvia
Copy link
Member Author

Wwwsylvia commented Oct 9, 2023

I just tried to implement mounting outside of ORAS. The idea is to use the PreCopy callback to try and mount the blob first. Turns out the mounting worked great but in ORAS copyNode() calls doCopyNode() which calls Fetch() without calling Exists() again to see if anything changed in the storage. One simple fix would be if copyNode() called Exists() one more time, right after PreCopy() and before doCopyNode(). That is an extra call to Exists() (inefficient) but it gets rid of the ugliness related to Mount logic (e.g., where do you get the source repo from, how do you get the destination registry to check if mounting even makes sense).

options := oras.ExtendedCopyGraphOptions{}

// try mounting before we copy.
if mounter, ok := dest.(registry.Mounter); ok {
	options.PreCopy = func(ctx context.Context, desc ocispec.Descriptor) error {
	if srcReference.Registry == destRegistry {
		 return nil
	}

	if IsManifest(desc.MediaType) {
		// we do not want to try mounting manifests of any type
		return nil
	}

	// Trying mount
	err := mounter.Mount(ctx, desc, srcReference.Repository, nil)
	if err != nil {
		// ignoring mount error
	}
	// Mount succeeded
	return nil
}

oras.ExtendedCopyGraph(ctx, src, dest, root, options)

I have this hack implemented https://github.com/ktarplee/oras-go/tree/580-mount

Do you mean that users can implement mounting outside of oras-go though PreCopy? That works but it requires efforts from every user who wants the mounting feature.
The idea of this feature request was to implement mounting as a built-in feature in oras-go so that users can gain better performance out of box.

As of the implementation, my rough thinking is that we can have something like below,

func mountOrCopyNode(ctx context.Context, src content.ReadOnlyStorage, dst content.Storage, desc ocispec.Descriptor, opts CopyGraphOptions) error {
	mounted := tryMount(ctx, src, dst, desc, opts)
	if !mounted {
		// fallback to copy if unable to mount
		return copyNode(ctx, src, dst, desc, opts)
	}

	if opts.OnMounted != nil {
		return opts.OnMounted(ctx, desc)
	}
	return nil
}

where OnMounted is an optional callback function similar to OnCopySkipped allowing CLI to customize the output. /cc @qweeah who can provide insights about output of ORAS CLI.

oras-go/copy.go

Lines 102 to 104 in 47d028a

// OnCopySkipped will be called when the sub-DAG rooted by the current node
// is skipped.
OnCopySkipped func(ctx context.Context, desc ocispec.Descriptor) error

Note: docker CLI has a special output like this for mounted blobs:
image

@ktarplee
Copy link
Contributor

ktarplee commented Oct 9, 2023

@Wwwsylvia since mounted := tryMount(ctx, src, dst, desc, opts) will ultimately call

Mount(ctx context.Context,
	desc ocispec.Descriptor,
	fromRepo string,
	getContent func() (io.ReadCloser, error),
) error

I have a question. How are you proposing getting fromRepo? Is this another field in the opts (i.e., CopyGraphOptions) -- something like SourceReference registry.Reference? This breaks your encapsulation so that would not be a good option.

Another option is to introduce a new interface that exposes a target's reference (registry and repository is what you actually need). Then type assert within tryMount() and things work out. Repository would need to implement this.
Or better yet implement a MountTo interface similar in spirit to io.WriterTo, something like

type MounterTo interface {
    // MountTo will mount desc using the provided mounter of the destination
    MountTo(ctx context.Context, mounter content.Mounter, desc ocispec.Descriptor) error
} 

The registry.Repository would implement MountTo() so that it could be the source of a mount and implement (as it already does) Mount() so it can be the target of a mount. I guess you still have the issue that before you try to mount you should check to see if the registry is the same. The registry check is really more of an optimization but helps avoid many unnecessary round-trips.

As an aside, the functionality of PreCopy could be improved (at least allowing fancy mounting while copying) if an additional existence check was added after PreCopy but before the actual copy. See the diff here. I'm not sure how this impacts the caching proxy within ORAS.

@qweeah
Copy link
Contributor

qweeah commented Oct 10, 2023

where OnMounted is an optional callback function similar to OnCopySkipped allowing CLI to customize the output. /cc @qweeah who can provide insights about output of ORAS CLI.

My gut feeling is that we should add a new function like doMountNode() and call it in copyNode() before doCopyNode(). Also caller of oras.Copy can setup PostMount callback.

@qweeah
Copy link
Contributor

qweeah commented Oct 10, 2023

One question related to ORAS CLI's user experience does depend on oras-go implementation: should we show Copying ... logs for a blob if it's mounted?

E.g. suppose we have below artifact stored in the source repository

graph TD;
M["Manifest (sha256:d9c5ac6a727e)"] --> C("Config (sha256:44136fa355b3)")
M --> B("Blob (sha256:181210f8f9c7)")
Loading

What should be the proper output if we use oras cp to copy it with mounting support?

Option 1

Copying 181210f8f9c7 blob
Copying 44136fa355b3 application/vnd.oci.empty.v1+json
Mounted 181210f8f9c7 blob
Mounted 44136fa355b3 application/vnd.oci.empty.v1+json
Copying d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied  d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied [registry] localhost:5000/src/repo:test => [registry] localhost:5000/dest/repo:test
Digest: sha256:d9c5ac6a727e800e3d1403ef19fa28bb78b4aae059da4595387077fd7655bf32

Option 2

Mounted 181210f8f9c7 blob
Mounted 44136fa355b3 application/vnd.oci.empty.v1+json
Copying d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied  d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied [registry] localhost:5000/src/repo:test => [registry] localhost:5000/dest/repo:test
Digest: sha256:d9c5ac6a727e800e3d1403ef19fa28bb78b4aae059da4595387077fd7655bf32

If the latter is preferred, then the implementation of oras-go can skip calling PreCopy() for mounted blobs.

@Wwwsylvia
Copy link
Member Author

1.

I have a question. How are you proposing getting fromRepo? Is this another field in the opts (i.e., CopyGraphOptions) -- something like SourceReference registry.Reference? This breaks your encapsulation so that would not be a good option.

Another option is to introduce a new interface that exposes a target's reference (registry and repository is what you actually need). Then type assert within tryMount() and things work out. Repository would need to implement this.

I'm personally leaning towards this option that introduces a new interface to expose the Repository's reference. The interface can be used together with Mounter and other functions that require a registry.Reference, such as:

func AppendRepositoryScope(ctx context.Context, ref registry.Reference, actions ...string) context.Context {
if len(actions) == 0 {
return ctx
}
scope := ScopeRepository(ref.Repository, actions...)
return AppendScopesForHost(ctx, ref.Host(), scope)
}

The problem is, how do we name the function and the interface? For now, I can only think of something like this (🫤):

package registry

type Namer interface {
	Name() Reference
}

2.

Or better yet implement a MountTo interface similar in spirit to io.WriterTo, something like

type MounterTo interface {
    // MountTo will mount desc using the provided mounter of the destination
    MountTo(ctx context.Context, mounter content.Mounter, desc ocispec.Descriptor) error
} 

The registry.Repository would implement MountTo() so that it could be the source of a mount and implement (as it already does) Mount() so it can be the target of a mount. I guess you still have the issue that before you try to mount you should check to see if the registry is the same. The registry check is really more of an optimization but helps avoid many unnecessary round-trips.

Regarding this option, my concern is that having both Mount and MountTo in Repository could be confusing. And we still need a way to check the registry name, as the cross-repository mounting can only work within the same registry.

3.

As an aside, the functionality of PreCopy could be improved (at least allowing fancy mounting while copying) if an additional existence check was added after PreCopy but before the actual copy. See the diff here. I'm not sure how this impacts the caching proxy within ORAS.

I don't think it's necessary to check the node existence after PreCopy, for the following reasons:

  • The check introduces an extra network request to the dst repository for every node to be copied (the caching proxy is for manifests in the src repository)
  • I doubt if users will really implement mounting inside PreCopy

@shizhMSFT What do you think?

@Wwwsylvia
Copy link
Member Author

One question related to ORAS CLI's user experience do depends on oras-go implementation: should we show Copying ... logs for a blob if it's mounted?

E.g. suppose we have below artifact stored in the source repository

graph TD;
M["Manifest (sha256:d9c5ac6a727e)"] --> C("Config (sha256:44136fa355b3)")
M --> B("Blob (sha256:181210f8f9c7)")
Loading

What should be the proper output if we use oras cp to copy it with mounting support?

Option 1

Copying 181210f8f9c7 blob
Copying 44136fa355b3 application/vnd.oci.empty.v1+json
Mounted 181210f8f9c7 blob
Mounted 44136fa355b3 application/vnd.oci.empty.v1+json
Copying d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied  d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied [registry] localhost:5000/src/repo:test => [registry] localhost:5000/dest/repo:test
Digest: sha256:d9c5ac6a727e800e3d1403ef19fa28bb78b4aae059da4595387077fd7655bf32

Option 2

Mounted 181210f8f9c7 blob
Mounted 44136fa355b3 application/vnd.oci.empty.v1+json
Copying d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied  d9c5ac6a727e application/vnd.oci.image.manifest.v1+json
Copied [registry] localhost:5000/src/repo:test => [registry] localhost:5000/dest/repo:test
Digest: sha256:d9c5ac6a727e800e3d1403ef19fa28bb78b4aae059da4595387077fd7655bf32

If the latter is preferred, then the implementation of oras-go can skip calling PreCopy() for mounted blobs.

The option 2 looks concise to me.
IMHO "mounting" is not part of "copying", if "copying" represents "client-side copying".

@shizhMSFT
Copy link
Contributor

shizhMSFT commented Oct 10, 2023

@Wwwsylvia It seems that we are over complicating stuffs. Generally, the client should attempt mounting only if the client has the knowledge of the existence of the layers in other repositories of the same registry instance where this prerequisite is prefect for oras.Copy(). This behaviour can also be observed from docker as docker daemon maintains a metadata database to store that knowledge. The hard part is how to determine if source and destination targets points to the same registry instance. Unfortunately, oras-go does not have that context but the application does. Therefore, exposing an option of AttemptMount in CopyGraphOptions and let the application to set it sounds a better choice.

@qweeah Option 2 LGTM.

@ktarplee
Copy link
Contributor

ktarplee commented Oct 10, 2023

To add to @shizhMSFT good point. Skopeo/podman keep a "blob info cache" (boltdb on disk I believe) that any operation they do updates the blob info cache. They consult the blob info cache every time they try to push a blob to see if they know of a repository on that registry where the blob is already present. It does not need to be the source repository. You can be copying from other.com/a/b to example.com/c/d but mount a blob from example.com/f/e. This does motivate the "AttemptMount" approach. So to be clear the proposal is to add a field to the CopyGraphOptions struct

type CopyGraphOptions struct {
    // existing fields
    
    // AttemptMount attempts to mount the current descriptor.  Can be used to avoid an expensive copy. 
    AttemptMount func(ctx context.Context, desc ocispec.Descriptor) error
}

Then maybe oras-go can provide a simple implementation, something like:

package registry

func AttemptMountFromSource(src *registry.Repository, dst *registry.Repository) func (ctx context.Context, desc ocispec.Descriptor) error {
    return func(ctx context.Context, desc ocispec.Descriptor) error {
	if src.Reference.Registry != dst.Registry {
		 return nil
	}

	if IsManifest(desc.MediaType) {
		// we do not want to try mounting manifests of any type
		return nil
	}

	// Trying mount
	err := dst.Mount(ctx, desc, src.Reference.Repository, nil)
	if err != nil {
		// ignoring mount error
	}
	// Mount succeeded
	return nil
}

Then applications can do more extensive blob tracking and hopefully be able to mount more often.

I think AttemptMount() is not quite the right name because it either mounts or copies. The reason I say this is that dst.Mount() above has the post-condition that it either mounts the data or it copies. That is how Mount() is defined. So really AttemptMount is not possible with the current definition of Mount() (unless we have getContent() return an error and handle it specially). It is just a Copy function. So maybe we add a field called Copy to CopyGraphOptions that allows the user to overwrite the copy behavior by trying a mount first. Now the question comes up as to how the UI (ORAS CLI) will know if a mount occurred vs it being copied.

@Wwwsylvia
Copy link
Member Author

I think AttemptMount() is not quite the right name because it either mounts or copies. The reason I say this is that dst.Mount() above has the post-condition that it either mounts the data or it copies. That is how Mount() is defined. So really AttemptMount is not possible with the current definition of Mount() (unless we have getContent() return an error and handle it specially). It is just a Copy function. So maybe we add a field called Copy to CopyGraphOptions that allows the user to overwrite the copy behavior by trying a mount first. Now the question comes up as to how the UI (ORAS CLI) will know if a mount occurred vs it being copied.

Oh I forgot that Repository.Mount automatically copies the content if it fails to mount, so the caller can't tell if the content is mounted or copied from the return value.
The invocation of getContent indicates that the content is copied, hacking around it would work but may not be a good solution. Need to think about it.

@ktarplee
Copy link
Contributor

ktarplee commented Oct 13, 2023

@Wwwsylvia I suspect you are not interested in changing the Mounter interface to be TryMount(). Given that you probably want no extra API changes besides an addition to the the callback function in the CopyGraphOptions struct we have limited options.

One approach would be to allow getContent() (passed to Mount()) to return a special error like errdef.ErrUnsupported in which case Mount() checks for that error, and short circuits (returns). Mount() then effectively is a TryMount() when getContent = func() { return errdef.Unsupported }. This approach is similar to os.Walk() and the sentinel error os.SkipDir. Note that we cannot use getContent == nil because that is already defined as the condition that causes the Registry.Mount() to perform a simple copy. I think that might have been the wrong design decision, but that cannot be changed at this point without breaking the API.

@Wwwsylvia Wwwsylvia assigned ktarplee and unassigned Wwwsylvia Oct 17, 2023
@Wwwsylvia
Copy link
Member Author

@ktarplee Sorry for replying late, was busy with other stuffs.
The reason why Mount was implemented to do "mount or copy" instead of "try mount or fail" was that it can save a POST request.

Requests involved in "mount or copy" approach:

  1. POST /v2/<name>/blobs/uploads/?mount=<digest>&from=<other_name> (Mount: try mount but fail)
  2. GET /v2/<name>/blobs/<digest> (Mount: get content)
  3. PUT <location>?digest=<digest> (Mount: complete upload)

Requests involved in "try mount or fail" plus "copy" approach:

  1. POST /v2/<name>/blobs/uploads/?mount=<digest>&from=<other_name> (TryMount: try mount but fail)
  2. GET /v2/<name>/blobs/<digest> (copyNode: get content)
  3. POST /v2/<name>/blobs/uploads/ (copyNode: initiate an upload session)
  4. PUT <location>?digest=<digest> (copyNode: complete upload)

Our design should take the following requirements into account:

  1. ORAS CLI needs to print "Mounted xxx" on successful mount
  2. ORAS CLI needs to print "Copying xxx" before copy and "Copied xxx" after copy
  3. Keep the request count minimum

@shizhMSFT and I just discussed offline, and here is a POC we currently have:

package oras

type CopyGraphOptions struct {
	// MountPoint returns the repository name of the mounting source.
	// If nil, no mounting will be attempted.
	MountPoint func(ctx context.Context, desc ocispec.Descriptor) (string, error)
	// OnMounted will be invoked when desc is mounted.
	OnMounted func(ctx context.Context, desc ocispec.Descriptor) error
}

func mountOrCopyNode(ctx context.Context, src content.ReadOnlyStorage, dst content.Storage, desc ocispec.Descriptor, opts CopyGraphOptions) error {
	// copy if mount is not applicable
	if descriptor.IsManifest(desc) {
		return copyNode(ctx, src, dst, desc, opts)
	}
	if opts.MountPoint == nil {
		return copyNode(ctx, src, dst, desc, opts)
	}
	mounter, ok := dst.(registry.Mounter)
	if !ok {
		return copyNode(ctx, src, dst, desc, opts)
	}

	// try mount
	fromRepo, err := opts.MountPoint(ctx, desc)
	if err != nil {
		return err
	}
	var mountFailed bool
	getContent := func() (io.ReadCloser, error) {
		if opts.PreCopy != nil {
			if err := opts.PreCopy(ctx, desc); err != nil {
				return nil, err
			}
		}
		// the invocation of getContent indicates that mounting is failed
		mountFailed = true
		return src.Fetch(ctx, desc)
	}
	if err := mounter.Mount(ctx, desc, fromRepo, getContent); err != nil {
		// ignore mounting error and fallback to copy
		if err := doCopyNode(ctx, src, dst, desc); err != nil {
			return err
		}
	}
	if !mountFailed {
		// successfully mounted
		if opts.OnMounted != nil {
			return opts.OnMounted(ctx, desc)
		}
		return nil
	}

	// the node is copied instead of mounted
	if opts.PostCopy != nil {
		return opts.PostCopy(ctx, desc)
	}
	return nil
}

The idea is that we can leave it to the caller to determine whether to mount and where to mount from.
With MountPoint, applications that maintain a blob metadata DB can do something like this:

func Example() {
	src := NewLocalTarget()
	dst := NewRemoteTarget(ref)
	db := LoadLocalMetadataDB()

	opts := CopyGraphOptions{
		MountPoint: func(ctx context.Context, desc ocispec.Descriptor) (string, error) {
			refs := db.Query(desc)
			refs = refs.Filter(ref.Registry)
			if len(refs) == 0 {
				return "", errdef.ErrNotFound
			}
			return refs[0], nil
		},
	}
	oras.CopyGraph(ctx, src, dst, root, opts)
}

And regular workflow that simply mounts from src to dst can be done with:

func Example() {
	src := NewRemoteTarget(srcRef)
	dst := NewRemoteTarget(dstRef)

	opts := CopyGraphOptions{
		MountPoint: func(ctx context.Context, desc ocispec.Descriptor) (string, error) {
			return srcRef.Repository, nil
		},
	}
	oras.CopyGraph(ctx, src, dst, root, opts)
}

@ktarplee
Copy link
Contributor

ktarplee commented Oct 21, 2023

@Wwwsylvia I think what are proposing does meet your criterion and if I have time I will work on implementing it. However it does make one assumption that I want to call out (not that we should change anything). That is, it is tightly coupled to the OCI distribution mount specification and that seems overly restrictive for a generic oras.CopyGraph function. Specifically MountPoint() returns the repository to mount from and thus assuming the same registry and that the registry mount API will be used. Realizing that oras.CopyGraph() is a generic graph copy algorithm is what makes this slightly concerning.

It is easy to conceive of ways to do a more generalized mount from a registry implementation perspective saving upload bandwidth of the server or client. For example, a registry might provide a non-standard API to allow clients to provide an IPFS address to pull content from. Or if the registry is already backed by IPFS then the operation is a no-op (if the registry trusts that the blob will stay around in IPFS). If the blob is stored in object storage (shared by the registry) then there might be a way to "mount" that via the storage providers API. I slightly less far fetched idea is if a infrastructure provider provides many registries all backed by the same storage. Then mounting between those registries is a zero-copy operation and would benefit from a generalized approach specific to that registry. I realize these are a little far fetched however this design does not allow the necessary flexibility for someone to tie these into ORAS.

In comparison, the hack I proposed at the beginning of this issue does have this property since mounting can be implemented in the PreCopy callback as shown. The only change to oras would be that you export the error errSkipDesc = errors.New("skip descriptor")so a user implementing PreCopy to do a mount can return that error. Just to be clear I am not promoting my initial (hack) approach. I am just drawing out the comparison.

@Wwwsylvia
Copy link
Member Author

@ktarplee Thanks for the insights!

. That is, it is tightly coupled to the OCI distribution mount specification and that seems overly restrictive for a generic oras.CopyGraph function. Specifically MountPoint() returns the repository to mount from and thus assuming the same registry and that the registry mount API will be used. Realizing that oras.CopyGraph() is a generic graph copy algorithm is what makes this slightly concerning.

Your concern makes sense. Indeed MountPoint looks a bit of too registry-specific compared to other generic options for CopyGraph. The goal and also the challenge here is to provide both flexibility and usability, that is, to enable users to do more cool stuffs with less code.
Do you have suggestions on providing a generic interface allowing users to leverage the benefits of mounting without worrying about the implementation details?

In comparison, the hack I proposed at the beginning of this issue does have this property since mounting can be implemented in the PreCopy callback as shown. The only change to oras would be that you export the error errSkipDesc = errors.New("skip descriptor")so a user implementing PreCopy to do a mount can return that error.

As long as we have a use case, exporting ErrSkipDesc to enable flexibility sounds good to me. We previously kept it as errSkipDesc to prevent potential misuse, but it should be ok if we can provide a clear documentation.

cc @shizhMSFT for comments.

@Wwwsylvia
Copy link
Member Author

Hi @ktarplee, if there is no other concern/option, would you like to implement the MountPoint design and also export ErrSkipDesc?

@ktarplee
Copy link
Contributor

ktarplee commented Oct 31, 2023

Hi @ktarplee, if there is no other concern/option, would you like to implement the MountPoint design and also export ErrSkipDesc?

I have already implemented ErrSkipDesc. I am working on a method for CopyGraphOptions called WithMount that will wrap PreCopy (much like FilterAnnotation, etc). I don't see any need to add a registry-centric MountPoint field to CopyGraphOptions but I might change my mind as I implement this. I will try to finish this up today so you can see it.

@ktarplee
Copy link
Contributor

ktarplee commented Oct 31, 2023

@Wwwsylvia I think #631 should be merged regardless of which direction we decide to go for more robust support of mounting. At least users of ORAS can do what they want w.r.t. supporting mounting. Without #631 users cannot support mounting with oras.Copy because even if the mount succeeds the copy code still copies the blob needlessly.

Wwwsylvia pushed a commit that referenced this issue Nov 3, 2023
Export `ErrSkipDesc` as `ErrSkipDesc` allows users of the library to
implement `CopyGraphOptions.PreCopy` to call `Mount()` and then return
`ErrSkipDesc` from `PreCopy` which bypasses the downstream copy
operation.

Closes #580
Signed-off-by: Kyle M. Tarplee <[email protected]>
@Wwwsylvia Wwwsylvia reopened this Nov 3, 2023
@ktarplee
Copy link
Contributor

ktarplee commented Nov 7, 2023

Another optimization that has not been considered yet (nor implemented in #632 is...
For a given target registry, if a cross-repo blob mount is not possible for a given blob then it is likely that the feature is not supported by the registry. In which case all subsequent attempts to mount a blob will fail. Therefore once a cross repo blob mount fails, we need to record that the registry does not support it and not try for subsequent blobs. Can anyone think of a reason why a registry would would be able to mount some blobs but not others given a fixed source and destination?

@ktarplee
Copy link
Contributor

ktarplee commented Nov 8, 2023

Another optimization that has not been considered yet (nor implemented in #632 is... For a given target registry, if a cross-repo blob mount is not possible for a given blob then it is likely that the feature is not supported by the registry. In which case all subsequent attempts to mount a blob will fail. Therefore once a cross repo blob mount fails, we need to record that the registry does not support it and not try for subsequent blobs. Can anyone think of a reason why a registry would would be able to mount some blobs but not others given a fixed source and destination?

@shizhMSFT Correctly pointed out that the same number of requests is made in either case (mount support or mount unsupported), so we do not need to worry about storing the registries's ability to mount. We can safely and efficiently try a mount in every case (since the fallback is not an extra HTTP request but rather one that is needed anyway).

@ktarplee
Copy link
Contributor

ktarplee commented Jan 2, 2024

Since we have oras.SkipNode that can be returned from CopyGraphOptions.PreCopy the consumer of oras-go can implement any crazy mounting optimizations they desire (including non-standard API requests). Therefore the mounting functionality that needs to be added to oras-go only need to handle the following cases (to be the most general case possible, excluding the above extreme case handled separately out of necessity).

  1. Mount all blobs from a single source repo
  2. Mount different blobs from different source repos
  3. Mount blobs from one of many source repos by trying to mount them one at a time. Copying a blob is expensive so if there is a chance that the mount might succeed from at least one of a few possible sources then it is still useful to attempt to do that.

Obviously in all cases, if the mount(s) fail then we need to copy.

shizhMSFT pushed a commit that referenced this issue Jan 10, 2024
Adds MountFrom and OnMounted to CopyGraphOptions.
Allows for trying to mount from multiple repositories.

Closes #580 

I think this is a better approach than my other PR #632

Signed-off-by: Kyle M. Tarplee <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment