Nydusify copies the image without writing to disk #1636

Open
BruceAko opened this issue Oct 16, 2024 · 0 comments

Background

When nydusify copy copies an image from one repository to another, it currently works in two stages: it first pulls the image and stores it in the local content store, then pushes it from the content store to the remote image repository.

Objective

Copy the image without writing it to disk: do not store image blobs locally (or store only metadata), to reduce I/O overhead.

Research

Docker & Nerdctl

With docker or nerdctl, copying an image takes three commands: nerdctl pull <source-image> to download the image, nerdctl tag <source-image> <target-image> to tag it, and nerdctl push <target-image> to push it to the target repository. Neither tool has a no-write-to-disk implementation.

Nydusify copy with backend

nydusify copy accepts oss or s3 as its source-backend-type parameter. After carefully reading this part of the code, I found that pushing a blob from the source backend to the target registry involves no additional write to disk.

Specifically, once the blobID of the blob to be processed is known, backend.Reader(blobID) gets a Reader from the object store and getPushWriter() gets a Writer for the target ref (e.g. myregistry/repo:tag-nydus-copy). Both are then passed to content.Copy(), which copies the data via copyWithBuffer and so avoids writing to disk.

Implementation

nydusify copy's Push() calls containerd's push() method, whose core is content.Copy():

// Copy copies data with the expected digest from the reader into the
// provided content store writer. This copy commits the writer.
//
// This is useful when the digest and size are known beforehand. When
// the size or digest is unknown, these values may be empty.
//
// Copy is buffered, so no need to wrap reader in buffered io.
func Copy(ctx context.Context, cw Writer, or io.Reader, size int64, expected digest.Digest, opts ...Opt) error {}

The Copy() method copies data from Reader to Writer via copyWithBuffer.

Steps:

  • Modify the pvd.Pull() logic to fetch only the image manifest, without pulling the layers, and collect the blobIDs of all blobs to be copied
  • Modify the pvd.Push() logic to get a Reader for each blob via fetcher.Fetch(ctx context.Context, desc ocispec.Descriptor)
  • Get a Writer for the blob from getPushWriter, which calls pusher.Push(ctx context.Context, desc ocispec.Descriptor)
  • Call content.Copy() to copy from the Reader to the Writer without writing to disk
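The steps above could be wired together roughly as follows. This is a sketch under stated assumptions, not the proposed patch: Descriptor, Writer, Fetcher and Pusher are simplified local stand-ins for the containerd and OCI types of the same names, memRegistry is a hypothetical in-memory registry used only to exercise the flow, and io.Copy stands in for content.Copy().

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
)

// Descriptor is a simplified stand-in for ocispec.Descriptor.
type Descriptor struct {
	Digest string
	Size   int64
}

// Writer is a simplified stand-in for containerd's content.Writer.
type Writer interface {
	io.WriteCloser
	Commit(ctx context.Context, size int64, expected string) error
}

// Fetcher and Pusher mirror the shape of the Fetch and Push calls in the steps.
type Fetcher interface {
	Fetch(ctx context.Context, desc Descriptor) (io.ReadCloser, error)
}

type Pusher interface {
	Push(ctx context.Context, desc Descriptor) (Writer, error)
}

// memRegistry is a hypothetical in-memory registry used only for this sketch.
type memRegistry struct{ blobs map[string][]byte }

func (m *memRegistry) Fetch(_ context.Context, d Descriptor) (io.ReadCloser, error) {
	b, ok := m.blobs[d.Digest]
	if !ok {
		return nil, fmt.Errorf("blob %s not found", d.Digest)
	}
	return io.NopCloser(bytes.NewReader(b)), nil
}

func (m *memRegistry) Push(_ context.Context, d Descriptor) (Writer, error) {
	return &memWriter{reg: m}, nil
}

// memWriter collects bytes in memory; a real pusher would stream them to the
// target registry instead.
type memWriter struct {
	reg *memRegistry
	buf bytes.Buffer
}

func (w *memWriter) Write(p []byte) (int, error) { return w.buf.Write(p) }
func (w *memWriter) Close() error                { return nil }

// Commit publishes the blob only if the streamed size matches.
// This fake does not verify the digest itself.
func (w *memWriter) Commit(_ context.Context, size int64, expected string) error {
	if int64(w.buf.Len()) != size {
		return fmt.Errorf("size mismatch for %s", expected)
	}
	w.reg.blobs[expected] = w.buf.Bytes()
	return nil
}

// copyOne connects the source Reader to the target Writer for a single blob.
func copyOne(ctx context.Context, src Fetcher, dst Pusher, desc Descriptor) error {
	r, err := src.Fetch(ctx, desc)
	if err != nil {
		return err
	}
	defer r.Close()
	w, err := dst.Push(ctx, desc)
	if err != nil {
		return err
	}
	defer w.Close()
	if _, err := io.Copy(w, r); err != nil {
		return err
	}
	return w.Commit(ctx, desc.Size, desc.Digest)
}

// copyLayers streams every listed blob from src to dst; no local content
// store is involved.
func copyLayers(ctx context.Context, src Fetcher, dst Pusher, layers []Descriptor) error {
	for _, desc := range layers {
		if err := copyOne(ctx, src, dst, desc); err != nil {
			return err
		}
	}
	return nil
}

// runCopy is a convenience wrapper for callers without a context of their own.
func runCopy(src, dst *memRegistry, layers []Descriptor) error {
	return copyLayers(context.Background(), src, dst, layers)
}
```

The list of descriptors passed to copyLayers corresponds to the blobIDs gathered from the manifest in the first step. Copying blob by blob keeps the failure mode simple: the first fetch, push or commit error stops the copy and is returned to the caller.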

Documentation

Proposal: Nydusify copies the image without writing to disk
