Fix S3 gateway cross-repo copies #7468
Conversation
PutObject and friends were confusing the source and destination repositories, which will not do. Fixes #7467.
Reviewers: I am adding a relevant test to Esti, but please review the change for now.
Actually... testing this may not be possible with the current S3 gateway test, which uses the MinIO client, and I am not sure that client supports the multipart copy API. This may take a while; I will open a tech-debt issue if I cannot get it done here today.
@arielshaqed Thanks, the code is great, no comments on the changes. However, can you please add a test scenario for this?
Thanks! There are some issues with testing; I'm still trying to figure out whether I can use the MinIO client's ComposeObject to do this.
@arielshaqed for that specific test you can use an S3 client and limit the test to the s3 blockstore only, though I think MinIO does support multipart upload.
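For context on why source and destination can be confused here: in S3's copy APIs the source is carried in the `x-amz-copy-source` header as `bucket/key`, and through the lakeFS S3 gateway the bucket names the source repository, which may differ from the repository being written to. A minimal sketch of parsing that header format (illustrative only; this is the documented S3 header shape, not lakeFS's internal code):

```go
package main

import (
	"fmt"
	"strings"
)

// parseCopySource splits an x-amz-copy-source value ("bucket/key", with an
// optional leading slash) into its bucket and key. The bucket identifies the
// SOURCE repository, which a cross-repo copy must keep distinct from the
// destination repository being written to.
func parseCopySource(copySource string) (bucket, key string, err error) {
	s := strings.TrimPrefix(copySource, "/")
	bucket, key, ok := strings.Cut(s, "/")
	if !ok || bucket == "" || key == "" {
		return "", "", fmt.Errorf("malformed copy source %q", copySource)
	}
	return bucket, key, nil
}

func main() {
	bucket, key, err := parseCopySource("/src-repo/main/path/to/obj")
	if err != nil {
		panic(err)
	}
	fmt.Println(bucket, key) // src-repo main/path/to/obj
}
```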
Force-pushed from 37b3281 to 1dcdf87
We might only have a problem with sigV2, which is considerably less important by now.
This should make MPU copies that are signed using sigV2 work with the S3 gateway.
Otherwise infinite loop on upload >:-(
Exercise another code path, don't triple data size, and keep the originally intended logic.
Repositories are not always deleted, so if we don't do this we get a conflict.
@guy-har maybe you know? I managed to get Esti to pass, except for...
Force-pushed from df0ada8 to 2d1fdf6
copyPartRange should stage a block on the destination. This matters when the source and destination containers are different.
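The bug class the commit above describes can be shown with a toy model: if the staged block goes to a client derived from the source container, everything works while source and destination happen to be the same container, and breaks only on a cross-container copy. A self-contained sketch (the in-memory `container` type is invented for illustration; the real code talks to a block adapter):

```go
package main

import "fmt"

// container is a toy stand-in for a block-store container: block ID -> data.
type container struct {
	name   string
	blocks map[string][]byte
}

func (c *container) stageBlock(id string, data []byte) {
	c.blocks[id] = data
}

// copyPartRange copies a byte range of src's object into a staged block on
// dst. The point of the fix: the block must be staged on the DESTINATION
// container, which only becomes observable once src and dst differ.
func copyPartRange(src, dst *container, srcID, blockID string, from, to int) {
	data := src.blocks[srcID][from:to]
	dst.stageBlock(blockID, data)
}

func main() {
	src := &container{name: "src", blocks: map[string][]byte{"obj": []byte("0123456789")}}
	dst := &container{name: "dst", blocks: map[string][]byte{}}
	copyPartRange(src, dst, "obj", "block-1", 2, 6)
	fmt.Printf("%s\n", dst.blocks["block-1"]) // 2345
}
```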
Force-pushed from 2d1fdf6 to 02411e1
@N-o-Z I added an integration test to Esti. That was a great call because it uncovered an issue with the Azure block adapter! So I fixed that one too. Please re-re-review.
@guy-har I'd love to have your review of everything, especially of the changes to that Azure block adapter :-)
Please note that sigV2 still doesn't work; I opened #7472 for that. But we should pull anyway: nobody should use sigV2 (it's really hard to do nowadays), this change solves the original problem for the Boto3/Python users who are currently complaining, and lakeFS with this code is better than without it!
Looks Great!
Thanks!
esti/s3_gateway_test.go
Outdated
@@ -328,6 +328,89 @@ func TestS3HeadBucket(t *testing.T) {
	})
}

func TestS3CopyObjectMultipart(t *testing.T) {
	const minDataContentLengthForMultipart = 5 << 20
nit: it's a bit confusing that minDataContentLengthForMultipart and largeDataContentLength exist in different files and are only used here
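For readers wondering where the `5 << 20` in the test constant comes from: S3 requires every part of a multipart upload except the last to be at least 5 MiB, so the test data must be at least that large to force a genuine multipart copy. A quick check of the arithmetic:

```go
package main

import "fmt"

func main() {
	// S3's minimum part size for all parts except the last is 5 MiB.
	// 5 << 20 shifts 5 left by 20 bits, i.e. 5 * 2^20 = 5 * 1048576.
	const minDataContentLengthForMultipart = 5 << 20
	fmt.Println(minDataContentLengthForMultipart) // 5242880
}
```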
Thanks for adding the test!!
Some optional suggestion
@@ -328,6 +328,88 @@ func TestS3HeadBucket(t *testing.T) {
	})
}

func TestS3CopyObjectMultipart(t *testing.T) {
🎉
The only thing tests are good for is finding bugs. And fixing them. And preventing regressions. Automatically. Not sure it's worth the effort.
Although... in this case this test actually found bugs. And helped fix one of them. And another test in the file prevented a regression. Automatically. Actually, the case against writing tests has never been weaker 🥳.
blobSizesURL := destinationContainer.NewBlockBlobClient(destinationObjName + sizeSuffix)
_, err = blobSizesURL.StageBlock(ctx, base64Etag, streaming.NopCloser(strings.NewReader(sizeData)), nil)
sizeData := fmt.Sprintf("%d\n", count)
blobSizesBlob := destinationContainer.NewBlockBlobClient(destinationKey.BlobURL + sizeSuffix)
Consider moving sizeSuffix and idSuffix to this file
Not done in this PR.
pkg/testutil/random.go
Outdated
}

// RandomReader returns a reader that will return size bytes from rand.
func RandomReader(rand *rand.Rand, size int64) io.Reader {
Suggested change:
- func RandomReader(rand *rand.Rand, size int64) io.Reader {
+ func NewRandomReader(rand *rand.Rand, size int64) io.Reader {
?
Nice one, yes please!
Because it returns a new random reader, and also that matches the Go naming conventions. :-)
Thanks!
Pulling once all tests pass.
license/cla check is stuck. But I work here; I literally signed on paper to grant the project my rights for work on lakeFS. Pulling by admin force.