-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for uploading files from a directory to S3 #67
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
waahm7
reviewed
Oct 29, 2024
graebm
reviewed
Oct 30, 2024
Co-authored-by: Waqar Ahmed Khan <[email protected]>
This commit addresses #67 (comment) #67 (comment) #67 (comment)
graebm
reviewed
Oct 30, 2024
aajtodd
reviewed
Oct 30, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, great start!
This commit addresses #67 (comment) #67 (comment)
This commit addresses #67 (comment)
This commit addresses #67 (comment)
This commit addresses #67 (comment)
This commit addresses #67 (comment)
This commit addresses #67 (comment)
This commit addresses #67 (comment)
aajtodd
approved these changes
Nov 4, 2024
This commit addresses #67 (comment)
This commit addresses #67 (comment)
This commit responds to #67 (comment)
This commit addresses #67 (comment)
This commit addresses #67 (comment)
This commit addresses #67 (comment)
This was referenced Nov 8, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description of changes
This PR adds support for uploading files from a specified directory to S3. Example usages can be found in the example as well as in integration tests.
This implementation mirrors the pattern used for downloading multiple objects from S3, but in the opposite direction.
Given a
UploadObjectsInput
,UploadObjects::orchestrate
spawns a producer task,list_directory_contents
, and consumer tasks,upload_objects
. The producer task retrieves files from the specified target directory (non-recursively by default) and yields them to the consumer tasks, which upload them to S3 using the existing functionality for uploading a single object. AUploadObjectsHandle
manages these tasksand joining this handle will drive the async tasks.A couple of details. First, for yielding files to upload, we use the walkdir crate, which supports following symbolic links and recursive traversal natively. Since this crate offers a synchronous API, we integrate the blocking crate to handle blocking I/O tasks in a dedicated pool.
If you think this approach is unnecessary, I’m open to either retaining or removing theUsing something likeblocking
crate.spawn_blocking
would require the whole directory traversal task to be a sync function to be passed tospawn_blocking
, which complicates the use of theasync_channel
within the function (that said we leftTODO
to reevaluate the need for theblocking
crate).Second, during directory traversal, I/O errors may occur, such as being unable to read directories/files or failing to construct an
InputStream
. These client-side errors are sent to the consumer task, which records them inFailedUploadTransfer
(in which case theinput
field will beNone
for client-side errors) but we proceed with the rest of the "good" files if we employFailedTransferPolicy::Continue
. Deriving an object key from a relative filename may also fail; however, since this is a user input error, we will fail the upload operation even withFailedTransferPolicy::Continue
.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.