-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
download objects operation #45
Conversation
let total_bytes_transferred = self | ||
.ctx | ||
.state | ||
.total_bytes_transferred | ||
.load(Ordering::SeqCst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider making an accessor for this.
let root_dir = ctx.state.input.destination().expect("destination set"); | ||
let key = input.key.as_ref().expect("key set"); | ||
let prefix = ctx.state.input.key_prefix(); | ||
let delim = ctx.state.input.delimiter(); | ||
|
||
let key_path = local_key_path(root_dir, key.as_str(), prefix, delim)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're grabbing all these fields just to call local_key_path
, could you make this a method on cty.state
or something? It would shorten this method.
aws-s3-transfer-manager/src/operation/download_objects/worker.rs
Outdated
Show resolved
Hide resolved
fn strip_key_prefix<'a>(key: &'a str, prefix: Option<&str>, delimiter: Option<&str>) -> &'a str { | ||
let prefix = prefix.unwrap_or(""); | ||
let delim = delimiter.unwrap_or(DEFAULT_DELIMITER); | ||
|
||
if key.is_empty() || prefix.is_empty() || !key.starts_with(prefix) || !key.contains(delim) { | ||
return key; | ||
} | ||
|
||
let stripped = &key[prefix.len()..]; | ||
|
||
if prefix.ends_with(delim) || !stripped.starts_with(delim) { | ||
return stripped; | ||
} | ||
|
||
&stripped[1..] | ||
} | ||
|
||
/// Replace `delimiter` in `key` if it does not match the `path_separator` | ||
fn replace_delim<'a>(key: &'a str, delimiter: Option<&str>, path_separator: &str) -> Cow<'a, str> { | ||
match delimiter { | ||
Some(delim) if delim != path_separator => { | ||
let replaced = key.replace(delim, path_separator); | ||
Cow::Owned(replaced) | ||
} | ||
_ => Cow::Borrowed(key), | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be nice to fuzz these to make sure we handle weird characters as expected.
aws-s3-transfer-manager/src/operation/download_objects/worker.rs
Outdated
Show resolved
Hide resolved
let prefix = prefix.unwrap_or(""); | ||
let delim = delimiter.unwrap_or(DEFAULT_DELIMITER); | ||
|
||
if key.is_empty() || prefix.is_empty() || !key.starts_with(prefix) || !key.contains(delim) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 things:
!key.starts_with(prefix)
: wouldn't it be a bug if this were true? should we panic/err if that happens, vs trying to handle it?!key.contains(delim)
: I tested with the java transfer manager and if key="photo-1" and prefix="photo-" then actual="1". I don't necessarily agree that it should work this way, but that's how it works for java.- It feels weird to me that the transfer manager SEP forces a delimiter on the end of an upload prefix, but doesn't force a delimiter at the end of a download prefix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Probably yes but that is what Java does
- Yes and I believe this is covered by unit tests. I can beef these up further if you think we need additional tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect the following behavior, you should probably add tests
key="many///delims-in-a-row" prefix="" delim="/" -> "many/delims-in-a-row"
key="//delims-at-start" prefix="" delim="/" -> "delims-at-start"
key="prefix///then-delims" prefix="prefix" delim="/" -> "then-delims"
key="prefix///then-delims" prefix="prefix/" delim="/" -> "then-delims"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those last 3 are not valid in the Java S3 TM:
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Cannot download key //delims-at-start, its relative path resolves outside the parent directory.
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Cannot download key prefix1///then-delims, its relative path resolves outside the parent directory.
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Cannot download key prefix1///then-delims, its relative path resolves outside the parent directory.
The logic implies that if the relative path after stripping the prefix is an absolute path then it's treated as outside of the destination directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not saying the Java S3 TM is perfect. I'm saying the SEP doesn't say what to do about multiple delimiters in a row, so we get inconsistent behavior like:
"prefix/then/later//double-delims" -> "then/laser/double-delims"
OK, after removing prefix and doing path_clean
"prefix//double-delims" -> "/double-delims"
BAD, after removing prefix it looks like an absolute path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with a TODO about the SEP needing more guidance on edge cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix (prefix/delim validation) and ship
.set_bucket(input.bucket.to_owned()) | ||
.set_prefix(prefix.clone()) | ||
.set_continuation_token(next_token.clone()) | ||
.set_delimiter(input.delimiter.to_owned()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in upload_objects I added a recursive
bool, defaulting to false, per the SEP. Java TM doesn't have this option, and neither does download_objects in this PR. But if download_objects DID support it, we'd probably want to pass delimiter here and then NOT recurse ?
anyway, I'm adding a TODO about this inconsistency in the upload_objects scaffolding PR. We'll figure it out eventually
Issue #, if available:
closes: #8
Description of changes:
Adds support for the
download_objects
operation which recursively downloads a bucket (or bucket prefix) to a local directory.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.