Expose Download Metadata for Every Chunk #73
Conversation
 }

 impl DownloadHandle {
     /// Object metadata
-    pub fn object_meta(&self) -> &ObjectMetadata {
-        &self.object_meta
+    pub async fn object_meta(&self) -> Result<&ObjectMetadata, error::Error> {
My gut tells me we'd be better off with an API that mirrors what we're doing under the hood, and we can optionally wrap that with a "nice" API that hides certain details we don't think basic users need.
So: our API has an iterator that iterates over each request/response in the operation, whether or not it has a body.
- So if we did a HeadObject, the first item is HeadObject.
- If Content-Length was 0, there will be no more items
- Otherwise, there will be 1+ GetObject items afterwards.
- If we do discovery via GetObject, then there will simply be 1+ GetObject items.
- It is what it is
We can wrap that in a "nice" API where there are different getters for "Object Metadata" vs body chunks, and in that wrapper we can deal with the complexity of trying to differentiate the discovery request from the first-chunk request, and whether or not they're actually the same request.
Just imagine the headache our advanced users are going to have trying to "unwrap" the complexity we bake into the base API. If they're logging metrics from the stream of HTTP calls, they'd need different branches for discover vs body chunks, and they'd need to figure out whether the 1st chunk request was actually the same as the discovery request.
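For concreteness, a rough sketch of what that raw iterator shape could look like; every name here is hypothetical and not the crate's actual API:

```rust
// Hypothetical sketch only; none of these names exist in the crate.
use bytes::Bytes;

pub enum RequestRecord {
    /// Discovery via HeadObject: metadata only, never a body.
    HeadObject { request_id: Option<String> },
    /// One GetObject per part/range; the body may be empty.
    GetObject { request_id: Option<String>, body: Bytes },
}

pub struct RawDownload {
    records: std::vec::IntoIter<RequestRecord>,
}

impl RawDownload {
    /// Yields each request/response in the operation, in order,
    /// whether or not it carried a body.
    pub fn next_record(&mut self) -> Option<RequestRecord> {
        self.records.next()
    }
}
```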
> they'd need to figure out whether the 1st chunk request was actually the same as the discovery request.

With the current API, it's straightforward because the object metadata will contain `request_id` only if we perform a `HeadObject` and not a `GetObject`. The `GetObject` will always be delivered from the body iterator.
> Our API has an iterator that iterates over each request/response in the operation, whether or not it has a body.

I think the current design more closely mirrors what we are actually doing, which is a separate/specific discovery request followed by N parallel `GetObject` operations. I thought about the iterator-based approach but didn't implement it due to the following considerations:
- I think it won't be what our users expect. They will expect N part-size chunks, and receiving a first chunk that is just a `HeadObject` is not that interesting.
- In the future, we might just get rid of `HeadObject` and perform a 1-byte `GetObject` request to retrieve the metadata (to support presign cases, etc.). In that case, do we simply expose that metadata as it is? That would be confusing for customers.
- What about the empty-file use case? Will we deliver two requests with empty bodies? The first `GetObject` might fail; will we expose this failure?
An iterator seems nice and simple, but I think it will make hiding the complexity harder as we go along. Advanced users should have the ability to get the information they need, but we also need the ability to hide certain complexities from the users.
I think the approach of "if we make any request solely for discovery, we deliver its metadata in `object_meta`, and we deliver the N `GetObject` requests with their metadata" is a nice compromise. It lets us hide the complexity of discovery, and we can change it as needed in the future. I'd like to hear what others think about this.
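To make the compromise concrete, consumer code under this design might look roughly like the following; the types are minimal stand-ins for illustration (the field and method names follow this PR's discussion, not the crate's real definitions):

```rust
// Stand-in types for illustration; not the crate's real definitions.
struct ObjectMetadata {
    content_length: Option<i64>,
    // Set only if a request was made solely for discovery (e.g. HeadObject).
    request_id: Option<String>,
}

struct ChunkResponse {
    // Every GetObject delivers its own metadata with its chunk.
    request_id: Option<String>,
}

struct DownloadHandle {
    meta: ObjectMetadata,
    chunks: Vec<ChunkResponse>,
}

fn log_download(handle: &DownloadHandle) {
    // Discovery complexity stays hidden behind the object-level metadata...
    println!("content-length: {:?}", handle.meta.content_length);
    println!("discovery request id: {:?}", handle.meta.request_id);
    // ...while the N GetObject requests arrive as chunks with metadata.
    for chunk in &handle.chunks {
        println!("chunk request id: {:?}", chunk.request_id);
    }
}
```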
I can see both points but I'm inclined to not go with an iterator API like this (at least not publicly).
> Just imagine the headache our advanced users are going to have trying to "unwrap" the complexity we bake into the base API. If they're logging metrics from the stream of HTTP calls, they'd need different branches for discover vs body chunks, and they'd need to figure out whether the 1st chunk request was actually the same as the discovery request.
This need not necessarily be true depending on how we expose metrics. I get the point you're trying to make though.
I think this is something we may need to solicit some feedback on FWIW and see what our advanced customers have in mind.
Yeah, we might be making a mistake by conflating metadata and metrics/telemetry.
Metadata is for a customer that needs to know some detail about a request/response that we didn't know they were interested in (e.g. part checksums).
Metrics/telemetry would include stuff like failed requests that got retried. And the use case for this is customers needing to look under the hood to diagnose bugs and performance issues (e.g. awslabs/mountpoint-s3#1079).
Maybe we don't even want to call this metadata; it's all the fields that can be set/returned on an object that aren't the object content itself. In SDKs this is all just part of e.g. `GetObjectResponse`, but here we have to give it a name because we aren't exposing it the same way.
Discussed offline: we will move forward with the current API for now. We will consider an iterator-like API for telemetry in the future, which will require us to hook into the SDK's interceptor so that we can get all the requests, including retries, etc., and expose that.
Good start. Still mulling some of this over but left some questions and suggestions.
@@ -171,14 +171,14 @@ async fn do_download(args: Args) -> Result<(), BoxError> {
     // TODO(aws-sdk-rust#1159) - rewrite this less naively,
     // likely abstract this into performant utils for single file download. Higher level
     // TM will handle it's own thread pool for filesystem work
-    let mut handle = tm.download().bucket(bucket).key(key).send().await?;
+    let mut handle = tm.download().bucket(bucket).key(key).send()?;
discuss: If we are changing `send()` to not do any work or be async, then perhaps we should change the name (e.g. `initiate`). I only bring this up because we mirrored the Rust SDK API a bit here, and I'm wondering whether it will cause any confusion.
Thanks, updated. I have added a TODO to make it consistent across the board. I will create a follow-up PR to rename upload's `send` to `initiate` as well.
 /// this data via [`impl Buf`](bytes::Buf) or it can be copied into contiguous storage with
 /// [`.into_bytes()`](crate::byte_stream::AggregatedBytes::into_bytes).
 #[derive(Debug, Clone)]
 pub struct AggregatedBytes(SegmentedBuf<Bytes>);
suggestion: Move to the `io` module.
Also, this is maybe fine for now, but is this how we want to expose the data/chunks? I originally did it via `AggregatedBytes` since that was basically the only option to do what we wanted with the API given by the SDK without introducing a new API (which we are now doing and should).
I don't have a suggestion at the moment, but it bears consideration since we are effectively taking control of the API, and that may give us different options than originally considered in the initial "naive" implementation.
Thanks, moved to `io`. `AggregatedBytes` seems fine to me for exposing the body. As discussed offline, we can batch up multiple `ChunkResponse`s as a future optimization.
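As a usage note, consuming a body through the `Buf`-based interface that the doc comment above describes might look like this; the `Buf` impl is assumed from the doc comment in this diff:

```rust
use bytes::Buf;

// Assumes the body type implements bytes::Buf, as the doc comment
// in this diff indicates for AggregatedBytes.
fn consume(mut body: impl Buf) {
    // Walk the segmented buffer without copying it into one allocation.
    while body.has_remaining() {
        let n = body.chunk().len(); // one contiguous segment
        // ...process &body.chunk()[..n] here...
        body.advance(n);
    }
}
```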
 pub object_lock_mode: Option<aws_sdk_s3::types::ObjectLockMode>,
 pub object_lock_retain_until_date: Option<::aws_smithy_types::DateTime>,
 pub object_lock_legal_hold_status: Option<aws_sdk_s3::types::ObjectLockLegalHoldStatus>,

+/// The request_id if the client made a request to retrieve object metadata, such as with HeadObject.
+pub request_id: Option<String>,
I think we're going to want to hide this and make it similar to the way the SDK exposes this for future evolution.
You can see the RFC here for how the SDK did this.
Example implementation and supporting traits:
- S3 extended request ID: trait
- Output hides details and exposes via trait
We do something similar with errors (error metadata and implementation)
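For reference, the SDK-style pattern being suggested reduces to: keep the field private and expose the ID through a trait so the representation can evolve. A minimal sketch (the trait shape mirrors the SDK's `RequestId` trait; the field layout here is an assumption):

```rust
// Minimal sketch of trait-based exposure; field layout is assumed.
pub trait RequestId {
    fn request_id(&self) -> Option<&str>;
}

pub struct ObjectMetadata {
    request_id: Option<String>, // private, not part of the public surface
}

impl RequestId for ObjectMetadata {
    fn request_id(&self) -> Option<&str> {
        self.request_id.as_deref()
    }
}
```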
Also related is this TODO, and the fact that we need to expose metadata for errors as well.
Thanks, I have updated it to use the same trait. I have not looked into exposing metadata for errors yet. I will keep it as a TODO for now for a future PR, since I want to get this merged first to unblock other people.
 object_lock_retain_until_date: value.object_lock_retain_until_date,
-object_lock_legal_hold_status: value.object_lock_legal_hold_status,
+object_lock_legal_hold_status: value.object_lock_legal_hold_status.clone(),
+request_id: None,
correctness: `GetObjectOutput` should/may have these.
Thanks, I have added these for `GetObjectOutput` as well.
 // TODO(aws-sdk-rust#1159, design) - consider PartialOrd for ChunkResponse and hiding `seq` as internal only detail
 // the seq number
 pub(crate) seq: u64,
 /// data: body of the object
style: this is a public doc comment and will show up in the generated documentation. We should be mindful of this and provide properly formatted and useful docs. You can always check generated docs with `cargo doc --no-deps --all-features --open`. Suggested wording:

    /// The content associated with this particular part/range request.
Thanks, updated.
 /// request metadata other than the body that will be set from `GetObject`
 // TODO: Document fields
 #[derive(Debug, Clone, Default)]
 pub struct ChunkMetadata {
question/discuss: Do we need two types here (`ChunkMetadata` and `ObjectMetadata`)? They are effectively the same thing, just populated slightly differently, right? Is there any value to us or the customer in having to wrap their head around two different types? i.e., another way to look at this would be a single `ObjectMetadata` where only the fields that are available for a given request type are populated (since most fields are optional anyway).
We have discussed it a bit offline, but yeah, I think two types make more sense here with respect to future adaptability. From experience with the CRT S3 client, it's just a pain and confusing trying to map different response outputs to a single class. `ObjectMetadata` is something that we constructed with some manual decisions around what to expose and how. Some fields might be available but might not make sense at an object level, like checksums. It also keeps us more flexible if we want to add some metadata in the future at different levels, like maybe numChunks (different from the partsCount that S3 provides) at an object level, etc.
-pub async fn next(&mut self) -> Option<Result<AggregatedBytes, crate::error::Error>> {
-// TODO(aws-sdk-rust#1159, design) - do we want ChunkResponse (or similar) rather than AggregatedBytes? Would
-// make additional retries of an individual chunk/part more feasible (though theoretically already exhausted retries)
+pub async fn next(&mut self) -> Option<Result<ChunkResponse, crate::error::Error>> {
One thing that comes to mind with this API is that we only yield one chunk at a time, when (now that we own the `AggregatedBytes` type) we could be yielding many chunks at once, all stitched together into a single `AggregatedBytes`.
I realize the point of the change we're making is to expose metadata per chunk. I'm wondering, though, if we shouldn't also consider a batch recv API, or an API that combines multiple chunks into one and returns as many sequenced chunks as we have available. There is a possibility this makes a difference depending on how fast/slow a customer is processing data. Would have to benchmark, I suppose. We can wait, of course, but it's worth bringing up for discussion now to see what people think and maybe add TODOs if we agree.
That's pretty reasonable, IMO; something like `collect` in `ByteStream`.
Thanks, yeah, it makes sense, but we will need to benchmark it to see if there is any real-world performance gain. We should probably do it once we have a stricter global number-of-parts-in-memory scheduler in place because I think it will matter then. Currently, every download has its own queue, which is very large, so it might not matter much.
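A sketch of the batching idea under discussion, assuming the `bytes-utils` crate's `SegmentedBuf` (the type the `AggregatedBytes` definition shown earlier wraps): drain whatever sequenced chunk bodies are already available and stitch them into one buffer without copying:

```rust
use bytes::Bytes;
use bytes_utils::SegmentedBuf;

// Hypothetical batch step: `ready` is whatever sequenced chunk bodies
// are already available from the download's queue.
fn stitch(ready: Vec<Bytes>) -> SegmentedBuf<Bytes> {
    let mut buf = SegmentedBuf::new();
    for segment in ready {
        buf.push(segment); // appends without copying the segment
    }
    buf // implements bytes::Buf over all segments
}
```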
Fix and ship
@@ -3,6 +3,8 @@
  * SPDX-License-Identifier: Apache-2.0
  */

+/// Download Body Type
+pub mod aggregated_bytes;
No need to make the module public, just export it:

    mod aggregated_bytes;
    pub use aggregated_bytes::AggregatedBytes;

This makes the type visible to users via the path `aws_s3_transfer_manager::io::AggregatedBytes`.
Thanks, updated.
 // make initial discovery about the object size, metadata, possibly first chunk
 let mut discovery = discover_obj(&ctx, &input).await?;
+let _ = object_meta_tx.send(discovery.object_meta);
Why are we ignoring the result here? If we never expect this to fail, it should be used with an `expect` so that if we ever do violate the assumptions we're making, it fails loudly.
This will only fail if we have already dropped the handle at this point. The discovery task is a join handle and would be in a detached state. While it's possible for this to fail, we should not panic on it. We can handle the cancellation gracefully once we implement the cancellation logic for downloads. @ysaito1001, what are your thoughts, since you are looking into cancellation?
Glad you asked. I'd say don't worry about using `_` at line 121 in this PR, so no action needed. My next PR for cancelling download(s) should be a good place to handle the error properly.
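For clarity, the trade-off being discussed, as a minimal sketch: a oneshot `send` only errors when the receiver side has already been dropped:

```rust
use tokio::sync::oneshot;

async fn discovery_sketch() {
    let (tx, rx) = oneshot::channel::<u64>();

    // Reviewer's option: fail loudly if the "receiver is alive" assumption
    // is ever violated:
    //     tx.send(42).expect("handle dropped before discovery finished");

    // What the PR does for now: the handle (receiver) may legitimately be
    // dropped already (detached task), so ignore the error until
    // cancellation lands.
    let _ = tx.send(42);

    if let Ok(meta) = rx.await {
        println!("discovery metadata: {meta}");
    }
}
```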
 use crate::error::{self, ErrorKind};
 use tokio::{
     sync::{oneshot::Receiver, Mutex, OnceCell},
Tokio's `Mutex` is fairly heavyweight; unless we're holding a lock across an `await` point, we should prefer the stdlib `Mutex`.
Yes, we are holding the lock across an await point in the `join` function, because we need to acquire the lock and join all the tasks.
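That usage is the textbook case for the async mutex; a condensed sketch of why, assuming the `join` shape described here:

```rust
use std::sync::Arc;
use tokio::{sync::Mutex, task::JoinSet};

// Condensed sketch of the `join` shape described above: the guard is
// held across `.await`, which a std::sync::MutexGuard can't safely do.
async fn join_all(tasks: Arc<Mutex<JoinSet<()>>>) {
    let mut set = tasks.lock().await; // async-aware lock
    while let Some(result) = set.join_next().await {
        // the lock is still held across this await point
        result.expect("task panicked");
    }
}
```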
@@ -20,19 +20,25 @@ mod handle;
 pub use handle::DownloadHandle;
 use tracing::Instrument;

 mod object_meta;
+/// Provides metadata for each chunk during an object download.
+pub mod chunk_meta;
Again, we probably don't want to expose all of the various modules, and should instead curate a public API that is more thought out (e.g. via `pub use`).
Our other operations define an `output` module with the output type(s) of the operation. Both chunk and object metadata are output/response types, so perhaps that makes sense here for consistency?
I have updated it to `pub use`. Initially, I started with renaming `body` to `output` to keep it consistent, but later reverted that as `body` made more sense than `output`. We can revisit this later.
Looks good. We can consider adding an integration test for verifying `object_meta` on `DownloadHandle` in a future PR, if we don't have one.
@@ -52,7 +58,7 @@ impl Download {
 /// "follow from" the current span, but be under their own root of the trace tree.
 /// Use this for `TransferManager.download().send()`, where the spawned tasks
-/// Use this for `TransferManager.download().send()`, where the spawned tasks
+/// Use this for `TransferManager.download().initiate()`, where the spawned tasks

Just for consistency, even if we rename to `.initiate` in a future PR.
Thanks, I did update the docs for download but missed it here. I have updated it.
 }

-impl From<GetObjectOutput> for ObjectMetadata {
-    fn from(value: GetObjectOutput) -> Self {
+impl From<&GetObjectOutput> for ObjectMetadata {
Ah, because `GetObjectOutput` is not cloneable (due to its `body` of type `ByteStream` not being cloneable), we need one of the conversions (in this case `GetObjectOutput` -> `ObjectMetadata`) to start from a reference, in case we want to create both `ObjectMetadata` and `ChunkMetadata` from a single `GetObjectOutput`.
Might be worth adding comments (I believe this stems from the fact that we have two separate types, `ObjectMetadata` and `ChunkMetadata`, convertible from `GetObjectOutput`).
Yes, I have added comments here to explain this.
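For readers following along, the constraint reduces to this shape; these are stand-in types (the real structs carry many more fields):

```rust
// Stand-in types; the real structs carry many more fields.
struct ByteStream; // not Clone
struct GetObjectOutput { e_tag: Option<String>, body: ByteStream }
struct ObjectMetadata { e_tag: Option<String> }
struct ChunkMetadata { e_tag: Option<String> }

// Borrowing conversion: lets us peel off object-level metadata while
// the output (and its non-cloneable body) is still intact.
impl From<&GetObjectOutput> for ObjectMetadata {
    fn from(value: &GetObjectOutput) -> Self {
        ObjectMetadata { e_tag: value.e_tag.clone() }
    }
}

// Consuming conversion: fine for the chunk path, which owns the output.
impl From<GetObjectOutput> for ChunkMetadata {
    fn from(value: GetObjectOutput) -> Self {
        ChunkMetadata { e_tag: value.e_tag }
    }
}

fn split(output: GetObjectOutput) -> (ObjectMetadata, ChunkMetadata) {
    let object_meta = ObjectMetadata::from(&output); // borrow first...
    let chunk_meta = ChunkMetadata::from(output);    // ...then consume
    (object_meta, chunk_meta)
}
```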
 let (object_meta_tx, object_meta_rx) = oneshot::channel();

+let tasks = Arc::new(Mutex::new(JoinSet::new()));
 let discovery = tokio::spawn(send_discovery(
Pretty neat, delegating async-related preparatory steps (like acquiring a permit) to a separate task so `orchestrate` can return early. `tasks` needs to be in an `Arc` because of that, but that feels like a necessary change.
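A condensed sketch of that orchestration shape, with stand-in names for the pieces referenced above:

```rust
use std::sync::Arc;
use tokio::{sync::Mutex, task::JoinSet};

// Stand-in names; condensed from the shape of the diff above.
fn orchestrate() -> Arc<Mutex<JoinSet<()>>> {
    let tasks = Arc::new(Mutex::new(JoinSet::new()));
    let tasks_clone = tasks.clone();
    tokio::spawn(async move {
        // async preparatory work (e.g. acquiring a permit) happens here,
        // off the caller's critical path...
        tasks_clone.lock().await.spawn(async { /* download a chunk */ });
    });
    tasks // ...so the caller gets its handle back immediately
}
```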
Description of changes:
This PR refactors the download API to achieve the following objectives. (We will refactor the upload API similarly once this PR is approved.)

- `send()` no longer performs any work eagerly, which leaves us free to make decisions like `putObject` vs. `multipartUpload`, etc. in the future.
- Discovery metadata is exposed in `object_meta`, and all the `GetObject` metadata is exposed directly as chunk metadata. It also allows us to expose more things in the future, like the request we made, etc., if required.

Design Discussions:
The discovery phase can be `HeadObject` or `GetObject`. In the case of `GetObject`, we are exposing the request ID as part of the first chunk metadata. I was not really sure how to expose the `HeadObject` metadata; one consideration was having the `HeadObject` metadata as the first empty chunk. Instead, I exposed it in the `object_metadata`. The request ID in `object_metadata` will always be set, and is duplicated for `GetObject`.

What should we expose in `object_metadata`, and how? I have tried to expose everything that made sense at the object level. Currently, I am using Tokio's `OnceCell` + oneshot channel to expose the `object_metadata` (see the sketch below). It's not too bad, but I am not sure if we want to use Tokio-specific utilities. Please feel free to suggest better options I can use or ways to simplify this.

I have added a TODO that we should abort the download if downloading a chunk fails, as I think that's not happening right now.
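A minimal sketch of the `OnceCell` + oneshot pattern described above, with stand-in types (not the real API): discovery sends the metadata once, and `object_meta` awaits it on first call and caches it thereafter:

```rust
use tokio::sync::{oneshot, Mutex, OnceCell};

struct ObjectMetadata { content_length: Option<i64> }

struct DownloadHandle {
    object_meta_rx: Mutex<Option<oneshot::Receiver<ObjectMetadata>>>,
    object_meta: OnceCell<ObjectMetadata>,
}

impl DownloadHandle {
    // First call awaits the discovery task's send; later calls hit the cache.
    async fn object_meta(&self) -> Result<&ObjectMetadata, String> {
        self.object_meta
            .get_or_try_init(|| async {
                let rx = self
                    .object_meta_rx
                    .lock()
                    .await
                    .take()
                    .ok_or_else(|| "metadata receiver already consumed".to_string())?;
                rx.await
                    .map_err(|_| "discovery task dropped the sender".to_string())
            })
            .await
    }
}
```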
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.