Expose Download Metadata for Every Chunk #73
Conversation
 }

 impl DownloadHandle {
     /// Object metadata
-    pub fn object_meta(&self) -> &ObjectMetadata {
-        &self.object_meta
+    pub async fn object_meta(&self) -> Result<&ObjectMetadata, error::Error> {
My gut tells me we'd be better off with an API that mirrors what we're doing under the hood, and we can optionally wrap that with a "nice" API that hides certain details we don't think basic users need.
So: our API has an iterator that iterates over each request/response in the operation, whether or not it has a body.
- So if we did a HeadObject, the first item is HeadObject.
- If Content-Length was 0, there will be no more items
- Otherwise, there will be 1+ GetObject items afterwards.
- If we do discovery via GetObject, then there will simply be 1+ GetObject items.
- It is what it is
We can wrap that in a "nice" API where there are different getters for "Object Metadata" vs body chunks, and in that wrapper we can deal with the complexity of trying to differentiate the discovery request from the first-chunk request, and whether or not they're actually the same request.
Just imagine the headache our advanced users are going to have trying to "unwrap" the complexity we bake into the base API. If they're logging metrics from the stream of HTTP calls, they'd need different branches for discover vs body chunks, and they'd need to figure out whether the 1st chunk request was actually the same as the discovery request.
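For concreteness, a rough sketch of what that raw iterator shape could look like; every name here is hypothetical and not the crate's actual API:

```rust
// Hypothetical sketch only; none of these names exist in the crate.
use bytes::Bytes;

pub enum RequestRecord {
    /// Discovery via HeadObject: metadata only, never a body.
    HeadObject { request_id: Option<String> },
    /// One GetObject per part/range; the body may be empty.
    GetObject { request_id: Option<String>, body: Bytes },
}

pub struct RawDownload {
    records: std::vec::IntoIter<RequestRecord>,
}

impl RawDownload {
    /// Yields each request/response in the operation, in order,
    /// whether or not it carried a body.
    pub fn next_record(&mut self) -> Option<RequestRecord> {
        self.records.next()
    }
}
```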
> they'd need to figure out whether the 1st chunk request was actually the same as the discovery request.

With the current API, it's straightforward because the object metadata will contain `request_id` only if we perform a `HeadObject` and not a `GetObject`. The `GetObject` will always be delivered from the body iterator.
> Our API has an iterator that iterates over each request/response in the operation, whether or not it has a body.

I think the current design more closely mirrors what we are actually doing, which is a separate/specific discovery request followed by N parallel `GetObject` operations. I thought about the iterator-based approach but didn't implement it due to the following considerations:
- I think it won't be what our users expect. They will expect N part-size chunks, and receiving a first chunk that is just a `HeadObject` is not that interesting.
- In the future, we might just get rid of `HeadObject` and perform a 1-byte `GetObject` request to retrieve the metadata (to support presign cases, etc.). In that case, do we simply expose that metadata as it is? That would be confusing for customers.
- What about the empty-file use case? Will we deliver two requests with empty bodies? The first `GetObject` might fail; will we expose this failure?
An iterator seems nice and simple, but I think it will make hiding the complexity harder as we go along. Advanced users should have the ability to get the information they need, but we also need the ability to hide certain complexities from the users.
I think the approach of "if we make any request solely for discovery, we deliver its metadata in `object_meta`, and we deliver the N `GetObject` requests with their metadata" is a nice compromise. It lets us hide the complexity of discovery, and we can change it as needed in the future. I'd like to hear what others think about this.
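To make the compromise concrete, consumer code under this design might look roughly like the following; the types are minimal stand-ins for illustration (the field and method names follow this PR's discussion, not the crate's real definitions):

```rust
// Stand-in types for illustration; not the crate's real definitions.
struct ObjectMetadata {
    content_length: Option<i64>,
    // Set only if a request was made solely for discovery (e.g. HeadObject).
    request_id: Option<String>,
}

struct ChunkResponse {
    // Every GetObject delivers its own metadata with its chunk.
    request_id: Option<String>,
}

struct DownloadHandle {
    meta: ObjectMetadata,
    chunks: Vec<ChunkResponse>,
}

fn log_download(handle: &DownloadHandle) {
    // Discovery complexity stays hidden behind the object-level metadata...
    println!("content-length: {:?}", handle.meta.content_length);
    println!("discovery request id: {:?}", handle.meta.request_id);
    // ...while the N GetObject requests arrive as chunks with metadata.
    for chunk in &handle.chunks {
        println!("chunk request id: {:?}", chunk.request_id);
    }
}
```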
I can see both points but I'm inclined to not go with an iterator API like this (at least not publicly).
> Just imagine the headache our advanced users are going to have trying to "unwrap" the complexity we bake into the base API. If they're logging metrics from the stream of HTTP calls, they'd need different branches for discover vs body chunks, and they'd need to figure out whether the 1st chunk request was actually the same as the discovery request.
This need not necessarily be true depending on how we expose metrics. I get the point you're trying to make though.
I think this is something we may need to solicit some feedback on FWIW and see what our advanced customers have in mind.
Yeah, we might be making a mistake by conflating metadata and metrics/telemetry.
Metadata is for a customer that needs to know some detail about a request/response that we didn't know they were interested in (e.g. part checksums).
Metrics/telemetry would include stuff like failed requests that got retried. And the use case for this is customers needing to look under the hood to diagnose bugs and performance issues (e.g. awslabs/mountpoint-s3#1079).
Maybe we don't even want to call this metadata; it's all the fields that can be set/returned on an object that aren't the object content itself. In SDKs this is all just part of e.g. `GetObjectResponse`, but here we have to give it a name because we aren't exposing it the same way.
Discussed offline: we will move forward with the current API for now. We will consider an iterator-like API for telemetry in the future, which will require us to hook into the SDK's interceptor so that we can get all the requests, including retries, etc., and expose that.
Good start. Still mulling some of this over but left some questions and suggestions.
@@ -171,14 +171,14 @@ async fn do_download(args: Args) -> Result<(), BoxError> {
     // TODO(aws-sdk-rust#1159) - rewrite this less naively,
     // likely abstract this into performant utils for single file download. Higher level
     // TM will handle it's own thread pool for filesystem work
-    let mut handle = tm.download().bucket(bucket).key(key).send().await?;
+    let mut handle = tm.download().bucket(bucket).key(key).send()?;
discuss: If we are changing `send()` to not do any work or be async, then perhaps we should change the name (e.g. `initiate`). I only bring this up because we mirrored the Rust SDK API a bit here, and I'm wondering whether it will cause any confusion.
Thanks, updated. I have added a TODO to make it consistent across the board. I will create a follow-up PR to rename upload's `send` to `initiate` as well.
 /// this data via [`impl Buf`](bytes::Buf) or it can be copied into contiguous storage with
 /// [`.into_bytes()`](crate::byte_stream::AggregatedBytes::into_bytes).
 #[derive(Debug, Clone)]
 pub struct AggregatedBytes(SegmentedBuf<Bytes>);
suggestion: Move to the `io` module.
Also, this is maybe fine for now, but is this how we want to expose the data/chunks? I originally did it via `AggregatedBytes` since that was basically the only option to do what we wanted with the API given by the SDK without introducing a new API (which we are now doing and should).
I don't have a suggestion at the moment, but it bears consideration since we are effectively taking control of the API, and that may give us different options than originally considered in the initial "naive" implementation.
Thanks, moved to `io`. `AggregatedBytes` seems fine to me for exposing the body. As discussed offline, we can batch up multiple `ChunkResponse`s as a future optimization.
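As a usage note, consuming a body through the `Buf`-based interface that the doc comment above describes might look like this; the `Buf` impl is assumed from the doc comment in this diff:

```rust
use bytes::Buf;

// Assumes the body type implements bytes::Buf, as the doc comment
// in this diff indicates for AggregatedBytes.
fn consume(mut body: impl Buf) {
    // Walk the segmented buffer without copying it into one allocation.
    while body.has_remaining() {
        let n = body.chunk().len(); // one contiguous segment
        // ...process &body.chunk()[..n] here...
        body.advance(n);
    }
}
```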
 pub object_lock_mode: Option<aws_sdk_s3::types::ObjectLockMode>,
 pub object_lock_retain_until_date: Option<::aws_smithy_types::DateTime>,
 pub object_lock_legal_hold_status: Option<aws_sdk_s3::types::ObjectLockLegalHoldStatus>,

+/// The request_id if the client made a request to retrieve object metadata, such as with HeadObject.
+pub request_id: Option<String>,
I think we're going to want to hide this and make it similar to the way the SDK exposes this for future evolution.
You can see the RFC here for how the SDK did this.
Example implementation and supporting traits:
- S3 extended request ID: trait
- Output hides details and exposes via trait
We do something similar with errors (error metadata and implementation)
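For reference, the SDK-style pattern being suggested reduces to: keep the field private and expose the ID through a trait so the representation can evolve. A minimal sketch (the trait shape mirrors the SDK's `RequestId` trait; the field layout here is an assumption):

```rust
// Minimal sketch of trait-based exposure; field layout is assumed.
pub trait RequestId {
    fn request_id(&self) -> Option<&str>;
}

pub struct ObjectMetadata {
    request_id: Option<String>, // private, not part of the public surface
}

impl RequestId for ObjectMetadata {
    fn request_id(&self) -> Option<&str> {
        self.request_id.as_deref()
    }
}
```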
Also related is this TODO, and the fact that we need to expose metadata for errors as well.
Thanks, I have updated it to use the same trait. I have not looked into exposing metadata for errors yet. I will keep it as a TODO for now for a future PR, since I want to get this merged first to unblock other people.
 object_lock_retain_until_date: value.object_lock_retain_until_date,
-object_lock_legal_hold_status: value.object_lock_legal_hold_status,
+object_lock_legal_hold_status: value.object_lock_legal_hold_status.clone(),
+request_id: None,
correctness: `GetObjectOutput` should/may have these.
Thanks, I have added these for `GetObjectOutput` as well.
 // TODO(aws-sdk-rust#1159, design) - consider PartialOrd for ChunkResponse and hiding `seq` as internal only detail
 // the seq number
 pub(crate) seq: u64,
 /// data: body of the object
style: this is a public doc comment and will show up in the generated documentation. We should be mindful of this and provide properly formatted and useful docs. You can always check generated docs with `cargo doc --no-deps --all-features --open`. Suggested wording:

    /// The content associated with this particular part/range request.
Thanks, updated.
 /// request metadata other than the body that will be set from `GetObject`
 // TODO: Document fields
 #[derive(Debug, Clone, Default)]
 pub struct ChunkMetadata {
question/discuss: Do we need two types here (`ChunkMetadata` and `ObjectMetadata`)? They are effectively the same thing, just populated slightly differently, right? Is there any value to us or the customer in having to wrap their head around two different types? i.e., another way to look at this would be a single `ObjectMetadata` where only the fields that are available for a given request type are populated (since most fields are optional anyway).
We have discussed it a bit offline, but yeah, I think two types make more sense here with respect to future adaptability. From experience with the CRT S3 client, it's just a pain and confusing trying to map different response outputs to a single class. `ObjectMetadata` is something that we constructed with some manual decisions around what to expose and how. Some fields might be available but might not make sense at an object level, like checksums. It also keeps us more flexible if we want to add some metadata in the future at different levels, like maybe numChunks (different from the partsCount that S3 provides) at an object level, etc.
-pub async fn next(&mut self) -> Option<Result<AggregatedBytes, crate::error::Error>> {
-// TODO(aws-sdk-rust#1159, design) - do we want ChunkResponse (or similar) rather than AggregatedBytes? Would
-// make additional retries of an individual chunk/part more feasible (though theoretically already exhausted retries)
+pub async fn next(&mut self) -> Option<Result<ChunkResponse, crate::error::Error>> {
One thing that comes to mind with this API is that we only yield one chunk at a time, when (now that we own the `AggregatedBytes` type) we could be yielding many chunks at once, all stitched together into a single `AggregatedBytes`.
I realize the point of the change we're making is to expose metadata per chunk. I'm wondering, though, if we shouldn't also consider a batch recv API, or an API that combines multiple chunks into one and returns as many sequenced chunks as we have available. There is a possibility this makes a difference depending on how fast/slow a customer is processing data. Would have to benchmark, I suppose. We can wait, of course, but it's worth bringing up for discussion now to see what people think and maybe add TODOs if we agree.
That's pretty reasonable, IMO; something like `collect` in `ByteStream`.
Thanks, yeah, it makes sense, but we will need to benchmark it to see if there is any real-world performance gain. We should probably do it once we have a stricter global number-of-parts-in-memory scheduler in place because I think it will matter then. Currently, every download has its own queue, which is very large, so it might not matter much.
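A sketch of the batching idea under discussion, assuming the `bytes-utils` crate's `SegmentedBuf` (the type the `AggregatedBytes` definition shown earlier wraps): drain whatever sequenced chunk bodies are already available and stitch them into one buffer without copying:

```rust
use bytes::Bytes;
use bytes_utils::SegmentedBuf;

// Hypothetical batch step: `ready` is whatever sequenced chunk bodies
// are already available from the download's queue.
fn stitch(ready: Vec<Bytes>) -> SegmentedBuf<Bytes> {
    let mut buf = SegmentedBuf::new();
    for segment in ready {
        buf.push(segment); // appends without copying the segment
    }
    buf // implements bytes::Buf over all segments
}
```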
Fix and ship
@@ -3,6 +3,8 @@
  * SPDX-License-Identifier: Apache-2.0
  */

+/// Download Body Type
+pub mod aggregated_bytes;
No need to make the module public, just export it:

    mod aggregated_bytes;
    pub use aggregated_bytes::AggregatedBytes;

This makes the type visible to users via the path `aws_s3_transfer_manager::io::AggregatedBytes`.
Thanks, updated.
 // make initial discovery about the object size, metadata, possibly first chunk
 let mut discovery = discover_obj(&ctx, &input).await?;
+let _ = object_meta_tx.send(discovery.object_meta);
Why are we ignoring the result here? If we never expect this to fail, it should be used with an `expect` so that if we ever do violate the assumptions we're making, it fails loudly.
This will only fail if we have already dropped the handle at this point. The discovery task is a join handle and would be in a detached state. While it's possible for this to fail, we should not panic on it. We can handle the cancellation gracefully once we implement the cancellation logic for downloads. @ysaito1001, what are your thoughts, since you are looking into cancellation?
Glad you asked. I'd say don't worry about using `_` at line 121 in this PR, so no action needed. My next PR for cancelling download(s) should be a good place to handle the error properly.
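For clarity, the trade-off being discussed, as a minimal sketch: a oneshot `send` only errors when the receiver side has already been dropped:

```rust
use tokio::sync::oneshot;

async fn discovery_sketch() {
    let (tx, rx) = oneshot::channel::<u64>();

    // Reviewer's option: fail loudly if the "receiver is alive" assumption
    // is ever violated:
    //     tx.send(42).expect("handle dropped before discovery finished");

    // What the PR does for now: the handle (receiver) may legitimately be
    // dropped already (detached task), so ignore the error until
    // cancellation lands.
    let _ = tx.send(42);

    if let Ok(meta) = rx.await {
        println!("discovery metadata: {meta}");
    }
}
```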
 use crate::error::{self, ErrorKind};
 use tokio::{
     sync::{oneshot::Receiver, Mutex, OnceCell},
Tokio's `Mutex` is fairly heavyweight; unless we're holding a lock across an `await` point, we should prefer the stdlib `Mutex`.
Yes, we are holding the lock across an await point in the `join` function, because we need to acquire the lock and join all the tasks.
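That usage is the textbook case for the async mutex; a condensed sketch of why, assuming the `join` shape described here:

```rust
use std::sync::Arc;
use tokio::{sync::Mutex, task::JoinSet};

// Condensed sketch of the `join` shape described above: the guard is
// held across `.await`, which a std::sync::MutexGuard can't safely do.
async fn join_all(tasks: Arc<Mutex<JoinSet<()>>>) {
    let mut set = tasks.lock().await; // async-aware lock
    while let Some(result) = set.join_next().await {
        // the lock is still held across this await point
        result.expect("task panicked");
    }
}
```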
@@ -20,19 +20,25 @@ mod handle;
 pub use handle::DownloadHandle;
 use tracing::Instrument;

 mod object_meta;
+/// Provides metadata for each chunk during an object download.
+pub mod chunk_meta;
Again, we probably don't want to expose all of the various modules, and should instead curate a public API that is more thought out (e.g. via `pub use`).
Our other operations define an `output` module with the output type(s) of the operation. Both chunk and object metadata are output/response types, so perhaps that makes sense here for consistency?
I have updated it to `pub use`. Initially, I started with renaming `body` to `output` to keep it consistent, but later reverted that as `body` made more sense than `output`. We can revisit this later.
Looks good. We can consider adding an integration test for verifying `object_meta` on `DownloadHandle` in a future PR, if we don't have one.
@@ -52,7 +58,7 @@ impl Download {
 /// "follow from" the current span, but be under their own root of the trace tree.
 /// Use this for `TransferManager.download().send()`, where the spawned tasks
-/// Use this for `TransferManager.download().send()`, where the spawned tasks
+/// Use this for `TransferManager.download().initiate()`, where the spawned tasks

Just for consistency, even if we rename to `.initiate` in a future PR.
Thanks, I did update the docs for download but missed it here. I have updated it.
 }

-impl From<GetObjectOutput> for ObjectMetadata {
-    fn from(value: GetObjectOutput) -> Self {
+impl From<&GetObjectOutput> for ObjectMetadata {
Ah, because `GetObjectOutput` is not cloneable (due to its `body` of type `ByteStream` not being cloneable), we need one of the conversions (in this case `GetObjectOutput` -> `ObjectMetadata`) to start from a reference, in case we want to create both `ObjectMetadata` and `ChunkMetadata` from a single `GetObjectOutput`.
Might be worth adding comments (I believe this stems from the fact that we have two separate types, `ObjectMetadata` and `ChunkMetadata`, convertible from `GetObjectOutput`).
Yes, I have added comments here to explain this.
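For readers following along, the constraint reduces to this shape; these are stand-in types (the real structs carry many more fields):

```rust
// Stand-in types; the real structs carry many more fields.
struct ByteStream; // not Clone
struct GetObjectOutput { e_tag: Option<String>, body: ByteStream }
struct ObjectMetadata { e_tag: Option<String> }
struct ChunkMetadata { e_tag: Option<String> }

// Borrowing conversion: lets us peel off object-level metadata while
// the output (and its non-cloneable body) is still intact.
impl From<&GetObjectOutput> for ObjectMetadata {
    fn from(value: &GetObjectOutput) -> Self {
        ObjectMetadata { e_tag: value.e_tag.clone() }
    }
}

// Consuming conversion: fine for the chunk path, which owns the output.
impl From<GetObjectOutput> for ChunkMetadata {
    fn from(value: GetObjectOutput) -> Self {
        ChunkMetadata { e_tag: value.e_tag }
    }
}

fn split(output: GetObjectOutput) -> (ObjectMetadata, ChunkMetadata) {
    let object_meta = ObjectMetadata::from(&output); // borrow first...
    let chunk_meta = ChunkMetadata::from(output);    // ...then consume
    (object_meta, chunk_meta)
}
```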
 let (object_meta_tx, object_meta_rx) = oneshot::channel();

+let tasks = Arc::new(Mutex::new(JoinSet::new()));
 let discovery = tokio::spawn(send_discovery(
Pretty neat, delegating async-related preparatory steps (like acquiring a permit) to a separate task so `orchestrate` can return early. `tasks` needs to be in an `Arc` because of that, but that feels like a necessary change.
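A condensed sketch of that orchestration shape, with stand-in names for the pieces referenced above:

```rust
use std::sync::Arc;
use tokio::{sync::Mutex, task::JoinSet};

// Stand-in names; condensed from the shape of the diff above.
fn orchestrate() -> Arc<Mutex<JoinSet<()>>> {
    let tasks = Arc::new(Mutex::new(JoinSet::new()));
    let tasks_clone = tasks.clone();
    tokio::spawn(async move {
        // async preparatory work (e.g. acquiring a permit) happens here,
        // off the caller's critical path...
        tasks_clone.lock().await.spawn(async { /* download a chunk */ });
    });
    tasks // ...so the caller gets its handle back immediately
}
```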
Description of changes:
This PR refactors the download API to achieve the following objectives. (We will refactor the upload API similarly once this PR is approved.)

- `send()` no longer performs any work eagerly, which leaves us free to make decisions like `putObject` vs. `multipartUpload`, etc. in the future.
- Discovery metadata is exposed in `object_meta`, and all the `GetObject` metadata is exposed directly as chunk metadata. It also allows us to expose more things in the future, like the request we made, etc., if required.

Design Discussions:
The discovery phase can be `HeadObject` or `GetObject`. In the case of `GetObject`, we are exposing the request ID as part of the first chunk metadata. I was not really sure how to expose the `HeadObject` metadata; one consideration was having the `HeadObject` metadata as the first empty chunk. Instead, I exposed it in the `object_metadata`. The request ID in `object_metadata` will always be set, and is duplicated for `GetObject`.

What should we expose in `object_metadata`, and how? I have tried to expose everything that made sense at the object level. Currently, I am using Tokio's `OnceCell` + oneshot channel to expose the `object_metadata` (see the sketch below). It's not too bad, but I am not sure if we want to use Tokio-specific utilities. Please feel free to suggest better options I can use or ways to simplify this.

I have added a TODO that we should abort the download if downloading a chunk fails, as I think that's not happening right now.
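A minimal sketch of the `OnceCell` + oneshot pattern described above, with stand-in types (not the real API): discovery sends the metadata once, and `object_meta` awaits it on first call and caches it thereafter:

```rust
use tokio::sync::{oneshot, Mutex, OnceCell};

struct ObjectMetadata { content_length: Option<i64> }

struct DownloadHandle {
    object_meta_rx: Mutex<Option<oneshot::Receiver<ObjectMetadata>>>,
    object_meta: OnceCell<ObjectMetadata>,
}

impl DownloadHandle {
    // First call awaits the discovery task's send; later calls hit the cache.
    async fn object_meta(&self) -> Result<&ObjectMetadata, String> {
        self.object_meta
            .get_or_try_init(|| async {
                let rx = self
                    .object_meta_rx
                    .lock()
                    .await
                    .take()
                    .ok_or_else(|| "metadata receiver already consumed".to_string())?;
                rx.await
                    .map_err(|_| "discovery task dropped the sender".to_string())
            })
            .await
    }
}
```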
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.