re-organize crate structure part one #36

aajtodd · 2024-07-24T14:18:48Z

Issue #, if available:
#35

Description of changes:

To keep these PRs reasonably sized, I'm going to stop here, but we aren't done. This is the first of likely 2-3 PRs to get to the desired state.

NOTE: This PR does not change any implementation details of upload/download. It is largely re-organization. The fluent builder boilerplate adds nearly 1K lines.

Rename request/response types to "input" / "output" (as suggested).
Remove Uploader and move its internals into the new Upload operation as implementation details.
Introduce the transfer manager Client that transfer operations will be initiated from. Currently, only upload is mostly fleshed out.

Future PRs

Remove Downloader and move its internals into the Download operation.
Re-think where the streaming types live (likely with the individual operations rather than io module but up for debate).

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

aws-s3-transfer-manager/src/client.rs

Velfi · 2024-07-24T14:53:59Z

aws-s3-transfer-manager/src/config.rs

+        &self.target_part_size
+    }
+
+    // TODO(design) - should we separate upload/download part size and concurrency settings?


I think not. I can see them each having different defaults but I'd be surprised if they had exclusive settings.

Download doesn't have a minimum part size of 5 MiB though. We can do downloads in smaller ranges (if of course it is beneficial to do so).

Velfi · 2024-07-24T14:56:28Z

aws-s3-transfer-manager/src/operation/upload.rs

+        let ctx = new_context(handle, input);
+        let mut handle = UploadHandle::new(ctx);
+
+        // MPU has max of 10K parts which requires us to know the upper bound on the content length (today anyway)


A comment implicitly refers to the current state of the code.

Suggested change

- // MPU has max of 10K parts which requires us to know the upper bound on the content length (today anyway)
+ // MPU has max of 10K parts which requires us to know the upper bound on the content length

The "today anyway" refers to the current state of S3. It might very well be different in a few years if S3 bumps up its max object size.
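
To illustrate why the part limit forces the content length to be known up front, here is a minimal sketch. It is not the crate's actual code: the function names and constants are hypothetical, and the limits (10,000 parts, 5 MiB minimum part size) are the S3 multipart limits referenced in this thread.

```rust
/// S3 multipart upload limits referenced in the discussion above.
const MAX_PARTS: u64 = 10_000;
const MIN_PART_SIZE: u64 = 5 * 1024 * 1024; // 5 MiB

/// Ceiling division for positive divisors.
fn div_ceil(a: u64, b: u64) -> u64 {
    (a + b - 1) / b
}

/// Hypothetical helper: choose a part size that keeps the part count within
/// MAX_PARTS. Without a known content length this cannot be computed.
fn effective_part_size(content_length: u64, target_part_size: u64) -> u64 {
    // Smallest part size that fits the whole body into MAX_PARTS parts.
    let min_for_part_limit = div_ceil(content_length, MAX_PARTS);
    target_part_size.max(min_for_part_limit).max(MIN_PART_SIZE)
}

/// Number of parts the body would be split into at the given part size.
fn part_count(content_length: u64, part_size: u64) -> u64 {
    div_ceil(content_length, part_size)
}
```

With a 1 TiB body and an 8 MiB target, the target alone would need 131,072 parts, so the helper raises the part size until the count fits within the limit.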

ysaito1001

LGTM. FYI, src/io contains the old style of mod.rs whereas the rest of the repo doesn't follow that style. We probably want src/io/mod.rs to follow the rest of the repo, but that cleanup doesn't have to be done in this PR.

aajtodd · 2024-07-24T15:24:19Z

LGTM. FYI, src/io contains the old style of mod.rs whereas the rest of the repo doesn't follow that style. We probably want src/io/mod.rs to follow the rest of the repo, but that cleanup doesn't have to be done in this PR.

I don't know that I agree. We use mod.rs in the runtime in places (e.g. here). Mostly, though, if a module file only contains other mod declarations and re-exports, then not having the correspondingly named module file in the parent directory can help reduce clutter and cognitive load, IMO.

e.g.

src/
    foo/
        mod.rs
        bar.rs
        baz.rs
    quux.rs

// foo/mod.rs contents
pub mod bar;
mod baz;
pub use baz::Baz;

vs

src/
    foo.rs
    quux.rs
    foo/
        bar.rs
        baz.rs

// src/foo.rs contents

pub mod bar;
mod baz;
pub use baz::Baz;

Personally, I think each has its place. If others feel strongly about favoring only one style, though, it's not a hill I want to die on.
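
Either layout produces the same module tree, so callers see identical paths. A small sketch, using inline modules in place of separate files; the items inside `bar` and `baz` are hypothetical, and only the `mod` / `pub use` declarations come from the example above:

```rust
// Stand-in for either src/foo/mod.rs or src/foo.rs: the declarations are
// identical, only the file that holds them differs.
pub mod foo {
    pub mod bar {
        // Hypothetical item so the module is not empty.
        pub fn describe() -> &'static str {
            "foo::bar"
        }
    }

    // Private module: callers cannot name `foo::baz` directly.
    mod baz {
        #[derive(Debug, Default, PartialEq)]
        pub struct Baz;
    }

    // Re-export so `Baz` is reachable as `foo::Baz`.
    pub use baz::Baz;
}
```

From outside, `foo::bar::describe()` and `foo::Baz` resolve the same way regardless of which file layout holds the declarations.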

ysaito1001 · 2024-07-24T15:56:19Z

I don't know that I agree. We use mod.rs in the runtime in places (e.g. here).

FYI, that is a leftover from before we switched to the without-mod.rs world in smithy-rs. Using modules without mod.rs is something peers told me about when I joined, so I'm simply passing that along here for consistency. That said, the community uses both styles and I don't mean to govern the style in this repo, so feel free to take your pick.

graebm

I left some rambling comments, but I see now how this mirrors the organization of the SDK, so like ... I get it

graebm · 2024-07-24T18:01:09Z

aws-s3-transfer-manager/examples/cp.rs

+        .client(s3_client)
        .build();

+    let tm = aws_s3_transfer_manager::Client::new(tm_config);


Will it be confusing to call it aws_s3_transfer_manager::Client when aws_sdk_s3::Client is also in the mix? User code is going to end up with 2 "S3 clients". Should it be aws_s3_transfer_manager::TransferManager instead?

Or is that just normal? Everything is a "::Client", it's namespaced by the module name, and users are used to that, and give good disambiguating variable names when they both appear side-by-side

I'm going to leave it for now, but feel free to add it to the bikeshed issue. I can go either way. I do think it's a bit redundant when the crate/module name has transfer manager in it, though the SEP says it should be named TransferManager, so maybe we call it that. Users can give an alias to an import to disambiguate, e.g. use aws_s3_transfer_manager::Client as TransferManager. Also, my hope is that the need for users to instantiate their own aws_sdk_s3::Client is rare once we add some convenience "loader" functions.
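
A minimal sketch of the import alias mentioned above. The two modules are hypothetical stand-ins for the real crates (whose clients have far richer APIs), and the `kind` method and `describe_clients` function exist only for illustration:

```rust
// Hypothetical stand-ins for the two crates.
mod aws_sdk_s3 {
    pub struct Client;
    impl Client {
        pub fn kind(&self) -> &'static str {
            "low-level S3 client"
        }
    }
}

mod aws_s3_transfer_manager {
    pub struct Client;
    impl Client {
        pub fn kind(&self) -> &'static str {
            "transfer manager"
        }
    }
}

// Alias one import so both clients can be named side by side without ambiguity.
use aws_s3_transfer_manager::Client as TransferManager;
use aws_sdk_s3::Client;

/// Builds one of each client and reports which type each name resolved to.
fn describe_clients() -> (&'static str, &'static str) {
    let s3 = Client;
    let tm = TransferManager;
    (s3.kind(), tm.kind())
}
```

The alias is purely local to the importing module, so it works whichever name the crate finally settles on.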

graebm · 2024-07-24T18:26:45Z

aws-s3-transfer-manager/src/client.rs

+    /// }
+    ///
+    /// ```
+    pub fn upload(&self) -> crate::operation::upload::builders::UploadFluentBuilder {


I was going to say it seems weird that the returned type has such a long module path, and wondered if we want to shorten it, at least for public consumption. But I see that this mirrors the path to aws_sdk_s3's fluent builders, so it's probably good.

I'm seeing now how the module organization here really mirrors the SDK's module organization, which seems good to do.

no need to respond. just sharing my mental journey

It makes no difference to the generated docs or how users consume it FWIW. The end result to a user is the same as:

use crate::operation::upload::builders::UploadFluentBuilder;
...
pub fn upload(&self) -> UploadFluentBuilder
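
A small sketch of that equivalence, using a hypothetical stand-in for the crate's module layout. The `Client` here, the `upload_short` method, and the `type_id_of` helper are illustrative only:

```rust
use std::any::TypeId;

// Hypothetical stand-in for the crate's module layout.
mod operation {
    pub mod upload {
        pub mod builders {
            #[derive(Default)]
            pub struct UploadFluentBuilder;
        }
    }
}

use operation::upload::builders::UploadFluentBuilder;

struct Client;

impl Client {
    // Return type spelled with the full module path.
    fn upload(&self) -> operation::upload::builders::UploadFluentBuilder {
        Default::default()
    }

    // Same return type, spelled via the import.
    fn upload_short(&self) -> UploadFluentBuilder {
        UploadFluentBuilder::default()
    }
}

// Helper to compare the concrete types of two values.
fn type_id_of<T: 'static>(_: &T) -> TypeId {
    TypeId::of::<T>()
}
```

Both methods return the same type; the spelling in the signature is only a matter of source readability.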

graebm · 2024-07-24T18:45:28Z

aws-s3-transfer-manager/src/operation/upload/context.rs

    /// the multipart upload ID
    pub(crate) upload_id: Option<String>,
    /// the original request (NOTE: the body will have been taken for processing, only the other fields remain)
-    pub(crate) request: Arc<UploadRequest>,
+    pub(crate) request: Arc<UploadInput>,


Suggested change

- pub(crate) request: Arc<UploadInput>,
+ pub(crate) input: Arc<UploadInput>,

I'll take another pass at cleaning up request/response naming in following PR(s). Good catch.

aajtodd added 4 commits July 23, 2024 13:18

re-org crate structure part one

636b5c1

wip relocate upload

c86d15f

introduce TM client config and start moving upload operation internals

46850f5

fill in fluent builder

216da8e

aajtodd requested a review from a team as a code owner July 24, 2024 14:18

Velfi approved these changes Jul 24, 2024

ysaito1001 approved these changes Jul 24, 2024

feedback

6886b93

graebm approved these changes Jul 24, 2024

aajtodd merged commit 4b4496d into main Jul 24, 2024
13 of 14 checks passed

aajtodd deleted the atodd/crate-org branch July 24, 2024 18:56

aajtodd mentioned this pull request Jul 25, 2024

re-organize downloader into download operation #37

Merged
