I am trying to build a system for copying large files between cloud storage services, specifically from an S3 bucket to destinations in S3, Azure, or GCS. Some of the files are too large to copy in a Lambda function (the function times out). I'm thinking about parallelizing the transfer by reading/writing individual parts in each Lambda invocation instead of a whole file, and then using a Step Functions state machine to orchestrate the multipart download/upload.

From reading the smart_open docs, I'm not sure if or how I can accomplish this with the library. I know smart_open uses multipart operations under the hood, but does it support reading/writing individual parts as separate operations? In other words, is there a way to download a specific single part from S3 as part of a previously initiated multipart download, and then upload a single part to S3/GCS/Azure as part of a previously initiated multipart upload?
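To make that concrete, here is roughly what I have in mind for a single Lambda invocation when the destination is S3, using boto3 directly rather than smart_open (bucket names, keys, and the part size below are placeholders, and the `UploadId` would come from a `create_multipart_upload` call made by an earlier state in the state machine):

```python
import boto3

s3 = boto3.client("s3")

PART_SIZE = 64 * 1024 * 1024  # all S3 parts except the last must be >= 5 MiB


def copy_one_part(src_bucket, src_key, dst_bucket, dst_key, upload_id, part_number):
    """Copy a single part: ranged GET from the source, UploadPart to the destination.

    `upload_id` comes from a CreateMultipartUpload issued earlier in the workflow;
    the returned ETag/PartNumber pair is collected so a final state can call
    CompleteMultipartUpload once every part has been copied.
    """
    start = (part_number - 1) * PART_SIZE
    end = start + PART_SIZE - 1  # S3 returns only the bytes that exist past `start`

    # Download just this part's byte range from the source object.
    chunk = s3.get_object(
        Bucket=src_bucket,
        Key=src_key,
        Range=f"bytes={start}-{end}",
    )["Body"].read()

    # Upload it as one part of the previously initiated multipart upload.
    response = s3.upload_part(
        Bucket=dst_bucket,
        Key=dst_key,
        PartNumber=part_number,
        UploadId=upload_id,
        Body=chunk,
    )
    return {"ETag": response["ETag"], "PartNumber": part_number}
```

The GCS and Azure destinations would presumably look similar, just with their own block/chunk upload APIs instead of UploadPart, but I want to confirm whether smart_open exposes anything at this level before wiring it all up myself.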
Replies: 1 comment
smart_open wasn't designed to solve your problem. You'd have to write your own code to handle that particular use case.
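For the S3-to-S3 case, the custom code the reply refers to could avoid moving bytes through Lambda entirely by using UploadPartCopy, which copies a byte range of the source object server-side into a part of an existing multipart upload. A minimal sketch with placeholder names and a hypothetical helper (not smart_open API):

```python
import boto3

s3 = boto3.client("s3")


def copy_part_server_side(src_bucket, src_key, dst_bucket, dst_key,
                          upload_id, part_number, object_size,
                          part_size=64 * 1024 * 1024):
    """Copy one part of an S3-to-S3 transfer without downloading the data.

    UploadPartCopy asks S3 to read the given byte range of the source object
    directly into part `part_number` of the multipart upload `upload_id`.
    """
    start = (part_number - 1) * part_size
    end = min(start + part_size, object_size) - 1  # the copy range must stay within the source object

    response = s3.upload_part_copy(
        Bucket=dst_bucket,
        Key=dst_key,
        PartNumber=part_number,
        UploadId=upload_id,
        CopySource={"Bucket": src_bucket, "Key": src_key},
        CopySourceRange=f"bytes={start}-{end}",
    )
    return {"ETag": response["CopyPartResult"]["ETag"], "PartNumber": part_number}
```

There is no cross-cloud equivalent for GCS or Azure destinations, so those invocations would still need a ranged GET from S3 followed by an upload through the destination's own SDK (for Azure block blobs, staging blocks and committing a block list is the closest analogue).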