Fix S3 filenames that include a hash #121
base: master
Conversation
See #110
Oh gosh, sorry, I didn't see that PR. I don't mind if you want to close this, but I'd argue that while AWS doesn't recommend hashes, they do support them. Plus, at least two people have run into this in the wild and had the motivation to open a PR to fix it, so it's clearly a real-world issue. Did you ever look into using the AWS parser? If you did, then we could at least bug AWS support instead of you 😆
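For context, here is a minimal sketch of the truncation being discussed (the bucket and key names are hypothetical): PHP's `parse_url()` treats everything after `#` as a URL fragment, so an S3 key containing a hash gets cut short when the path is deconstructed that way:

```php
<?php
// Hypothetical S3 path whose key contains a '#'.
$path = 's3://my-bucket/reports/2023#final.pdf';

// parse_url() interprets '#' as the start of a URL fragment,
// so everything after it is split off into 'fragment'.
$parts = parse_url($path);

echo $parts['path'], "\n";     // /reports/2023   (key truncated at the hash)
echo $parts['fragment'], "\n"; // final.pdf       (the lost remainder)
```

This is why the key survives only up to the `#` even though S3 itself accepts hashes in object keys.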
You make a good point. I'm willing to reconsider this. Did you see how Laravel's storage facade handles it? It might be at least worth making sure we're taking the same approach.
Digging into how Laravel handles it: they use the Flysystem library, and Flysystem stores the S3 bucket and path separately, sidestepping the issue of deconstructing paths. Looking at the ZipStream code again, I realize …
Hmm. Keep in mind there are two loops over our entire queue of files: first to calculate the sizes and send out a …
Sorry for not reporting back earlier, but I've been busy. I have now investigated using a stream and made an interesting discovery: the AWS stream wrapper already gets the file size every time a new read stream is created, so using the stream to get the filesize is essentially free! I set up a benchmark on my local that creates 1,000 small 100-character random text files on S3, then zips them to a local file and times it. Running it a couple of times gets: Using …
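A minimal illustration of the "size comes for free" observation, using a local file stream (everything here is a stand-in: once the AWS SDK's stream wrapper is registered, `s3://` streams expose the same `fstat()` interface, with the size fetched when the read stream is opened):

```php
<?php
// Illustration with a local temp file; the AWS stream wrapper exposes
// the same fstat() interface for s3:// streams, which is why the size
// is available as soon as the read stream exists, with no extra request.
$tmp = tempnam(sys_get_temp_dir(), 'zs');
file_put_contents($tmp, str_repeat('a', 100)); // 100-byte test object

$stream = fopen($tmp, 'r');
$size   = fstat($stream)['size'];

echo $size, "\n"; // 100

fclose($stream);
unlink($tmp);
```

The same open stream can then be handed off for reading the body, so no separate size lookup is needed per file.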
Ran into this when we switched to using ZipStream recently: S3 filenames with a hash `#` were being truncated. The Laravel `Storage` facade handles hashes without issue.

Possible breaking changes I can think of with this PR:

- `strlen("s3://{$this->getBucket()}/");` means that if anyone has a malformed S3 bucket string, this would fail in unexpected ways
- Replacing `parse_url()` does change how S3 file paths are parsed
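The prefix-length approach described above can be sketched like this (the bucket and key are hypothetical): stripping a fixed `s3://bucket/` prefix with `substr()` preserves a `#` in the key, where URL-style parsing would drop it:

```php
<?php
// Hypothetical bucket and a key containing a '#'.
$bucket = 'my-bucket';
$path   = "s3://{$bucket}/reports/2023#final.pdf";

// Strip the known prefix by its length instead of URL-parsing the
// path, so characters like '#' in the key survive intact.
$prefix = "s3://{$bucket}/";
$key    = substr($path, strlen($prefix));

echo $key, "\n"; // reports/2023#final.pdf
```

As noted in the breaking-changes list, this assumes the path really starts with that exact prefix; a malformed bucket string would make `substr()` slice at the wrong offset.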