
fastPut generates corrupted zip files #134

Open
dvlato opened this issue May 1, 2019 · 11 comments

Comments

@dvlato

dvlato commented May 1, 2019

When we upload a ZIP file (size 37430637 bytes) using fastPut with default options (I am using the ssh2-sftp-client package, so I am just doing `sftp.put(origin, target).then(sftp.end())`), the uploaded file is corrupted.

In my tests, most of the time (probably always if I use Node.js 8) the uploaded file has a different size from the original file. I've also reproduced the file size difference with this smaller file, but not consistently with the attached file:
images.zip

I would say that with Node 10 the file size is correct with the default options, but if I change the options to `{concurrency: 32, chunkSize: 8192}`, the uploaded file size is 37422445 instead of 37430637 (the difference is exactly equal to the chunkSize!).

However, even when using Node.js 10 with default options, the file contents are not the same (even if the file size seems to match). This is the error that `cmp` shows:

differ: char 655361, line 2098

I can reproduce this issue with different operating systems and computers.
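For anyone wanting to reproduce the byte-level comparison described above, here is a hedged sketch of the check: the file names are placeholders standing in for the local original and the re-downloaded upload (the sample files here are created on the spot so the commands run as-is):

```shell
# Placeholder files standing in for the local original and the downloaded copy.
printf 'example payload' > original.zip
cp original.zip downloaded.zip

# A checksum mismatch confirms corruption even when the sizes happen to match.
sha1sum original.zip downloaded.zip

# cmp reports the first differing byte; identical files produce no output.
cmp original.zip downloaded.zip && echo "files are identical"
```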

@dvlato
Author

dvlato commented May 1, 2019

I understand the file gets corrupted because it's a ZIP consisting of thousands of very small files, so the zip structure can be damaged easily...

@mscdex
Owner

mscdex commented May 1, 2019

Which version of ssh2-streams is being used here (`npm ls` should tell you)? I just recently published a version that should have fixed this.

@dvlato
Author

dvlato commented May 1, 2019 via email

@mscdex
Owner

mscdex commented May 2, 2019

I can't reproduce the issue with the latest ssh2/ssh2-streams, with the configurations you mentioned, and with a file of the exact same size. The sha1sums always match on both sides.

@dvlato
Author

dvlato commented May 2, 2019

Hi, thanks a lot for the quick response! I have checked, and it seems I only get these corrupt files when using fastPut against that specific SFTP server (Akamai NetStorage); put works fine but is very slow (2 minutes compared to a few seconds with lftp). The banner I get is this one: SSH-2.0-Server-VIII-hpn14v11
Do you have any idea how I should troubleshoot this issue? Is there any special feature / SFTP version needed?

@mscdex
Owner

mscdex commented May 3, 2019

I don't know what their underlying platform is, so the best I'm able to do is use hpn14v11 with OpenSSH v7.3 (the only OpenSSH version that particular hpn version was designed for). Again, every upload I try in that scenario as well comes out the same, the sha1sums match on both sides.

Are you saying it only happens with this particular server and if you use fastPut() to upload the file to another server, the file contents match exactly?

As far as debugging goes, there is nothing special: just debug fastXfer() however you like, with an IDE or by inserting debug statements.

@mscdex
Owner

mscdex commented May 3, 2019

Also, as far as transfer speed goes, ssh2 isn't usually dramatically far off from a standard OpenSSH sftp transfer using the same cipher/MAC selection. With ssh2 you can get much faster transfers by using an AES-GCM cipher on a machine with AES acceleration, since a combined cipher/MAC is more efficient than a separate cipher and MAC. So you might consider promoting those ciphers via an explicit algorithms option in your connection config object, if your CPU supports it (most modern x86 processors do).

If you're using an OpenSSH client that has the hpn patch(es) and the server also has hpn patch(es), then there are special behavioral changes (protocol-wise) that allow for better performance, so that has some impact.

@dvlato
Author

dvlato commented May 3, 2019

Thank you for your quick and informative responses. Yes, that's what I meant: I've tested the code against two other SFTP servers, and for those the md5sum of the same file matches correctly. I was wondering (I haven't looked into the implementation) what part of fastPut could be problematic for our server; for instance, it might require some 'special' feature that's not needed for put().

Also, as mentioned, put() is several orders of magnitude slower than using OpenSSH directly or FileZilla. I'm on macOS and I haven't installed any hpn patches (nor do I see them in my client's output), so I don't think that's it. From your response I see that's clearly not normal, so I will try changing the algorithms and see if that helps.

@mscdex
Owner

mscdex commented May 3, 2019

You could also set `debug: console.log` and compare the resulting output between one of the servers that transfers OK and the problematic server, and see if there are any obvious differences that might explain things.

You might also try using fs.createReadStream() and sftp.createWriteStream() and piping the two together. This should be equivalent to concurrency: 1 for fastPut() (which you could also try) and might help rule out any read-buffer reuse issues. If all of that checks out, then you'd probably have to dig into the fastXfer() code as described previously.

@dvlato dvlato closed this as completed May 7, 2019
@dvlato dvlato reopened this May 7, 2019
@dvlato
Author

dvlato commented May 7, 2019

Hi, sorry to reopen this issue.
When you say "ssh2 isn't usually dramatically far off in comparison to standard OpenSSH sftp transfer using the same cipher/mac selection", are you talking about the fast transfer or the standard one?
For a standard put() (or fastPut() with concurrency: 1), when I set the algorithms to "aes128-ctr" and "hmac-md5" (I haven't set the buffer size or any other options), my 37430637-byte file takes nearly 3 minutes instead of a few seconds, even with one of the 'correct' servers.

Is that something you would expect or should I check my code? I don't know what might make it so slow...

@dvlato dvlato closed this as completed May 7, 2019
@dvlato dvlato reopened this May 22, 2019
@dvlato
Author

dvlato commented Jun 11, 2019

Follow-up for anyone having the same issue: we found out that the problem was caused by the server side. The transfer itself was correct, but the server's saving to disk was not (and they are not going to fix it, as they don't support 'in-place updates' such as seek-and-write).

However, we still find ssh2-streams extremely slow compared to OpenSSH sftp transfers, so we have stopped using it. I would love to revisit this when the performance issue is solved, or if it turns out to be a usage problem on our side (a sample showing performance comparable to OpenSSH would be welcome).
