Large files corrupted after sync by desktop clients #114
Comments
@zenhack Is this related to sandstorm-io/sandstorm#2464, or is it likely Davros-specific?
I poked around a bit, and I think that in theory the nextcloud desktop client should handle a server that doesn't support range requests OK. That said, it's entirely possible that that logic isn't super battle-tested, or that I've misunderstood something.
FWIW I don't know that owncloud/nextcloud clients typically use Range requests; from what I remember they upload hundreds of smaller files and then send a command to finalize and concatenate them together. I tested davros itself (sans-sandstorm) by manually hashing a particular 2GB file before and after upload and wasn't able to reproduce, unfortunately. I'll take another look at that concatenation code and see if I can reproduce it in sandstorm.
I was able to reproduce this on Sandstorm, but I don't think it's a Sandstorm problem. Reading the NextCloud docs above, I think I see the likely cause, which unfortunately was undocumented at the time I wrote the upload chunking. Originally, the ownCloud client would pick a chunk size prior to starting the upload and upload the chunks in order. While ownCloud's own implementation saved the chunks to a temporary directory and then concatenated them together, mine aimed to be a bit more efficient: I had just one tempfile and wrote each incoming chunk at an offset computed from its chunk number. My guess is that the client has started adaptively changing the chunk size, which will cause the offset calculation (chunk number × chunk size) to be wrong. The fact that the corrupted file has the right filesize supports this.
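To make the failure mode concrete, here is a minimal sketch of that legacy single-tempfile approach (illustrative only, not Davros's actual code), assuming a Node/TypeScript server where the write position is derived from the chunk number and a chunk size fixed at the start of the upload:

```typescript
// Hypothetical sketch of the legacy chunk handling described above.
// Assumes the tempfile was created when the upload started.
import { open } from "fs/promises";

async function writeChunk(
  tempfilePath: string,
  chunkNumber: number,
  fixedChunkSize: number, // assumed constant for the whole upload
  chunkData: Buffer
): Promise<void> {
  // Breaks silently if the client varies the chunk size mid-upload.
  const position = chunkNumber * fixedChunkSize;
  const fh = await open(tempfilePath, "r+");
  try {
    await fh.write(chunkData, 0, chunkData.length, position);
  } finally {
    await fh.close();
  }
}
```

If the client adaptively shrinks or grows its chunks, `chunkNumber * fixedChunkSize` no longer matches the true byte offset, so writes overlap or leave gaps even though the total length can still come out right, matching the "correct filesize, corrupted contents" symptom.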
Adds some additional ownCloud/NextCloud client headers relating to mnutt/davros#114. The new headers:

* `oc-checksum` - a hash (typically `SHA1`) of the entire file. After all chunks have been uploaded, davros should check this to ensure the file uploaded correctly; otherwise we may end up saving a corrupted file. [This header has been around for a while; I should have been doing this all along.] See the sketch after this list.
* `oc-chunk-offset` - part of a new chunking scheme. Right now the legacy chunking scheme relies on all chunks being the same size in order to calculate the offset (number * chunkSize); this header would allow the client to vary the chunk size based on timing/bandwidth.
* `oc-lazyops` - I'm unlikely to implement this soon, but it would be a way for the client to signal "I don't want to wait for a response". The server would then hypothetically respond with a location where the client could poll for progress updates.
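As a rough illustration of the `oc-checksum` check (assuming the header carries an `<algorithm>:<hex>` value; the function names are hypothetical, not Davros's actual API), the server could hash the assembled file and reject the upload on a mismatch:

```typescript
// Sketch only: verify the assembled file against the client-supplied oc-checksum header.
import { createHash } from "crypto";
import { createReadStream } from "fs";

function hashFile(path: string, algorithm: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const hash = createHash(algorithm);
    createReadStream(path)
      .on("data", (chunk) => hash.update(chunk))
      .on("error", reject)
      .on("end", () => resolve(hash.digest("hex")));
  });
}

async function verifyOcChecksum(assembledPath: string, ocChecksum: string): Promise<void> {
  // Assumed header format: "SHA1:<hex digest>"
  const [algorithm, expected] = ocChecksum.split(":");
  const actual = await hashFile(assembledPath, algorithm.toLowerCase());
  if (actual !== expected.toLowerCase()) {
    // Reject the upload (e.g. respond 400) instead of persisting a corrupted file.
    throw new Error(`Checksum mismatch: expected ${expected}, got ${actual}`);
  }
}
```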
Once sandstorm allows a few extra request headers through, I'll be able to checksum the final result and ensure that at least no data is ever silently corrupted. But that wouldn't actually solve the underlying issue, which I think may be pieces coming in out of order. The easiest way to solve that is to opt into the newer NextCloud/ownCloud chunking mechanism that passes an explicit `oc-chunk-offset` with each chunk.
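A minimal sketch of what honoring that offset header could look like in an Express-style handler (the handler name, raw-body assumption, and tempfile handling are mine, not Davros's actual code): instead of computing the position from the chunk number, the server writes wherever the client says the chunk belongs, which tolerates out-of-order and variably sized chunks.

```typescript
// Sketch: write each chunk at the client-supplied oc-chunk-offset.
import { open } from "fs/promises";
import type { Request, Response } from "express";

async function handleChunkUpload(req: Request, res: Response, tempfilePath: string) {
  const offset = parseInt(req.header("oc-chunk-offset") ?? "", 10);
  if (Number.isNaN(offset) || offset < 0) {
    res.status(400).send("Missing or invalid oc-chunk-offset");
    return;
  }
  const chunk: Buffer = req.body; // assumes express.raw() collected the chunk body
  const fh = await open(tempfilePath, "r+");
  try {
    await fh.write(chunk, 0, chunk.length, offset);
  } finally {
    await fh.close();
  }
  res.status(201).end();
}
```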
@rpezzi I believe this should be resolved by Davros version 0.31.1 in the experimental market: https://apps.sandstorm.io/app/8aspz4sfjnp8u89000mh2v1xrdyx97ytn8hq71mdzv4p4d8n0n3h?experimental=true If you get a chance to test this version in a way that doesn't put your production data at risk, that would be really helpful!
I noticed an issue with large file corruption that I wanted to report. I'm putting it here in case it's related, but I'd be happy to make a new issue. I uploaded a ~1GB zip file successfully. I went as far as downloading it and comparing the md5sum with what I'd uploaded. But then I gave it to my friend, and the md5sum didn't match for him on his first try. He tried downloading again and it matched. I'll give his full report:
Did the download that did not complete come from the web interface or a sync client? I'm wondering if the grain managed to terminate during the download somehow.
Oh right, to be clear this was all web. |
Hi.
On my setup, large files uploaded by desktop clients get corrupted on the Davros app (observed for files of 500MB and 1GB).
Soon after a large file is uploaded to Davros by a desktop client, a corrupted version of the file is downloaded back to the PC by the client, overwriting the original file. I reproduced this behavior with Davros on both a private Sandstorm instance and a demo instance at alpha.sandstorm.io. The same problem occurs with the NextCloud desktop client 2.5.1 (official Debian repository) and the latest ownCloud client (2.6.3), both running on Debian 9. This was present on Davros 0.26.1 and persists on 0.27.3.
There is no problem if the file is uploaded through the Sandstorm web interface.
By the way, thanks for giving more life to Davros.