ByteBuffer or InputStream support for VCDiffEncoder Dictionary #6
Comments
Are you just looking for better initialization performance? The dictionary needs to be loaded into memory as soon as the encoder is created/used, so the only benefit to a …
If I understood correctly, for encoding we need to have the whole dictionary content in memory. If so, I think for large files and high-traffic applications this could cause out-of-memory issues. I was wondering, is there any way we can provide the dictionary in chunks, similar to …
More or less (ignoring memory-mapped files and swapping). The next chunk of data could reference any part of the dictionary, and you'd have to check it.
Both encoding and decoding can be done on chunks of data, because each chunk can be compressed against the dictionary (or against previous output) and the result written out as a chunk. This doesn't work for dictionaries, because any part of the dictionary can be referenced during encoding and decoding.
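To make that trade-off concrete, here is a minimal sketch of what it means with the current API: the dictionary is read fully onto the heap, while the target data can be fed through in chunks. Only `VCDiffEncoderBuilder` and `withDictionary(byte[])` are taken from this issue; the package name, `builder()`, `buildStreaming()`, `startEncoding()`, `encodeChunk()`, and `finishEncoding()` are assumed names and may differ from the actual API.

```java
// Package and streaming-encoder method names below are assumptions; only
// VCDiffEncoderBuilder.withDictionary(byte[]) is confirmed in this issue.
import com.davidehrmann.vcdiff.VCDiffEncoderBuilder;
import com.davidehrmann.vcdiff.VCDiffStreamingEncoder;

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class ChunkedEncodingSketch {

    public static void encode(Path dictionaryFile, InputStream target, OutputStream delta)
            throws IOException {
        // The whole dictionary must be resident in memory before encoding starts,
        // because any later chunk of target data may reference any region of it.
        byte[] dictionary = Files.readAllBytes(dictionaryFile);

        VCDiffStreamingEncoder<OutputStream> encoder = VCDiffEncoderBuilder.builder()
                .withDictionary(dictionary)
                .buildStreaming();

        // The target data, by contrast, can be compressed chunk by chunk.
        encoder.startEncoding(delta);
        byte[] buffer = new byte[64 * 1024];
        int read;
        while ((read = target.read(buffer)) != -1) {
            encoder.encodeChunk(buffer, 0, read, delta);
        }
        encoder.finishEncoding(delta);
    }
}
```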
You can share the same dictionary. Adding support for a …
A VCDiffEncoder can be created using VCDiffEncoderBuilder. Currently, the dictionary (source) content must be passed to withDictionary() as a byte[].
The source content can be larger than 1 GB. For better performance with large files, I think the dictionary could also be accepted as either a ByteBuffer or an InputStream.
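For context, current usage looks roughly like the sketch below. Only `VCDiffEncoderBuilder` and `withDictionary(byte[])` come from the description above; the package name, `builder()`, `buildSimple()`, and `encode()` are assumed, and the `withDictionary(ByteBuffer)` overload mentioned in the closing comment is the hypothetical addition being requested.

```java
// Package name and all method names except withDictionary(byte[]) are assumptions.
import com.davidehrmann.vcdiff.VCDiffEncoder;
import com.davidehrmann.vcdiff.VCDiffEncoderBuilder;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

public class DictionaryUsageSketch {

    public static void main(String[] args) throws IOException {
        // Today: the dictionary must be fully materialized as a byte[] on the heap,
        // even if the file on disk is larger than 1 GB.
        byte[] dictionary = Files.readAllBytes(Paths.get("dictionary.bin"));

        VCDiffEncoder<OutputStream> encoder = VCDiffEncoderBuilder.builder()
                .withDictionary(dictionary)
                .buildSimple();

        byte[] target = Files.readAllBytes(Paths.get("target.bin"));
        ByteArrayOutputStream delta = new ByteArrayOutputStream();
        encoder.encode(target, delta);

        // Requested: a hypothetical withDictionary(ByteBuffer) overload, so that a
        // memory-mapped dictionary (FileChannel.map) or one read from an InputStream
        // could be supplied without an up-front heap copy.
    }
}
```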
I think this is the most common use case when using this library with large files.
Will you please consider this change for your next release?
Thanks.