"Value too large for defined data type" when dictionary file is > 4GB #16

Open
Steelskin opened this issue Sep 19, 2014 · 1 comment

@Steelskin
Contributor

Original issue 16 created by jeff.gustafson on 2008-10-09T22:13:01.000Z:

What steps will reproduce the problem?

  1. Use a dictionary file larger than 4GB.

What is the expected output? What do you see instead?
Expected: normal operation. Instead: the error "Value too large for defined data type".

What version of the product are you using? On what operating system?
0.2

Please provide any additional information below.

@Steelskin
Contributor Author

Old comments:

open-vcdiff currently uses 32-bit integers to represent addresses and offsets. That representation causes the file size limitation you encountered. The solution will be to convert to using 64-bit integers uniformly. I will look into doing so in a future version of open-vcdiff.
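To illustrate the limit described above, here is a minimal sketch that checks whether a byte count is representable in a 32-bit or 64-bit integer. It assumes the addresses and offsets are signed, which the comment does not state; `FitsInSigned32` and `FitsInSigned64` are hypothetical helper names and are not part of open-vcdiff.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration only; not part of open-vcdiff.
// Assumption: addresses and offsets are stored as *signed* integers.

// True if size_in_bytes can be stored in a signed 32-bit integer,
// i.e. it is at most 2^31 - 1 (just under 2 GiB).
constexpr bool FitsInSigned32(std::uint64_t size_in_bytes) {
  return size_in_bytes <=
         static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// True if size_in_bytes can be stored in a signed 64-bit integer,
// i.e. it is at most 2^63 - 1.
constexpr bool FitsInSigned64(std::uint64_t size_in_bytes) {
  return size_in_bytes <=
         static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
}
```

Under that assumption, a 1 GiB input passes the 32-bit check, while a 4 GiB dictionary fails it and only passes the 64-bit check.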

Reply:

To expand upon my earlier comment, open-vcdiff was not originally designed with very large input sets in mind, but rather as a tool for implementing the SDCH protocol for moderately-sized HTTP responses. In that context, 2GB was seen as an ample limit.

I would like to make open-vcdiff as useful as possible for as many people as possible. Generalizing it to handle very large input and output files (so that it can be applied, for example, to revision control of huge text files) will be a good step towards that goal.

Side note: the use of std::string objects for internal storage restricts data sizes to std::string::max_size(), independently of whether 32-bit or 64-bit integers are used for addresses and offsets.
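As a rough sketch of the side note, the check below compares a requested size against `std::string::max_size()`, which is implementation-defined and independent of the integer width chosen for addresses and offsets. The names `FitsWithinLimit` and `FitsInStdString` are hypothetical and are not part of open-vcdiff.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helpers for illustration only; not part of open-vcdiff.

// True if size_in_bytes does not exceed the given container limit.
constexpr bool FitsWithinLimit(std::uint64_t size_in_bytes,
                               std::uint64_t max_size) {
  return size_in_bytes <= max_size;
}

// True if a buffer of size_in_bytes could, in principle, be held in a
// std::string on this platform. The limit is implementation-defined, and
// an allocation below it can still fail for lack of memory.
inline bool FitsInStdString(std::uint64_t size_in_bytes) {
  return FitsWithinLimit(size_in_bytes, std::string().max_size());
}
```

A caller could run such a check before buffering a whole dictionary in memory, whatever integer type is used for offsets.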

More requests for this feature followed:
Increasing priority; this would be handy.
