JSON->CBOR using decimal fractions? #33
Comments
Interesting. First of all, you would need a JSON parser that interprets the numbers as decimal, not as binary64 as is usual in the JavaScript (but not necessarily JSON) world. Second, you would need a CBOR implementation that preserves decimal numbers. I just noticed that cbor-ruby doesn't do that, but it would be easy to add. Where do you get these short decimal numbers from? (If they are really measurements, you could also convert them to, say, 16-bit floating point.)
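To make the two requirements concrete, here is a minimal Python sketch using only the standard library. It assumes `json.loads` with `parse_float=Decimal` as the decimal-preserving parser, and hand-encodes the CBOR decimal fraction (tag 4, an array of exponent and mantissa) rather than relying on any particular CBOR library. The helper names `cbor_head`, `cbor_int` and `cbor_decimal` are made up for illustration, and special values such as NaN are not handled.

```python
import json
from decimal import Decimal


def cbor_head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus argument for the given major type."""
    if value < 24:
        return bytes([(major << 5) | value])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < (1 << (8 * size)):
            return bytes([(major << 5) | extra]) + value.to_bytes(size, "big")
    raise ValueError("integer too large for this sketch")


def cbor_int(n: int) -> bytes:
    """Encode a signed integer (major type 0 or 1)."""
    return cbor_head(0, n) if n >= 0 else cbor_head(1, -1 - n)


def cbor_decimal(d: Decimal) -> bytes:
    """Encode a finite Decimal as tag 4: [exponent, mantissa]."""
    sign, digits, exponent = d.as_tuple()
    mantissa = int("".join(map(str, digits)))
    if sign:
        mantissa = -mantissa
    return cbor_head(6, 4) + cbor_head(4, 2) + cbor_int(exponent) + cbor_int(mantissa)


# Step 1: parse JSON numbers as decimals instead of binary64 floats.
values = json.loads("[8.169, 57.419, 85.998]", parse_float=Decimal)

# Step 2: encode each value as a decimal fraction.
encoded = [cbor_decimal(v) for v in values]
print(encoded[0].hex())  # c48222191fe9
```

JSON integers still come back as `int` with this parser setting, so a real converter would have to decide separately how to handle them.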
The real numbers are a bit bigger than what I wrote, typically 3-5 digits. Each column in the big table would be a separate array. The Protein Data Bank has 100,000+ such files; the total gzipped size is > 30GB.
I bumped into this ticket a little randomly while looking into CBOR for other reasons and thought it was interesting. I am not sure whether a conversion from JSON is a good way to go, as by default all JSON parsers I know use floating-point numbers. But if you were to parse the linked cif file directly, you would have several options. Taking a random triplet 8.169, 57.419, 85.998 (atom 3211):
I am not an expert on gzip, but the second decimal fractions option may compress favourably as well because it repeats a number of bytes often:
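As a rough size check for that triplet, the following Python sketch compares two encodings of the array: every value as a binary64 float, and every value as its own tag 4 decimal fraction. This is not the listing from the comment above; the helper names are hypothetical, and it assumes an encoder that does not shorten floats to 16 or 32 bits.

```python
import struct
from decimal import Decimal


def head(major: int, value: int) -> bytes:
    """CBOR initial byte plus argument for the given major type."""
    if value < 24:
        return bytes([(major << 5) | value])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < (1 << (8 * size)):
            return bytes([(major << 5) | extra]) + value.to_bytes(size, "big")
    raise ValueError("integer too large for this sketch")


def integer(n: int) -> bytes:
    """Signed integer as major type 0 (non-negative) or 1 (negative)."""
    return head(0, n) if n >= 0 else head(1, -1 - n)


def as_float64_array(texts):
    """Array in which every value is a binary64 float (0xfb + 8 bytes)."""
    body = b"".join(b"\xfb" + struct.pack(">d", float(t)) for t in texts)
    return head(4, len(texts)) + body


def as_decimal_fraction_array(texts):
    """Array in which every value is a tag 4 decimal fraction."""
    out = head(4, len(texts))
    for t in texts:
        sign, digits, exponent = Decimal(t).as_tuple()
        mantissa = int("".join(map(str, digits))) * (-1 if sign else 1)
        out += head(6, 4) + head(4, 2) + integer(exponent) + integer(mantissa)
    return out


triplet = ["8.169", "57.419", "85.998"]
float_bytes = as_float64_array(triplet)
decimal_bytes = as_decimal_fraction_array(triplet)
print(len(float_bytes), len(decimal_bytes))  # 28 21
```

In the decimal fraction form, every element starts with the same three bytes (`c4 82 22` for tag 4, a two-element array and exponent -3), which is the kind of repetition a general-purpose compressor can exploit.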
I have long arrays of short numbers in JSON. To give a small example:
cbor.me converts it to 37 bytes (more than in the input!):
I think that with decimal fractions the result would be more concise.
Is there any ready-to-use converter that automatically uses decimal fractions where it makes sense?