Encoding recommendation #246

Open
martindurant opened this issue Oct 15, 2024 · 2 comments

martindurant commented Oct 15, 2024

I don't know if the spec here has a place for this, but I wonder if there are any opinions on the storage encoding for geo data.

I would expect the following to perform best, but I have not done any testing. It would be useful if this repo could publish some measurements showing which of these decisions make the biggest difference.

  • FLOAT64 data is probably needed to preserve accuracy on lat/lon data
  • BYTE_STREAM_SPLIT will make this much more compressible, especially for the case that the data in a page has similar values
  • v2 data pages, which keep the compact but poorly compressible repetition and definition levels separate from the compressed values
  • Zstd is probably the best compressor in terms of space/CPU tradeoff
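
As a minimal sketch of what BYTE_STREAM_SPLIT does to FLOAT64 coordinates, the snippet below rearranges the bytes of a column by hand and prints the compressed size of both layouts, so the effect can be measured rather than assumed. It uses only the Python standard library: `zlib` stands in for Zstd (an assumption made for portability), and the synthetic latitude track and all function names are hypothetical, not taken from any Parquet implementation.

```python
import random
import struct
import zlib


def byte_stream_split(values):
    """Lay out float64 values as 8 streams: all byte 0s, then all byte 1s, ..."""
    raw = struct.pack(f"<{len(values)}d", *values)  # little-endian doubles
    return b"".join(raw[k::8] for k in range(8))


def byte_stream_join(split, count):
    """Inverse of byte_stream_split: interleave the 8 streams back into doubles."""
    raw = bytearray(8 * count)
    for k in range(8):
        raw[k::8] = split[k * count:(k + 1) * count]
    return list(struct.unpack(f"<{count}d", bytes(raw)))


def compressed_sizes(values, level=6):
    """Compressed byte counts for the plain and the byte-stream-split layouts."""
    plain = struct.pack(f"<{len(values)}d", *values)
    return {
        "plain": len(zlib.compress(plain, level)),
        "split": len(zlib.compress(byte_stream_split(values), level)),
    }


def synthetic_latitudes(count, seed=0):
    """Hypothetical test data: a slow random walk starting near 52.3 degrees."""
    rng = random.Random(seed)
    lat = 52.3
    out = []
    for _ in range(count):
        lat += rng.uniform(-1e-5, 1e-5)
        out.append(lat)
    return out


if __name__ == "__main__":
    lats = synthetic_latitudes(10_000)
    # The transform is lossless: joining the streams restores the exact values.
    assert byte_stream_join(byte_stream_split(lats), len(lats)) == lats
    print(compressed_sizes(lats))
```

The sign and exponent bytes of nearby coordinates are identical, so after the split they form long constant runs, while the low mantissa bytes stay close to random in either layout. If pyarrow is the writer, the options in the list above appear to map to `use_byte_stream_split`, `data_page_version="2.0"` and `compression="zstd"` on `pyarrow.parquet.write_table`, with dictionary encoding turned off for the affected columns; that mapping is a suggestion here, not something stated in this thread.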

For the case that somewhat lossy compression is allowable:

(@kylebarron suggested this might be a good discussion to have.)

@kylebarron
Collaborator

To clarify, I assume you're referring to the native encodings, not the serialized WKB encoding, which is just a binary column?

@martindurant
Author

Correct. I don't think there will be too many options for WKT/WKB columns, and any compressor should pick up interior repetitions in that case. I suppose the DELTA_LENGTH_BYTE_ARRAY encoding could be tried, but my hunch is that it won't matter.
