Idea: Support zstd computed dictionaries #323
-
(moving to discussions) It's a great application of zstd dictionaries; thanks for adding the script and results. Right now the killer feature of PMTiles is being able to decode tiles in the browser, which is why we need to use gzip: gzip has JS implementations like fflate that we can use as polyfills if browsers don't all have the new [...] If you control both the source archive and the client, you could implement this already by storing the zstd dictionary base64-encoded in the PMTiles JSON metadata section, then setting the header [...]
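The metadata half of that workaround could look roughly like the sketch below. It only covers packing a dictionary into the JSON metadata and getting it back out. The key name `zstd_dictionary` and both helper functions are hypothetical, not part of the PMTiles spec or any PMTiles library, so a custom client would have to agree on them with the archive writer.

```python
import base64
import json
from typing import Optional

# Hypothetical metadata key; the PMTiles spec does not define it.
DICT_KEY = "zstd_dictionary"


def embed_dictionary(metadata: dict, dictionary: bytes) -> str:
    """Serialize the metadata as JSON with the dictionary base64-encoded inside."""
    enriched = dict(metadata)  # copy so the caller's dict is left untouched
    enriched[DICT_KEY] = base64.b64encode(dictionary).decode("ascii")
    return json.dumps(enriched)


def extract_dictionary(metadata_json: str) -> Optional[bytes]:
    """Return the raw dictionary bytes, or None if the metadata has no dictionary."""
    encoded = json.loads(metadata_json).get(DICT_KEY)
    if encoded is None:
        return None
    return base64.b64decode(encoded)


# Placeholder bytes standing in for a trained dictionary.
example_dictionary = b"\x00\x01transportation\x00class\xff"
example_json = embed_dictionary({"name": "example"}, example_dictionary)
recovered = extract_dictionary(example_json)
```

A client would call `extract_dictionary` once after reading the metadata and hand the result to whatever zstd decoder it uses for each tile.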
-
PMTiles already supports regular zstd as one of the compression algorithms. However, zstd also supports custom dictionaries, which can help factor out common data when there are a lot of small files, like in a PMTiles file. From their README:
https://github.com/facebook/zstd#the-case-for-small-data-compression
I tested this using a set of ~2,000 random z12-14 tiles. The tile schema is OpenMapTiles, with a few additional tags that shouldn't significantly affect the results. I was able to achieve a 13% improvement with a dictionary of only 8 KB. The dictionary ends up containing mostly layer names, tag names, and common tag values.
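To illustrate why a shared dictionary helps with small payloads, here is a minimal sketch that uses the preset-dictionary support in Python's standard-library `zlib` as a stand-in for zstd. It is not the script used for the results above. The dictionary and the tile-like payload are made up, and deflate dictionaries are supplied by hand rather than trained, so the numbers are not comparable.

```python
import zlib

# Made-up dictionary of strings that recur across tiles (layer names, tag names, values).
shared_dictionary = (
    b'"layer":"transportation","class":"motorway",'
    b'"class":"residential","name:en":'
)

# Made-up stand-in for one small tile payload.
small_payload = b'{"layer":"transportation","class":"residential","name:en":"Main St"}'


def compress(data, dictionary=None):
    """Deflate `data`, optionally priming the compressor with a preset dictionary."""
    if dictionary is None:
        compressor = zlib.compressobj()
    else:
        compressor = zlib.compressobj(zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress(blob, dictionary=None):
    """Inflate `blob`; the same dictionary must be supplied if one was used to compress."""
    if dictionary is None:
        decompressor = zlib.decompressobj()
    else:
        decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


plain = compress(small_payload)
with_dict = compress(small_payload, shared_dictionary)
print(len(small_payload), len(plain), len(with_dict))
```

The payload is too short to contain much internal repetition, so on its own it barely compresses; with the dictionary, most of it becomes back-references into strings the decoder already has.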
Table of compression ratios for each dictionary size
Python script
This would require changes to the PMTiles spec, of course. There could be additional header fields that give the offset and length of the custom dictionary, which would be used to decompress tiles. It would be up to the software producing the file to use an appropriate dictionary (it could take one as a command line argument, for example, or train one on a subset of its output).
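As a rough sketch of what the offset and length fields might look like to a reader: the layout below (two little-endian unsigned 64-bit integers at a known position) is purely an assumption for illustration, not the actual PMTiles header layout, and `pack_dictionary_fields` and `read_dictionary` are hypothetical helpers.

```python
import struct

# Assumed layout for illustration only: (offset, length) as two little-endian uint64 values.
DICT_FIELDS = struct.Struct("<QQ")


def pack_dictionary_fields(offset: int, length: int) -> bytes:
    """Encode the dictionary's byte offset and length as 16 bytes."""
    return DICT_FIELDS.pack(offset, length)


def read_dictionary(archive: bytes, fields_pos: int) -> bytes:
    """Read the (offset, length) pair at `fields_pos` and slice out the dictionary."""
    offset, length = DICT_FIELDS.unpack_from(archive, fields_pos)
    if offset + length > len(archive):
        raise ValueError("dictionary range lies outside the archive")
    return archive[offset:offset + length]


# Toy archive: the two fields first, then the dictionary, then placeholder tile data.
toy_dictionary = b"transportation\x00class\x00motorway"
toy_archive = (
    pack_dictionary_fields(DICT_FIELDS.size, len(toy_dictionary))
    + toy_dictionary
    + b"placeholder tile data"
)
loaded = read_dictionary(toy_archive, 0)
```

A length of zero could then mean "no dictionary", which would let existing plain-zstd archives keep working unchanged, though that convention is also just an assumption here.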