Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GZIP_2 compression of a flat-sky map seems lossy in some cases #135

Closed
Wei-Q opened this issue Jan 20, 2024 · 4 comments · Fixed by #138
Closed

GZIP_2 compression of a flat-sky map seems lossy in some cases #135

Wei-Q opened this issue Jan 20, 2024 · 4 comments · Fixed by #138
Assignees

Comments

@Wei-Q
Copy link

Wei-Q commented Jan 20, 2024

When we save a flat-sky map in a FITS file via maps.fitsio.save_skymap_fits, the default option for the argument compress is 'GZIP_2'. This compression scheme is supposed to be lossless, but that does not seem to be the case when I tested the compression scheme on an SPT-3G winter field coadd.

Here is an example:

In[4]: frame = core.G3File(path_to_a_coadd).next()


# Default save

In [5]: maps.fitsio.save_skymap_fits(output_path, frame["T"], Q=frame["Q"], U=frame["U"])

In [6]: frame_debug = maps.fitsio.load_skymap_fits(output_path)

In [7]: numpy.allclose(frame["T"], frame_debug["T"])
Out[7]: False


# No-compression save

In [18]: maps.fitsio.save_skymap_fits(output_path, frame["T"], Q=frame["Q"], U=frame["U"], compress=False, overwrite=True)

In [19]: frame_debug = maps.fitsio.load_skymap_fits(output_path)

In [20]: numpy.allclose(frame["T"], frame_debug["T"])
Out[20]: True
@arahlin
Copy link
Member

arahlin commented Jan 24, 2024

I think this isn't surprising, since floating-point data are difficult to compress without loss. See for example, the astropy fits documentation:

Floating point FITS images (which have BITPIX = -32 or -64) usually contain too much ‘noise’ in the least significant bits of the mantissa of the pixel values to be effectively compressed with any lossless algorithm. Consequently, floating point images are first quantized into scaled integer pixel values (and thus throwing away much of the noise) before being compressed with the specified algorithm (either GZIP, RICE, or HCOMPRESS). This technique produces much higher compression factors than simply using the GZIP utility to externally compress the whole FITS file, but it also means that the original floating point value pixel values are not exactly preserved. When done properly, this integer scaling technique will only discard the insignificant noise while still preserving all the real information in the image. The amount of precision that is retained in the pixel values is controlled by the quantize_level parameter. Larger values will result in compressed images whose pixels more closely match the floating point pixel values, but at the same time the amount of compression that is achieved will be reduced. Users should experiment with different values for this parameter to determine the optimal value that preserves all the useful information in the image, without needlessly preserving all the ‘noise’ which will hurt the compression efficiency.

One solution would be to expose the quantize_level compression option to allow the user to manually adjust the compression algorithm.

@Wei-Q
Copy link
Author

Wei-Q commented Jan 24, 2024

Thank you, Sasha, for looking into this. Adding new options sounds good to me, or perhaps we can just edit some of the things said in the docstring unless most people want to save compressed maps. For example, the docstring says FlatSkyMaps are optionally compressed, but the default value of compress isn't False. I think it may be good to change the default to False unless most people want to save compressed maps. The docstring also says GZIP_1 and GZIP_2 are lossless, but it sounds like that's actually not true, then (unless a FlatSkyMap to be saved has only integer values, I guess).

@arahlin
Copy link
Member

arahlin commented Jan 24, 2024

Thank you, Sasha, for looking into this. Adding new options sounds good to me, or perhaps we can just edit some of the things said in the docstring unless most people want to save compressed maps. For example, the docstring says FlatSkyMaps are optionally compressed, but the default value of compress isn't False. I think it may be good to change the default to False unless most people want to save compressed maps. The docstring also says GZIP_1 and GZIP_2 are lossless, but it sounds like that's actually not true, then (unless a FlatSkyMap to be saved has only integer values, I guess).

Yeah that's a good point, definitely should update the documentation. @menanteau I believe you changed the default compression for save_skymap_fits initially. Would you be ok with setting the default compression to False, given the above discussion?

@menanteau
Copy link
Contributor

Thanks @arahlin for keeping me in the loop. I do not remembering changing the default compression in save_skymap_fits. But in case it's fine to set the default compression to False. All of my calls to save_skymap_fits actually have set compress explicitly.

@arahlin arahlin self-assigned this Jan 24, 2024
arahlin added a commit that referenced this issue Jan 24, 2024
Compression of floating point data using the GZIP_2 fits compression algorithm
is *not* lossless.  To avoid pitfalls with this algorithm, compression is
disabled by default.  Also expose the `quantize_level` keyword argument to
enable adjustment of the compression algorithm, if compression is desired.

See the astropy FITS image documentation for further information:
https://docs.astropy.org/en/stable/io/fits/api/images.html

Closes #135.
arahlin added a commit that referenced this issue Jan 24, 2024
Compression of floating point data using the GZIP_2 fits compression algorithm
is *not* lossless.  To avoid pitfalls with this algorithm, compression is
disabled by default.  Also expose the `quantize_level` keyword argument to
enable adjustment of the compression algorithm, if compression is desired.

See the astropy FITS image documentation for further information:
https://docs.astropy.org/en/stable/io/fits/api/images.html

Closes #135.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants