fix: address masking issues identified in #44 #45

ceholden · 2024-12-06T20:57:44Z

Description

This PR is intended to address issues raised in #44, specifically,

1. We should mask 0% reflectance as invalid; we already masked values < 0 ("zero reflectance should be filtered out as well").
2. We should not mask >100% reflectance, because these values are often "valid" and exceed 100% only because of unmet assumptions about topography (see Current issues with VI #44 (comment)).
3. We should use -19_999 as the nodata value instead of -9999, because -9999 is inside the valid range of normalized difference indices (which, when scaled, range over [-10_000, 10_000]). The value -19_999 was picked for consistency with the Landsat on-demand vegetation index products (e.g., Landsat NDVI).
4. We should use the union of the surface reflectance masks for all bands ("Pixel with negative or zero reflectance in any band should have no_data for all VIs").
5. We should avoid numerical issues when converting from float64 to int16, specifically overflows that "wrap around" when downcasting.
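
To make the masking rules above concrete, here is a minimal sketch using NumPy masked arrays. The helper names (mask_reflectance, union_masks) and the sample reflectance values are hypothetical and are not taken from hls_vi/generate_indices.py; the sketch only illustrates masking values <= 0, keeping values above 100%, and applying the union of the band masks before computing an index.

```python
import numpy as np


def mask_reflectance(band):
    """Mask reflectance values <= 0; values above 1.0 (100%) stay valid."""
    return np.ma.masked_less_equal(band, 0)


def union_masks(*bands):
    """Return a boolean mask that is True wherever any band is masked."""
    masks = [np.ma.getmaskarray(band) for band in bands]
    return np.logical_or.reduce(masks)


# Hypothetical reflectance values (unscaled, so 1.0 == 100%)
red = mask_reflectance(np.array([0.05, 0.0, 0.30, 1.20]))
nir = mask_reflectance(np.array([0.40, 0.50, -0.01, 1.10]))

# A pixel that is invalid in any band is invalid in every band
combined = union_masks(red, nir)
red = np.ma.masked_array(red.data, mask=combined)
nir = np.ma.masked_array(nir.data, mask=combined)

# NDVI is only computed where both bands are valid
ndvi = (nir - red) / (nir + red)
```

In this sketch the second pixel is masked because red is 0, the third because NIR is negative, and the fourth stays valid even though both bands exceed 100%.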

How I did it

1. Change from masked_outside to masked_less_equal with a threshold of 0, so values <= 0 are masked.
2. The same change from masked_outside to masked_less_equal means >100% reflectance is no longer masked.
3. Define fill_value per Index enum member, set this fill value when calculating the index, and write the GeoTIFF with the fill value (-19_999).
4. Union the nodata masks by "or"-ing the nodata masks of all bands.
5. See below.

For the wraparound issue when downcasting, the example in the test data was for EVI, which can exceed [-10_000, 10_000] because it is not normalized. Consider this example from one of the granules in the test dataset:

>>> import numpy as np

# without clipping, our large positive number becomes negative, which is misleading
>>> np.float64(192_583.3333).astype(np.int16)
-4025

# if we clip first, the value is still a large positive number
>>> np.clip(np.float64(192_583.3333), a_min=np.iinfo("int16").min, a_max=np.iinfo("int16").max).astype(np.int16)
32767
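
Extending the scalar example above to a whole masked array, a rough sketch of the scale, fill, clip, and downcast sequence might look like the following. The function name to_int16, its parameters, and the sample values are assumptions for illustration only; the scale factor of 10_000 and the fill value of -19_999 follow the description in this PR.

```python
import numpy as np


def to_int16(index, scale=10_000, fill_value=-19_999):
    """Scale a masked float index and downcast to int16 without wraparound.

    Hypothetical helper, not the function used in hls_vi.
    """
    info = np.iinfo(np.int16)
    # Replace masked pixels with the fill value before any casting
    scaled = np.ma.filled(index * scale, fill_value)
    # Clip to the int16 range so large values saturate instead of wrapping
    clipped = np.clip(scaled, info.min, info.max)
    return clipped.astype(np.int16)


# Hypothetical EVI values; the last pixel is masked as nodata
evi = np.ma.masked_array(
    [19.25833333, 0.5, -7.0, 0.25],
    mask=[False, False, False, True],
)
out = to_int16(evi)
```

The fill value sits inside the int16 range, so clipping leaves it untouched, while the out-of-range EVI values saturate at the int16 minimum and maximum.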

How you can test it

I added a specific unit test for (4), "mask bands by union of value range masks from all bands"
I added tests for all masking-related requirements that I'm aware of, with each one run as a separate parameterized test case
I updated the TIF files we keep for the "expected data" so existing unit tests should work

chuckwondo

This looks great! I really like the additional tests. I have only 1 completely ignorable comment that's perhaps a bit pedantic (not pydantic). Feel free to skip it.

hls_vi/generate_indices.py

chuckwondo

Fantastic!

ceholden added 9 commits December 6, 2024 12:32

Mask 0 reflectance but retain >100%

75f6bd4

Use -19,999 for fill value instead of -9999 to keep fill value outside of [-1, 1]

56de40e

Add and test func to combine masks

322734a

Apply union of nodata masks

2d643b3

Clip to int16 min/max to avoid wraparound when downcasting from float64

e056a38

Ensure index func sets correct fill value

096158c

Rerender expected test data per changes

77f367c

lint

9251992

Fix for py3.6

2758ca6

ceholden marked this pull request as ready for review December 6, 2024 21:48

ceholden requested review from chuckwondo, sharkinsspatial and junchangju December 6, 2024 21:48

ceholden changed the title ~~fix: address issues identified in #44~~ fix: address masking issues identified in #44 Dec 6, 2024

ceholden added 5 commits December 6, 2024 18:11

Add unit test for specific masking business logic

cd12baf

less scale/unscale

828eaf9

run black :doh:

dc7ea83

py36

8950db6

no gdal autodetect in container

6df930e

chuckwondo approved these changes Dec 10, 2024

hls_vi/generate_indices.py (outdated, resolved)

Improve mask reduction (h/t Chuck 🎉)

ad12a50

chuckwondo approved these changes Dec 16, 2024

ceholden merged commit 2856231 into main Dec 17, 2024
1 check passed

ceholden deleted the ceh/issue44-vi-fix branch December 17, 2024 18:20
