Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make caching warn if some of the args are unhashable #2585

Merged
merged 7 commits into from
Nov 15, 2023

Conversation

mraspaud
Copy link
Member

@mraspaud mraspaud commented Oct 4, 2023

This PR make the caching decorator fail if one of the arguments in unhashable where it was silently ignoring it previously and not caching.

  • Tests added

@mraspaud mraspaud added the bug label Oct 4, 2023
@mraspaud mraspaud self-assigned this Oct 4, 2023
@mraspaud mraspaud requested a review from djhoese as a code owner October 4, 2023 07:17
Copy link
Member

@djhoese djhoese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me. I'm not sure why I did it this way originally. Maybe I had hoped that we could still cache based on all the other inputs...but that doesn't really make sense.

Feel free to merge once tests pass and we'll see what breaks.

@djhoese
Copy link
Member

djhoese commented Oct 4, 2023

Ah the tests have revealed it. We still want to be able to run the function even if it can't be cached. Maybe a warning? But the warning would have to go after the check to see if caching is requested (enabled) and after we know that there is an uncacheable type (ex. SwathDefinition).

@codecov
Copy link

codecov bot commented Oct 10, 2023

Codecov Report

Merging #2585 (f189402) into main (95eb01a) will increase coverage by 0.27%.
Report is 167 commits behind head on main.
The diff coverage is 97.72%.

@@            Coverage Diff             @@
##             main    #2585      +/-   ##
==========================================
+ Coverage   94.91%   95.18%   +0.27%     
==========================================
  Files         351      354       +3     
  Lines       51215    51270      +55     
==========================================
+ Hits        48611    48802     +191     
+ Misses       2604     2468     -136     
Flag Coverage Δ
behaviourtests 4.24% <6.81%> (-0.06%) ⬇️
unittests 95.81% <97.72%> (+0.29%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
satpy/tests/modifier_tests/test_angles.py 99.50% <100.00%> (-0.50%) ⬇️
satpy/modifiers/angles.py 99.21% <96.55%> (+0.04%) ⬆️

... and 28 files with indirect coverage changes

@@ -250,6 +248,7 @@ def _hash_args(*args, unhashable_types=DEFAULT_UNCACHE_TYPES):
hashable_args = []
for arg in args:
if isinstance(arg, unhashable_types):
warnings.warn(f"Unhashable type in function signature ({type(arg)}), cannot be cached.", stacklevel=2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure about this. For one the stacklevel should be at least 3 in my opinion as 2 only gets us to __call__ of this helper class.

As for warning versus something else, I think it is confusing. It is perfectly valid to call these functions with an uncacheable/unhashable type like a SwathDefinition or a StackedAreaDefinition. There shouldn't be a warning for these cases. In my previous comment I pointed out that this warning needs to go after the should_cache check which checks if the user has caching enabled. Otherwise users could be getting this when they know they aren't going to cache things (uncacheable types) and they haven't even requested that it be attempted (the config key is disabled/off).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Let me fix that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now fixed. Thanks for pointing this out.

self._cache_results(res, zarr_format)
self._warn_if_irregular_input_chunks(args, args_to_use)
self._cache_results(res, zarr_format)

# if we did any caching, let's load from the zarr files
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djhoese while refactoring this function a bit, I got confused by this: Why do we load from the zarr files just after we cached it to disk?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. So that we're caching immediately and using those cached values.
  2. So that future calls to this function that result in the same cached results will use the same cached values and when .compute() is finally called there is no redundant computation. By that I mean if we generate these angles once but then continue to use the dask array and then a different product generates these angles and detects the cached value and uses the cached arrays, then when we finally do dask.array.compute(composite1, composite2) we would get one set of computations based on the original dask array and one set of computations based on the cached array even though they should be the same numbers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, ok, good. I might improve the comment to explain point 2, for future us.

Copy link
Member

@djhoese djhoese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this all looks good. I walked through the logic and it seems consistent with what was there before. I guess my only request is wondering if it can be split into helper functions/methods to make it any more clear? If that seems too hard I'm fine with this being merged.

@mraspaud
Copy link
Member Author

Indeed I unfolded the logic a bit to make it clearer but at the expense of the size of the function. I'll try refactoring now that you also shed light on the details I was missing.

@mraspaud mraspaud changed the title Make caching fail if one of the args is unhashable Make caching warn if some of the args are unhashable Oct 17, 2023
@mraspaud
Copy link
Member Author

I'm ready to merge this, @djhoese any more comments?

Copy link
Member

@djhoese djhoese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small naming question that shouldn't hold up merging, but otherwise I remember agreeing with the logic before the refactor and this still looks understandable so 👍

satpy/modifiers/angles.py Outdated Show resolved Hide resolved
@djhoese djhoese merged commit ce41239 into pytroll:main Nov 15, 2023
16 of 17 checks passed
@djhoese
Copy link
Member

djhoese commented Nov 15, 2023

I think this has been good for a while, we just forgot to merge it. Merging now...

@djhoese
Copy link
Member

djhoese commented Nov 15, 2023

Oh great, CI on main is failing now! I'll see if I can fix it tonight.

@mraspaud
Copy link
Member Author

Wtf?

@djhoese
Copy link
Member

djhoese commented Nov 15, 2023

 FAILED satpy/tests/modifier_tests/test_angles.py::TestAngleGeneration::test_cache_get_angles[_get_angle_test_data_rgb_nodims-9-exp_zarr_chunks5-_diff_sat_pos_datetime-False-6-True] - AssertionError: assert 17 == (9 * (1 + 1))
 +  where 17 = <MagicMock name='get_observer_look' id='1888278755600'>.call_count
 +  and   1 = int(True)

@djhoese
Copy link
Member

djhoese commented Nov 15, 2023

Might be a hiccup or race condition.

@mraspaud
Copy link
Member Author

Or could the caching behaviour have changed and change the call count

@djhoese
Copy link
Member

djhoese commented Nov 15, 2023

Restarting just to see...

@djhoese
Copy link
Member

djhoese commented Nov 15, 2023

Restarting and it seems to pass

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants