Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fallback to cache error #60

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ottusp
Copy link
Contributor

@ottusp ottusp commented Jul 27, 2024

More context on this issue.

This change solves the problem of cache errors preventing the view action to be called. Besides, it adds the option of passing a function to be executed when cache fails.

ottusp added 2 commits July 27, 2024 20:14
When cache fails, exceptions were not captured, so the whole view call would fail.

Since a cache hit attempt is an optional feature (it just saves performance), ignoring cache on cache errors is a more resilient approach.

Closes flamingo-run#59

with (
patch("django.core.cache.backends.locmem.LocMemCache.get", side_effect=Exception),
patch("test_app.views.handle_cache_error_helper") as error_handler,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch makes the handle_cache_error_helper method instead of the handle_cache_error. For some reason mocking handle_cache_error did not work, so I added this boilerplate to make tests reliable.

def _get_cached_result(self, key, request) -> dict | None:
try:
return self.cache.get(key)
except Exception as exc:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should handle specific errors here, and for this we must test extensively what can go wrong.
Exception is too broad and it might suppress some dumb coding errors such as AttributeError, which is easily fixed and not related to cache itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that handling Exception is not the best practice when the code block inside try is proprietary code, but actually self.cache.get calls a cache inside user code (as cache is defined by them). Therefore, by passing the exception to cache_error_func, we are giving the user the power to deal with whatever cache-specific error it might have.

However, as you said, I do see that user might have configured cache just wrong, and both self.cache, self.cache.get can be None (leading to AttributeError) or self.cache.get might require a different number of parameters (leading to a TypeError). In these two cases, I do see benefit in handling exceptions inside drf-toolkit code. Else, I think we should let the user handle whatever error occurs (thanks for the suggestion!).

Besides, considering we might not want to hide errors from someone who should know about it, we can make this "ignore error" behavior a default disabled? That way, someone has consciously choose to ignore errors (or handle them in a custom manner). Wdyt?

@@ -38,6 +38,25 @@ class BodyCacheKeyConstructor(CacheKeyConstructor):


class CacheResponse(decorators.CacheResponse):
def __init__(self, timeout=None, key_func=None, cache=None, cache_errors=None, cache_error_func=None):
"""
:param cache_error_func: Function to be called when an error occurs while trying to get the cache.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about the general DX of this solution here. First, I only partially agree with what was stated in the issue description:

@cache_response decorator is used to apply cache to view sets. Usually, attempts to fetch cached data is a "nice to have" to any view call in terms of performance. Therefore, any error on cache should not fail the view call. Instead, a more resilient approach is to ignore cache on errors, and proceed to usual view call.

Attempts to fetch cached data are, indeed, usually "nice to have", but emphasis on usually: for an endpoint that is heavily accessed and has a cache hit ratio well over 90%, it is not a nice to have, it is part of the infrastructure. If the cache fails and we send all requests upstream as if there was no cache, we're likely to cause a self-inflicted DDoS since the load will increase drastically in a moment's notice. If this unexpected load top up database capacity, we might have a bigger (broader) availability problem than just the errors on the cached endpoint.

In that sense, while we could have a default behavior of calling the view, perhaps this should be configurable on specific endpoints in which this is not a real option (especially considering drf_kit is not Nilo-specific). Since we take a cache_error_func, this is actually already possible (all that is needed is re-raising the exception on cache_error_func), so perhaps there is nothing needed to address the "GET" case.

As for the "SET" case, it is a bit more complicated: besides the same considerations done for the "GET" case, we need to consider that there are two possible scenarios:

  • There is no cached data for the endpoint. In this case, not setting the cache hurts performance, but doesn't hurt correctness.
  • There is already cached data for the endpoint. In this case, not setting the cache doesn't hurt performance (as future GET requests will hit the cache), but it hurts correctness (a stale value will be read)

So I don't think ignoring cache fails for updates should be the default behavior since it introduces data inconsistencies (but I'm okay with apps that use drf kit being able to opt-in for this behavior)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I was thinking over of whether cache was indeed a "nice to have", and you make it clear that it might be, but might not be as well.

Just to make clear, your suggestion is to make it disabled default , and the user has to opt-in to have their error ignored? If it is, sounds good to me. The set method can be something like:

        try:
            self.cache.set(key, response_dict, self.timeout)
        except Exception as exc:
            if self.cache_error_func:
                self.cache_error_func(exception=exc, key=key, request=request, get_cache=False)
            else:
                raise from exc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants