Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Drupal cache tags in the revalidate functionality #763

Open
bojanbogdanovic opened this issue May 20, 2024 · 11 comments
Open
Labels
enhancement New feature or request triage A new issue that needs triage

Comments

@bojanbogdanovic
Copy link
Contributor

Package

Not package-specific

Describe the feature request

Currently next-drupal only supports revalidating by path for the node that is being added/updated/deleted. Yes, you can specify additional paths per bundle, but that is only applicable if you are dealing with predefined paths. I'm currently working on a project that is multilingual but most importantly it is working with dynamic routes with catch-all segments. This means any node can have a different path (different from the URL alias pattern). This becomes a problem when referenced entities are being used in the node, e.g: Media, Taxonomy term, other Nodes. Dependencies are not being revalidated, e.g: updating a taxonomy term name that is being referenced on a node, won't trigger the revalidate on the associated node and the change won't be visible on the frontend.

Describe the solution you'd like

If we look at Drupal how it deals with dependencies on data, the cache tags from the Cache API is a great example. For the project that I'm working on, I have introduced an event subscriber that tracks and registers the cache tags on API calls coming from the frontend. With this info it's possible to resolve all associated nodes that need to be revalidated. For example: taxonomy term name has been updated, by querying into the tracked cache tags, I'm able to pinpoint all the nodes that are referencing this taxonomy term and revalidate them by path.

Describe alternatives you've considered

I have consider to register the cache tags on the frontend side, in a static JSON file or in Redis cache. But because in the project we are dealing with allot of data (entities), a update from a single entity could potentially trigger many pages to be revalidated, which could introduces performance issues. That's why a queue would be advisable and luckily Drupal has one out of the box. Also the registered cache tags need be queried by context, e.g: langcode and next site, hence a table in the database would be more appropriate.

Additional context

This implementation is already up and running on the project that I'm working on, the question is would this be interesting for the community if I would contribute this to next-drupal?

Cache tags next-drupal
@bojanbogdanovic bojanbogdanovic added enhancement New feature or request triage A new issue that needs triage labels May 20, 2024
@blyme
Copy link
Contributor

blyme commented Sep 5, 2024

Hi @bojanbogdanovic This sounds very promising and something I'm very much interested in being added to the module! Are your project using the ^2.0.0-beta0 or ^1.6 version?

@xendk
Copy link

xendk commented Sep 5, 2024

@bojanbogdanovic
Interesting...

I have introduced an event subscriber that tracks and registers the cache tags on API calls coming from the frontend.

Could you elaborate on this? As far as I know, most of the caching metadata is produced as part of the rendering process, but as I understand it, Next requests the content data rather than rendering. So how does the event listener catch the tags?

@bojanbogdanovic
Copy link
Contributor Author

Well we are currently on 1.6, but we are migrating soon to 2.0, there aren’t really BC’s regarding this cache tag invalidation logic.

We are using JSON:API to build the static pages, with this setup it’s important to use JSON:API includes, because in the mentioned eventsubscriber (KernelEvents::RESPONSE, onResponse), the cache tags of all referenced entities are only available if they are included (eventsubscribers from the JSON:API module makes this possible).

Another approach that I’m looking into is to set cache tags on the request, and leveraging the revalidateTag function from Next.js 14. This approach has its advantage that it can incrementally invalidate specific requests/server components.

@blyme
Copy link
Contributor

blyme commented Sep 6, 2024

@bojanbogdanovic would be open to publish your code so we can take a look - maybe a PR?

@bojanbogdanovic
Copy link
Contributor Author

I have pushed to code, so please take a look ;)

Some small notes:

  • You need to add "X-NextJS-Site" header in the DrupalClient (on the frontend), so that the backend is context aware of the next site. The value needs to match the ID of the NextSite entity.
  • Cache tags will only work for entities that are configured with the "Cache tag" revalidator plugin, which can be managed on: "/admin/config/services/next/entity-types".
  • I would advise to install the https://www.drupal.org/project/queue_ui module, which gives mores insight and control on the queue.

@backlineint
Copy link
Collaborator

First off, @bojanbogdanovic - thanks for the work on this, and great to hear that you have cache tag based invalidation functioning on a project.

Looking at this PR I've been confused as to why it is necessary to store a record of cache tags in the database at all. But clearly this is a well thought out (and working) implementation, so I must have been missing the reasoning as to why this was necessary. It wasn't until reviewing this PR again today that I realized the reason why.

This implementation is for Next.js' Pages Router, which technically only has native support for invalidating by path. Your implementation bridges that using this new database table.

I've been thinking about this from the perspective of the App Router, which actually has built in support for invalidation by cache tag. Using that implementation I don't think we'd need to track/map tags internally at all, just pass the existing tags to the API endpoint during revalidation events.

The split between the Pages and App router makes things like this difficult, but since the App Router supports this natively and is now the default for newly created Next.js projects, I personally think it would make more sense for Next-Drupal's cache-tag invalidation to be focused on the app router.

That said, the pages router version could still be useful to others. Is there any way this could be offered as a contrib module?

@backlineint
Copy link
Collaborator

If we move forward with the App router approach, some quick notes on the changes I think we'd need:

  • For a next.js site (the config entity) we'd need to add an option under on-demand revalidation to select if we should invalidate by path, or by tag.
    ** Is there a real-world use case for needing both? That would complicate this feature quite a bit.
  • modules/next/modules/next_extras/src/NextCacheInvalidator.php will need new getTagsToInvalidate and invalidateTags methods.
  • invalidateEntity will need to add conditional logic to check for the invalidation type and adjust accordingly.

@backlineint
Copy link
Collaborator

Also of note: #784 has a start on what is required on the front end for this.

@bojanbogdanovic
Copy link
Contributor Author

Hi @backlineint,

Thanks for your input!

This implementation is for Next.js' Pages Router, which technically only has native support for invalidating by path. Your implementation bridges that using this new database table.

This is true, but also it gives a queue mechanism which is handy with a SSG (Static Site Generation) setup. It solves the issue that too many pages get revalidated at the same time.

The split between the Pages and App router makes things like this difficult, but since the App Router supports this natively and is now the default for newly created Next.js projects, I personally think it would make more sense for Next-Drupal's cache-tag invalidation to be focused on the app router.

I agree, like I said in #763 (comment) I'm looking into this approach and I also believe this is the way forward. The only concern that I have is what if the same resource call is being made across multiple pages, with Static Rendering this would still mean that a single entity update could trigger revalidation on many pages.

@backlineint
Copy link
Collaborator

This is true, but also it gives a queue mechanism which is handy with a SSG (Static Site Generation) setup. It solves the issue that too many pages get revalidated at the same time.

Could certainly see the value of a queue in general beyond the scope of this issue. We might still need a supporting db table in that case (not sure tbh) but it seems like it would serve a different purpose - tracking the invalidation events (for either pages or tags) rather than the cache tags themselves.

I agree, like I said in #763 (comment) I'm looking into this approach and I also believe this is the way forward.

Missed that comment - sorry. But yeah, glad we're on the same page with this.

The only concern that I have is what if the same resource call is being made across multiple pages, with Static Rendering this would still mean that a single entity update could trigger revalidation on many pages.

The app router version of this should be substantially more efficient. Rather than having to invalidate the paths for all of the related tags resulting in many API calls, we should be able to pass all the cache tags for invalidation in a single API call. This issue outlines what the change to our API endpoint would be: #807 Still could benefit from a queue, but should require far fewer API calls compared to the approach that was required in the current PR.

@backlineint
Copy link
Collaborator

Took a shot at a new issue for the app router version of this concept: #812

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request triage A new issue that needs triage
Projects
None yet
Development

No branches or pull requests

4 participants