Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use persistent caching around _only_ the theme.json files #5267

Draft
wants to merge 10 commits into
base: trunk
Choose a base branch
from

Conversation

felixarntz
Copy link
Member

@felixarntz felixarntz commented Sep 20, 2023

While a complexity that has made it complicated to progress https://core.trac.wordpress.org/ticket/57789 has been the additional filters firing around theme.json parsing, this PR explores an approach that is not subject to that complexity: By persistently caching only the raw results of the theme.json files (i.e. none of the following parsing), it already saves a lot of processing time on every page load (for sites that use a persistent object cache).

The impact of this is already quite notable, with wp-total being 4.6% faster for TT1 (classic theme) and 3.9% faster for TT3 (block theme). See also WordPress/gutenberg#45616 which is another related idea that focused on the cost of just the JSON decoding of the files while just by itself has a notable negative performance impact. See this spreadsheet for the full Server-Timing benchmarks.

This PR is currently for testing & benchmarking purposes only. It should preferably be implemented in Gutenberg before being merged into core. If there is consensus on the approach though, I think this would be worth prioritizing still for WordPress 6.4 given the notable performance impact for sites using a persistent object cache.

Keep in mind that this only has an impact when using a persistent object cache. If we wanted to bring these benefits to all WordPress sites, we could consider using transients instead.

Trac ticket: https://core.trac.wordpress.org/ticket/57789


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

Comment on lines -241 to +293
$wp_theme = wp_get_theme();
$theme_json_file = $wp_theme->get_file_path( 'theme.json' );
if ( is_readable( $theme_json_file ) ) {
$theme_json_data = static::read_json_file( $theme_json_file );
$theme_json_data = static::translate( $theme_json_data, $wp_theme->get( 'TextDomain' ) );
} else {
$theme_json_data = array();
}
$wp_theme = wp_get_theme();

// Read main theme.json (which may also come from the parent theme).
$raw_theme_json_data = static::read_theme_json_data_for_theme( $wp_theme );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this change here actually includes a bug fix for something I just noticed while working on it:

  • WP_Theme::get_file_path() may return the parent theme's theme.json file.
  • But then below it is translated using the child theme's text domain, which will lead to translations not to work correctly given the text domain is probably different for the parent and child theme.

The way I have updated the code now this is no longer a problem.

Copy link
Member

@joemcgill joemcgill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a really good idea. In my testing for #5024, I've noticed that the most expensive operations during Theme JSON processing is the core data and the user data, whereas the data for blocks and theme have less of an impact.

While this would only target core and theme data, I think the idea is still worth pursuing. I ran some profiles with this PR and I'm not seeing the same degree of impact that you've reported. It also seems like most of the time in the get_theme_data method is spent constructing the WP_Theme_JSON_Data objects, which is unaffected by your changes.

We could cache the WP_Theme_JSON_Data object before it's passed to the wp_theme_json_data_theme filter, since that should not change unless the theme is updated, which could help, or further improve the performance of that class constructor.

See profiles:

Trunk

image

Current PR

image

@felixarntz
Copy link
Member Author

felixarntz commented Sep 22, 2023

Thanks @joemcgill for the feedback.

In my testing for #5024, I've noticed that the most expensive operations during Theme JSON processing is the core data and the user data, whereas the data for blocks and theme have less of an impact.

That's certainly possible. I only tested overall performance, but didn't profile. Could you share the data for how much the 4 different processes took? Since the profile below is only for get_theme_data()? Another question, did you test with a classic or a block theme, and with a persistent object cache running? I'd expect the changes to get_theme_data() to be impactful for block themes only.

I ran some profiles with this PR and I'm not seeing the same degree of impact that you've reported.

Of course the impact between benchmarks differs, but I could see why with a profiler that impact won't be as large: While in many cases the overhead that profiling tools add to execution time don't matter (since relative speed of different functions should remain somewhat similar), most of the performance cost that this PR improves is from JSON decoding. That itself probably doesn't get more expensive with or without profiling because it's a single native PHP function - while generally WordPress is slower with a profiler running. So the cost of JSON decoding seems lower than it actually is. Certainly this is just an assumption, but I'd be curious to see additional benchmarks to assess the impact. I know we've had this discussion in the past, but I think the PR here may be one of the examples why I am hesitant to use profilers to quantify impact of a change.

It also seems like most of the time in the get_theme_data method is spent constructing the WP_Theme_JSON_Data objects, which is unaffected by your changes.

I'm not sure about that. When you say "unaffected", do you mean that in terms of little performance impact? Because the changes here should certainly affect get_theme_data() as it reads a theme.json file.

We could cache the WP_Theme_JSON_Data object before it's passed to the wp_theme_json_data_theme filter, since that should not change unless the theme is updated, which could help, or further improve the performance of that class constructor.

That could work. Have you verified that there are no extension points (e.g. filters) firing within the WP_Theme_JSON_Data constructor logic (which also instantiates a WP_Theme_JSON)? I only wanted to touch the reading JSON file parts because in that one I could be sure there's no extension points.

@joemcgill
Copy link
Member

Could you share the data for how much the 4 different processes took?

Of course, sorry, meant to link to the previous profile that I posted to the ticket, which shows the profile for WP_Theme_JSON_Resolver::get_merged_data(), which shows all of the individual getters as child processes. See: https://core.trac.wordpress.org/ticket/57789#comment:23

WordPress is slower with a profiler running. So the cost of JSON decoding seems lower than it actually is.

I understand your assumption here, but I'm not convinced that it's accurate, given the way XHProf is designed. What's more likely is that there are differences in the time it's taking your local system to do the file reads when compared with mine. As we've seen in other places while profiling, file I/O operations can show greater variability in performance results. It could be that I'm just not reproducing the conditions where those operations are taking a long time because of opcode cache or some other side effect.

When you say "unaffected", do you mean that in terms of little performance impact?

Yes, "unaffected" was a poor word choice. What I meant was that I was seeing little difference in the performance impact of this approach in an A/B test against trunk.

I'll run another round of profiles to see if I can verify the same results.

@felixarntz
Copy link
Member Author

felixarntz commented Sep 25, 2023

@joemcgill Right, I certainly don't want our work here to be based on assumptions. Though, can you please clarify how exactly you profiled this (which theme? object cache enabled? etc)? As in what environment did you use to get the profiles from #5267 (review) and https://core.trac.wordpress.org/ticket/57789#comment:23?

Copy link
Member

@oandregal oandregal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading JSON files already results in cached data, just in static variables within this class.

If we wanted to make them persistent across requests, I presume we should remove the existing caching and use the wp_cache_* instead? As it is, it's adding a new level of caching over preexisting cached data.

Another thought I have is: what would be the invalidation scenarios for these caches? For the other theme.json related caches we invalidated upon theme switching.

@felixarntz
Copy link
Member Author

@oandregal

Reading JSON files already results in cached data, just in static variables within this class.

If we wanted to make them persistent across requests, I presume we should remove the existing caching and use the wp_cache_* instead? As it is, it's adding a new level of caching over preexisting cached data.

Those caches are not the same: The caches that use static variables today cache data that runs WordPress filters and therefore should not be cached persistently, as otherwise the filters won't run.

What I am proposing here is not concerning in that regard as reading JSON files doesn't fire any filters. That's why the caches proposed here can be persistent, while the existing ones can't. If we think that the existing static variable caches don't add much performance benefit on top of what's added here, we could consider removing them, but I don't know whether that's the case, and if so it should be explored in a separate issue.

TL;DR the two means of cache here are not interchangeable.

Another thought I have is: what would be the invalidation scenarios for these caches? For the other theme.json related caches we invalidated upon theme switching.

Invalidation as in the example you're mentioning isn't needed here, since the caches proposed operate per theme. In other words, it doesn't matter which the current theme is. Updating the theme will naturally rely on a different cache due to a version change. One thing to probably add here is an expiration, to make sure we don't keep stale theme version caches around indefinitely.

@felixarntz
Copy link
Member Author

@joemcgill Can you please follow up on my question in #5267 (comment) when you get a chance? I would like to understand better what environment and configuration you used for the profiles.

@oandregal
Copy link
Member

Those caches are not the same: The caches that use static variables today cache data that runs WordPress filters and therefore should not be cached persistently, as otherwise the filters won't run.

🤔 Let me share what I'm seeing:

  • We already have caches such as static::$theme_json_file_cache and static::$i18n_schema for the contents of the JSON files (core, theme, and translation schema). These caches are filled before any filter is called. They are implemented as static variables, so they cannot persist across request.
  • This PR introduces core_${wp_version}, theme_${stylesheet}_${version}, i18n_schema_${version} holding exactly the same data as the existing ones. Though, they are implemented via wp_cache_* so they can be persisted.

My question is: why don't we remove static::$theme_json_file_cache and static:$i18n_schema if we are introducing caches that can be persisted via wp_cache_*? Does this help?

Invalidation as in the example you're mentioning isn't needed here, since the caches proposed operate per theme.

This is what I see:

  • The static:$i18n_schema is indeed cleared via clean_cached_data that is executed upon theme switches (filters switch_theme and start_previewing_theme). If we introduce the same cache via i18n_schema_${version}, why shouldn't it be cleared when its not-persisted version was?

  • The static::$theme_json_file_cache is not cleared at any point. However, the corresponding core_${wp_version} and theme_${stylesheet}_${version} caches introduced by this PR are going to be persistent, hence they's more risk of becoming stale. I guess the question we should respond is: is there any case in which the version doesn't change but the underlying theme.json does?

@felixarntz
Copy link
Member Author

@oandregal Thanks for clarifying, now I get what you mean. Previously I was thinking about the static variables caching WP_Theme_JSON instances (e.g. the $core, $blocks etc variables). But you're right, I agree that those two static variables you mention wouldn't have any additional value if we introduce this persistent cache mechanism, so they could indeed be removed. 👍

@felixarntz
Copy link
Member Author

@oandregal In a7d99d7, I've removed the properties you've mentioned as they are now redundant due to the WP cache API being used.

@felixarntz
Copy link
Member Author

I've implemented an equivalent Gutenberg pull request WordPress/gutenberg#56095, ready for review, including recreatable benchmarks using compare-wp-performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants