Test AMP compatibility of entire site #1183
Define get_post_permalinks(). This gets permalinks for public posts, other than attachments. Another helper function can call AMP_Validation_Manager::validate_url() for each of these permalinks.
validate_queued_posts_on_frontend() uses IDs, which are stored in $posts_pending_frontend_validation. So simply return IDs from the helper function.
Before, $post_ids wasn't assigned as an array(). This addresses a failed Travis build.
Querying for ID was simply to make testing easier. So remove that, and fall back to the default 'orderby' => 'date'.
foreach ( $site_post_ids as $id ) {
	AMP_Validation_Manager::$posts_pending_frontend_validation[ $id ] = true;
}
return AMP_Validation_Manager::validate_queued_posts_on_frontend();
As you suggested, this validate_queued_posts_on_frontend method is probably not relevant. The main thing will be to:
- Obtain the list of URLs on the site, including homepage, posts, pages, categories, archives, etc. A good sitemap XML generator plugin should have the logic you need for this.
- If in paired mode, append ?amp to each of the URLs discovered. This will be used instead of amp_get_permalink() because the endpoint will be eliminated when AMP theme support is present. See "Discontinue use of amp endpoint in favor of query var when amp theme support is present" (#1148). Some consideration in paired mode may be needed for whether AMP is available or not for a given post.
- For each AMP URL, call \AMP_Validation_Manager::validate_url() and pass the results into \AMP_Invalid_URL_Post_Type::store_validation_errors().
- Chunk the results to prevent timing out. This is particularly key in WP-Cron and Ajax. In WP-CLI this isn't relevant, and a progress meter should be shown.
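The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: sketch_validate_in_chunks() is a hypothetical name, and the $validate and $store callables stand in for validate_url() and store_validation_errors(), whose exact signatures are not shown in this thread. The query-string handling is deliberately simple and ignores URL fragments.

```php
<?php
/**
 * Hypothetical helper: validates a list of URLs in fixed-size chunks.
 *
 * $validate and $store are stand-ins (assumptions) for
 * AMP_Validation_Manager::validate_url() and
 * AMP_Invalid_URL_Post_Type::store_validation_errors().
 */
function sketch_validate_in_chunks( array $urls, $is_paired, $validate, $store, $chunk_size = 10 ) {
	$processed = 0;

	foreach ( array_chunk( $urls, $chunk_size ) as $chunk ) {
		foreach ( $chunk as $url ) {
			if ( $is_paired ) {
				// Append the amp query var. Inside WordPress, add_query_arg() would be the usual tool.
				$url .= ( false === strpos( $url, '?' ) ? '?' : '&' ) . 'amp';
			}

			// Validate the AMP URL and hand the result to the storage callback.
			call_user_func( $store, $url, call_user_func( $validate, $url ) );
			$processed++;
		}
		// A WP-Cron or Ajax caller could stop here and schedule the next chunk.
	}

	return $processed;
}
```

In WP-CLI the chunk boundary matters less, since there is no request timeout to stay under.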
Thanks for your fast reply, @westonruter! These details really help.
This class is not going to use this method anymore: validate_queued_posts_on_frontend(). So this will need the permalinks, not IDs.
For example, http://example.org/?cat=2. This includes categories and tags, and any other taxonomies that are registered. But it excludes post_format links.
 * @param int $number_links The maximum amount of links to get (optional).
 * @return string[] $links The term links in an array.
 */
public static function get_taxonomy_links( $number_links = 200 ) {
There should be pagination support here. The WP_Term_Query supports an $offset arg. I suggest that this method only return links for one specific taxonomy. The site crawler will probably need to make multiple calls to such a method to iterate over all of the taxonomy terms in batches. One way to do it would be to get the terms sorted by ID in ascending order. That ensures that when you paginate through the terms you won't miss terms that get added while you're processing through them.
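To illustrate why ascending ID order helps, here is a small simulation that runs without WordPress. It is a sketch, not the plugin's code: sketch_get_term_ids() is a hypothetical stand-in for a term query using 'orderby' => 'id', 'order' => 'ASC', 'number', and 'offset', and the term IDs are made up.

```php
<?php
/**
 * Hypothetical stand-in for a paginated term query sorted by ID ascending.
 * It works on a plain array of IDs so the sketch runs without WordPress.
 */
function sketch_get_term_ids( array $all_ids, $number, $offset ) {
	sort( $all_ids ); // Ascending ID order, so newly created terms always land at the end.
	return array_slice( $all_ids, $offset, $number );
}

$term_ids = array( 3, 1, 2, 5, 4 ); // Made-up existing term IDs.
$seen     = array();
$number   = 2;
$offset   = 0;

do {
	$batch   = sketch_get_term_ids( $term_ids, $number, $offset );
	$seen    = array_merge( $seen, $batch );
	$offset += count( $batch );

	// Simulate a term being created while the crawl is in progress.
	if ( 2 === $offset && ! in_array( 6, $term_ids, true ) ) {
		$term_ids[] = 6;
	}
} while ( ! empty( $batch ) );

// $seen now holds every ID exactly once, including the late addition.
```

Note that this only protects against additions. If a term is deleted mid-crawl, later offsets shift and a term could still be skipped.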
Thanks, good ideas. These commits apply your suggestions, as I understand them. This now has $taxonomy and $offset parameters.
	'public' => true,
) );

// It doesn't seem necessary to get links for post format terms, like asides, galleries, or quotes.
Actually, I think it is necessary. There are archive links for post formats: https://codex.wordpress.org/Function_Reference/get_post_format_link
Thanks, commit db54a removes the exemption of post_format terms.
I think the “Check Site” button would make sense on the “Invalid AMP Pages” screen. Clicking it should show a spinner and some progress indicator probably. And as noted on #1184 (comment):
Having a “Check Site Compatibility” button would make a lot of sense on such a Theme Support admin screen. As part of this, we should allow for the |
Before, I had unset() these from the array. But as Weston pointed out, post_format terms can have archive links.
Add a $taxonomy parameter, to only get the terms for a given taxonomy.
As Weston mentioned, this helps to get the links in batches. Also, simplify the test.
Props 10up Engineering Best Practices. This prevents the need to use wp_list_pluck() to get the IDs.
A similar change to the one Weston proposed for get_taxonomy_links(). Get permalinks by post types, and allow paging through them. This will allow iteration, in a way similar to that in get_taxonomy_links().
Iterate over all of the terms and posts from the existing helper methods. @todo: consider self::BATCH_SIZE, and whether the validation requests should be throttled, or somehow prevented from timing out WP.
In PHP 5.3, there was: Fatal error: Cannot access self:: when no class scope is active. https://travis-ci.org/Automattic/amp-wp/jobs/386045157. This looks to be because self isn't available inside the function in 5.3. Another option might be to add $self to use(). But this seems simpler, if it works.
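For context, here is a minimal sketch of one way to avoid self:: inside a closure. The commit message does not say which fix was actually applied, so this is only one possible workaround; Sketch_Site_Validation and batch_ids() are hypothetical names.

```php
<?php
class Sketch_Site_Validation {

	const BATCH_SIZE = 2;

	/**
	 * Splits IDs into batches without touching self:: inside the closure.
	 */
	public static function batch_ids( array $ids ) {
		// On PHP 5.3, closures have no class scope, so self::BATCH_SIZE inside
		// the closure below would raise the fatal error quoted above.
		// Copying the value to a local and importing it via use() avoids that.
		$batch_size = self::BATCH_SIZE;

		$chunker = function ( array $items ) use ( $batch_size ) {
			return array_chunk( $items, $batch_size );
		};

		return $chunker( $ids );
	}
}
```

Referring to the class by its full name inside the closure would be another option for constants and public static members.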
This will enable crawling the site in Native AMP. This seems simpler than registering Paired Mode support, and adding template_dir. But it's still an open discussion as to whether it should only be Native AMP. If this already finds theme support, it doesn't change it.
In the validate_url() helper function, find if it's in Paired Mode. If so, set the $url to the amp endpoint.
This mainly calls the existing helper function. But it also passes it $wp_cli_progress. This advances the progress bar, after each post type or taxonomy validation completes.
Success: 196 URLs were crawled, and 196 have AMP validation issue(s). Query for the validation error posts. And output the number found.
This drives to the page: Invalid AMP Pages (URLs) Also, output a message at the beginning: Crawling the entire site to test for AMP validity. This might take a while...
That UI for WP-CLI looks perfect 👌
bin/validate-site.php
} else {
	echo "Please run this script with WP-CLI via: wp eval-file bin/validate-site.php\n";
	exit( 1 );
}
Instead of adding this as a bin script, it should get registered as an actual command with WP-CLI. For example, wp amp validate-site. The command needs to be included with the plugin when it is released, and the bin directory is just for development.
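A rough sketch of what registering such a command might look like. The class name Sketch_AMP_CLI and the method body are placeholders, not the plugin's actual implementation; WP_CLI::add_command() and the @subcommand annotation are the standard WP-CLI mechanisms. The guard keeps the file harmless when loaded outside WP-CLI.

```php
<?php
if ( defined( 'WP_CLI' ) && WP_CLI ) {

	/**
	 * Placeholder command class for illustration only.
	 */
	class Sketch_AMP_CLI {

		/**
		 * Crawls the site and validates each URL.
		 *
		 * @subcommand validate-site
		 */
		public function validate_site( $args, $assoc_args ) {
			WP_CLI::log( 'Crawling the entire site to test for AMP validity. This might take a while...' );
			// The crawl itself is omitted in this sketch.
		}
	}

	// Makes the method above available as: wp amp validate-site
	WP_CLI::add_command( 'amp', 'Sketch_AMP_CLI' );
}
```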
}

if ( $wp_cli_progress ) {
	$wp_cli_progress->tick();
This could get de-coupled from WP-CLI by passing a $progress_callback function instead which is called, and then you could call the method like so:
\AMP_Site_Validation::validate_entire_site_urls( function() use ( $wp_cli_progress ) {
	$wp_cli_progress->tick();
} );
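On the receiving side, the method might look something like this. It is a sketch under assumptions: Sketch_Validator and its validate_urls() signature are hypothetical, and the validation step itself is left out.

```php
<?php
class Sketch_Validator {

	/**
	 * Walks the URLs and reports progress through an optional callback,
	 * so the method has no direct knowledge of WP-CLI.
	 */
	public static function validate_urls( array $urls, $progress_callback = null ) {
		$validated = 0;

		foreach ( $urls as $url ) {
			// ... validate $url here (omitted in this sketch) ...
			$validated++;

			// Report progress once per URL, if the caller asked for it.
			if ( is_callable( $progress_callback ) ) {
				call_user_func( $progress_callback, $url );
			}
		}

		return $validated;
	}
}
```

The same method could then be reused from an Ajax handler or WP-Cron by passing a different callback, or none at all.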
bin/validate-site.php
 */
function crawl_site() {
	\WP_CLI::log( 'Crawling the entire site to test for AMP validity. This might take a while...' );
	$count_post_types_and_taxonomies = count( get_post_types( array( 'public' => true ), 'names' ) ) + count( get_taxonomies( array( 'public' => true ) ) );
This should count the number of posts and terms, not just the number of types.
$offset = 0;

while ( ! empty( $permalinks ) ) {
	self::validate_urls( $permalinks );
Per above, the tick should be done with each URL that is processed, not after each type is processed.
Great idea. Commit 3d9a01 calls tick() for every URL validated.
Hi @westonruter, If it's alright, I'm going to leave this work for the next phase.
3 assertions failed, so fix these both in the tested class and the PHPUnit class.
On Weston's suggestion. Before, running the validation required: wp eval-file bin/validate-site.php.
This has a much better display of the activity. On Weston's suggestion. Create a new method to count all of the URLs to be validated. This should only be used for WP-CLI, as it uses 'posts_per_page' => -1
@kienstra Also, this is important for AC2 to be satisfied:
If I'm in Classic mode, currently the result is:
So that needs to be improved so that site validity can be checked for AMP theme support prior to switching it on. To achieve that, I think the following needs to be done:
Hi @westonruter, |
On Weston's suggestion, as this would not be able to validate any URL without the --force flag.
As Weston suggested, if in Classic mode and --force is passed, force theme support in the crawled URLs. This is now done in read_theme_support(), though there might be a better way.
Hi @westonruter, the 2 commits above apply your suggestions for forcing theme support when in Classic mode. I hope they capture what you had in mind. Here's the error that appears when in 'Classic' mode and |
	self::$force_crawl_urls = true;
}

if ( ! current_theme_supports( 'amp' ) ) {
This conditional will capture the Template Mode being Classic, though maybe you had in mind a more narrow check, like:
! current_theme_supports( 'amp' ) && 'disabled' === AMP_Options_Manager::get_option( 'theme_support' )
* Only show admin URL to review results if theme support initially present.
* Skip showing table if no results obtained.
… and taxonomy
* Hide the admin menu links for invalid URLs and validation errors by default in classic mode.
* Invoking `wp amp validate-site` will cause the post type and taxonomy to be populated.
* When in classic mode, show links to invalid pages and validation errors with template mode as opposed to in admin menu.
* When viewing the admin screens for the post type and taxonomy, show the admin menu links for both.
The last piece here seemed to me to be providing a way for the user to access the validation results for their |
if ( current_theme_supports( 'amp' ) ) {
	return true;
}
if ( 'edit.php' === $pagenow && ( isset( $_GET['post_type'] ) && self::POST_TYPE_SLUG === $_GET['post_type'] ) ) { // WPCS: CSRF OK.
Why not return the result of the conditional instead of explicitly returning true or false?
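For illustration, the suggested shape could look like this. It is a sketch only: the function name and parameters are hypothetical, the inputs are passed in rather than read from globals so it runs standalone, and it assumes the elided tail of the original method returned true inside the second if and false otherwise.

```php
<?php
/**
 * Hypothetical standalone version of the menu-visibility check.
 *
 * $query stands in for $_GET and $post_type_slug for self::POST_TYPE_SLUG.
 */
function sketch_should_show_menu( $theme_supports_amp, $pagenow, array $query, $post_type_slug ) {
	// Return the boolean expression directly instead of branching to true/false.
	return (
		$theme_supports_amp
		||
		(
			'edit.php' === $pagenow
			&&
			isset( $query['post_type'] )
			&&
			$post_type_slug === $query['post_type']
		)
	);
}
```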
Sure. Done in a83a569.
if ( current_theme_supports( 'amp' ) ) {
	return true;
}
if ( 'edit-tags.php' === $pagenow && ( isset( $_GET['taxonomy'] ) && self::TAXONOMY_SLUG === $_GET['taxonomy'] ) ) { // WPCS: CSRF OK.
Same here: return the result of the conditional expression instead of explicit true or false. It's less code and clearer.
includes/class-amp-cli.php
WP_CLI::success(
	sprintf(
		/* translators: $1%d is the number of URls crawled, $2%d is the number of validation issues, $3%d is the number of unaccepted issues, $4%s is the list of validation by type, $5%s is the link for more details */
		__( '%3$d crawled URLs have unaccepted issue(s) out of %2$d total which AMP validation issue(s); %1$d URLs were crawled.', 'amp' ),
Maybe
total which AMP validation issue(s)
should be:
total with AMP validation issue(s)
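As a quick check of how the numbered placeholders line up with the corrected wording, here is a standalone sketch. The counts are made-up illustrative values, and the __() translation wrapper is left out so it runs outside WordPress.

```php
<?php
// Arguments are passed in the order 1, 2, 3, but the string uses them as 3, 2, 1.
$message = sprintf(
	'%3$d crawled URLs have unaccepted issue(s) out of %2$d total with AMP validation issue(s); %1$d URLs were crawled.',
	196, // %1$d: URLs crawled (illustrative value).
	196, // %2$d: URLs with validation issues (illustrative value).
	10   // %3$d: URLs with unaccepted issues (illustrative value).
);

echo $message . "\n";
```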
Commands Work As Expected Hi @westonruter,
Also, great idea to add links for the invalid URLs and errors in Classic mode: |
Hi @westonruter - really like the progress here and thanks for tagging this in #1359. Also wanted to call out #1273 here - the idea I spelled out there could try to surface errors next to where toggles exist (so that a user is more likely to select modes when they might be afraid to do this). I'm not entirely sure this is necessary or how straightforward implementing it might be, but it's captured in the backlog regardless. Both that feature (and #1273) are trying to take an approach to accomplish the inverse goals of #1366: Instead of surfacing settings info in the error listings, surface the error listings in the settings. The key decisions I still think are up for discussion (past cc: @jwold |
Pull Request For #1007
This crawls the site and tests for AMP validity by template/content type.
Fixes #1007.