-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Navigation Block: Properly decode URL-encoded links #46435
Navigation Block: Properly decode URL-encoded links #46435
Conversation
👋 Thanks for your first Pull Request and for helping build the future of Gutenberg and WordPress, @kozer! In case you missed it, we'd love to have you join us in our Slack community, where we hold regularly weekly meetings open to anyone to coordinate with each other. If you want to learn more about WordPress development in general, check out the Core Handbook full of helpful information. |
@danielbachhuber when I run those tests using: npm run test:php the test fails. |
@kozer For me, the tests load the version of One thing that might not be obvious: the JavaScript and PHP in this repository is regularly synced into WordPress core. It looks like there was a function rename in #40657 @adamziel @gziolo Any suggestions on how we should proceed? Should we submit this change against WordPress core instead? |
@danielbachhuber I replaced wp.org file with the change I made here inside the WordPress container, and re-run the tests. They passed. |
@kozer While we're waiting, can you update the pull request description with details on the fix, how to test, etc? |
Asking about how to proceed in Slack https://wordpress.slack.com/archives/C02QB2JS7/p1671470350718839 |
@danielbachhuber it seems that e2e tests are not passing, however, I don't see how my change is relevant. Should I update the test to make it pass? |
#46662 looks like a reasonable way forward. In hindsight I don't think removing the prefix wasn't a good decision. Only the e2e tests are failing now. They tend to be flakey so I just restarted them 🤞 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly, the behavior works as expected on trunk
and breaks for me on this PR.
I opened the demonstration link from the original issue, and then copied the link from my browser:
https://api.payaconnect.com/hostedpaymentpage?id=11eacdb9c9fc0d2c989d2057&data=U2FsdGVkX18xMWVhY2RiOa8xisZ0ulzB%2FzlaRnFN0dWAM1rLyZL%2BQpUc9oo2fp616ICqcykrs%2BjPzDfEPLd%2FZA%3D%3D
It ends up stored in the database as this:
<!-- wp:navigation-link {"label":"Sample Page","id":2,"url":"https://wordpress-develop.test/?page_id=2","kind":"post-type","isTopLevelLink":true} /-->
<!-- wp:navigation-link {"label":"Testing5","url":"https://api.payaconnect.com/hostedpaymentpage?id=11eacdb9c9fc0d2c989d2057\u0026data=U2FsdGVkX18xMWVhY2RiOa8xisZ0ulzB%2FzlaRnFN0dWAM1rLyZL%2BQpUc9oo2fp616ICqcykrs%2BjPzDfEPLd%2FZA%3D%3D","kind":"custom","isTopLevelLink":true} /-->
On this branch, the rendered HTML ends up as this:
<a class="wp-block-navigation-item__content" href="https://api.payaconnect.com/hostedpaymentpage?id=11eacdb9c9fc0d2c989d2057&data=U2FsdGVkX18xMWVhY2RiOa8xisZ0ulzB/zlaRnFN0dWAM1rLyZL+QpUc9oo2fp616ICqcykrs+jPzDfEPLd/ZA=="><span class="wp-block-navigation-item__label">Testing5</span></a>
Clicking on the link produces this error:
On trunk, the rendered HTML ends up as this:
<a class="wp-block-navigation-item__content" href="https://api.payaconnect.com/hostedpaymentpage?id=11eacdb9c9fc0d2c989d2057&data=U2FsdGVkX18xMWVhY2RiOa8xisZ0ulzB%2FzlaRnFN0dWAM1rLyZL%2BQpUc9oo2fp616ICqcykrs%2BjPzDfEPLd%2FZA%3D%3D"><span class="wp-block-navigation-item__label">Testing5</span></a>
Clicking the link opens the page as expected.
I was curious why this data is escaped in the first place, given JSON serialization should be sufficient. It looks like it was added in #19679
Can you figure out these discrepancies? It'd be great to have more test cases too, covering the various different URLs we need to support.
@danielbachhuber , this should not happen now. It's fixed. The problem was that we were testing the same thing, differently on the same component. More specifically, Given that I have to propose the following:
|
@kozer Can you make sure there are test cases that cover the before and after of your changes, as well as the variety of links we want to support? |
return rawurldecode($url); | ||
} | ||
return $url; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we consolidate these two functions into one function called block_core_navigation_link_maybe_urldecode()
?
For better or for worse, maybe_*
functions are a common pattern in WordPress core: https://developer.wordpress.org/?s=maybe
It seems like there are some PHPCS issues that need to be fixed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer having them as two separate functions, as it's clear what each one of them is doing, and we can reuse is_url_encoded
for testing encoded URLs elsewhere if we need to instead of tying those two functions together and producing a "side effect" if the URL is encoded.
However, I'll rename the urldecode_once
to maybe_urldecode
to match the common pattern in WordPress core.
Thanks for pointing this out!
It seems like there are some PHPCS issues that need to be fixed too.
Can you elaborate a bit more on this? I don't see any PHPCS issues locally for those two functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kozer Given this will eventually be included in WordPress core, I'd like them consolidated to one function to match the pattern already established in WordPress core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danielbachhuber Doesn't the maybe_urldecode
renaming establish that pattern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kozer If it's only one block_core_navigation_link_maybe_urldecode()
function, then yes 😊
In the context of a large open source project, we shouldn't introduce two functions if we only need one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danielbachhuber I made the change as requested!
A side note:
I'm not sure if the phpcs errors you mentioned are related to tabs vs spaces. Neither vscode or neovim show any warnings/errors to me.
Are tabs the way to go in PHP-related files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if the phpcs errors you mentioned are related to tabs vs spaces. Neither vscode or neovim show any warnings/errors to me.
Are tabs the way to go in PHP-related files?
@kozer Here's the failing test: https://github.com/WordPress/gutenberg/actions/runs/3830602130/jobs/6518686126
The failing test runs npm run lint:php
:
You can run npm run lint:php
in your local environment to see all of the same errors.
I use this VS Code extension to integrate PHPCS:
Name: PHP Sniffer & Beautifier
Id: valeryanm.vscode-phpsab
Description: PHP Sniffer & Beautifier for Visual Studio Code
Version: 0.0.15
Publisher: Samuel Hilson
VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=ValeryanM.vscode-phpsab
You can see all of WordPress' PHP coding standards here: https://developer.wordpress.org/coding-standards/wordpress-coding-standards/php/
@@ -89,7 +89,7 @@ export const updateAttributes = ( | |||
|
|||
setAttributes( { | |||
// Passed `url` may already be encoded. To prevent double encoding, decodeURI is executed to revert to the original string. | |||
...( newUrl && { url: encodeURI( safeDecodeURI( newUrl ) ) } ), | |||
...{ url: newUrl ? encodeURI( safeDecodeURI( newUrl ) ) : newUrl }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not super clear why it was necessary to encodeURI()
here in the first place. It seems like https://example.com?s=<>
is JSON-encoded just fine:
>> var test = {};
undefined
>> test.url = 'https://example.com?s=<>';
"https://example.com/?s=<>"
>> JSON.stringify(test);
"{\"url\":\"https://example.com/?s=<>\"}"
Maybe because the <>
gets stripped out by kses? Can you debug and document why #19679 was necessary in the first place? If it's not necessary, maybe we can remove it entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danielbachhuber you were right. The reason that this was stripped out is because of esc_url
function that uses kses
under the hood.
So I think this is the way to go here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason that this was stripped out is because of
esc_url
function that useskses
under the hood.
@kozer For posterity, can you provide a full step-by-step documentation of what's going on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure @danielbachhuber! So, the process is the following:
- User enters a new URL via the editor.
- Before
encodeURI
introduced the URL was stored as is. - After saving, when someone navigates to the page with the inserted URL, the URL is passed in the
render_block_core_navigation_link
function. - When rendered, the URL is passed through
esc_url
.esc_url
uses thepreg_replace
function that seems to discard<
and>
characters:
$url = preg_replace( '|[^a-z0-9-~+_.?#=!&;,/:%@$\|*\'()\[\]\\x80-\\xff]|i', '', $url );
- Using
encodeURI
in a URL that has those characters, (eg:encodeURI('http:example.com?s=<>')
encodes it, and produce an encoded URL (in the above example the URL is encoded tohttp:example.com?s=%3C%3E
, and so,esc_url
no longer strips out those characters.
This is the reason why encodeURI
was introduced originally in #19679, and produce the bug we are now facing.
I also put those steps in the description of this pr. Given that, I assume this is ok to be merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your persistence on this, @kozer !
I appreciate that you updated the PR description with "Some context on why this bug was introduced" too.
Congratulations on your first merged pull request, @kozer! We'd like to credit you for your contribution in the post announcing the next WordPress release, but we can't find a WordPress.org profile associated with your GitHub account. When you have a moment, visit the following URL and click "link your GitHub account" under "GitHub Username" to link your accounts: https://profiles.wordpress.org/me/profile/edit/ And if you don't have a WordPress.org account, you can create one on this page: https://login.wordpress.org/register Kudos! |
Refers to: #46159
What?
The link is encoded when adding a URL with a special character (eg: %) in a navigation block link. However, the URL is not decoded properly when visiting the page, breaking the navigation.
Why?
This pr solving the above issue, decoding the URL properly.
How?
This was an easy fix. Just using PHP's
urldecode
function, seems to solve the underlying issue.Testing Instructions
Testing Instructions for Keyboard
Screenshots or screencast
Some context on why this bug was introduced.
encodeURI
was introduced the URL was stored as is.render_block_core_navigation_link
function.esc_url
.esc_url
uses thepreg_replace
function that seems to discard<
and>
characters:encodeURI
in a URL that has those characters, (eg:encodeURI('http:example.com?s=<>')
encodes it, and produces an encoded URL (in the above example the URL is encoded tohttp:example.com?s=%3C%3E
, and so,esc_url
no longer strips out those characters.This is the reason why
encodeURI
was introduced originally in #19679, and produce the bug we are now facing.