Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Navigation Block: Properly decode URL-encoded links #46435

Merged

Conversation

kozer
Copy link
Contributor

@kozer kozer commented Dec 9, 2022

Refers to: #46159

What?

The link is encoded when adding a URL with a special character (eg: %) in a navigation block link. However, the URL is not decoded properly when visiting the page, breaking the navigation.

Why?

This pr solving the above issue, decoding the URL properly.

How?

This was an easy fix. Just using PHP's urldecode function, seems to solve the underlying issue.

Testing Instructions

  • Start Gutenberg's dev environment
  • Open site editor (Appearance -> Editor)
  • Add a navigation block
    add-navigation
  • Add a link (eg: Example
  • Save changes
  • Preview the page
  • Click on the link you just added
  • You should be able to visit the link without problems.

Testing Instructions for Keyboard

Screenshots or screencast

add-link-1
add-link-2

Some context on why this bug was introduced.

  • User enters a new URL via the editor.
  • Before encodeURI was introduced the URL was stored as is.
  • After saving, when someone navigates to the page with the inserted URL, the URL is passed in the render_block_core_navigation_link function.
  • When rendered, the URL is passed through esc_url. esc_url uses the preg_replace function that seems to discard < and > characters:
	$url = preg_replace( '|[^a-z0-9-~+_.?#=!&;,/:%@$\|*\'()\[\]\\x80-\\xff]|i', '', $url );
  • Using encodeURI in a URL that has those characters, (eg: encodeURI('http:example.com?s=<>') encodes it, and produces an encoded URL (in the above example the URL is encoded to http:example.com?s=%3C%3E, and so, esc_url no longer strips out those characters.

This is the reason why encodeURI was introduced originally in #19679, and produce the bug we are now facing.

@github-actions github-actions bot added the First-time Contributor Pull request opened by a first-time contributor to Gutenberg repository label Dec 9, 2022
@github-actions
Copy link

github-actions bot commented Dec 9, 2022

👋 Thanks for your first Pull Request and for helping build the future of Gutenberg and WordPress, @kozer! In case you missed it, we'd love to have you join us in our Slack community, where we hold regularly weekly meetings open to anyone to coordinate with each other.

If you want to learn more about WordPress development in general, check out the Core Handbook full of helpful information.

@kozer
Copy link
Contributor Author

kozer commented Dec 9, 2022

@danielbachhuber when I run those tests using: npm run test:php the test fails.
It seems (from the echo command you see - will be removed after I fix this issue) that the URL isn't affected by my change.
I also checked the file inside the container and the change is there.
How is that possible? Any ideas?
Thanks!

@danielbachhuber
Copy link
Member

when I run those tests using: npm run test:php the test fails.
It seems (from the echo command you see - will be removed after I fix this issue) that the URL isn't affected by my change.
I also checked the file inside the container and the change is there.
How is that possible? Any ideas?

@kozer For me, the tests load the version of render_block_core_navigation_link() in WordPress core:

https://github.com/WordPress/wordpress-develop/blob/0cb8475c0d07d23893b1d73d755eda5f12024585/src/wp-includes/blocks/navigation-link.php#L123-L238

One thing that might not be obvious: the JavaScript and PHP in this repository is regularly synced into WordPress core.

It looks like there was a function rename in #40657

@adamziel @gziolo Any suggestions on how we should proceed? Should we submit this change against WordPress core instead?

@kozer
Copy link
Contributor Author

kozer commented Dec 9, 2022

@danielbachhuber I replaced wp.org file with the change I made here inside the WordPress container, and re-run the tests. They passed.
So your theory was correct. I'm waiting for @adambasa-dp and @gziolo's response.

@danielbachhuber
Copy link
Member

@kozer While we're waiting, can you update the pull request description with details on the fix, how to test, etc?

@kozer kozer changed the title fix: issue with already encoded urls fix Navigation Block: Special characters in Links Are Improperly decoded Dec 12, 2022
@danielbachhuber
Copy link
Member

Asking about how to proceed in Slack https://wordpress.slack.com/archives/C02QB2JS7/p1671470350718839

@danielbachhuber danielbachhuber added [Type] Bug An existing feature does not function as intended [Block] Navigation Link Affects the Navigation Link Block labels Dec 19, 2022
@kozer
Copy link
Contributor Author

kozer commented Dec 20, 2022

@danielbachhuber it seems that e2e tests are not passing, however, I don't see how my change is relevant. Should I update the test to make it pass?

@adamziel
Copy link
Contributor

adamziel commented Dec 20, 2022

#46662 looks like a reasonable way forward. In hindsight I don't think removing the prefix wasn't a good decision. Only the e2e tests are failing now. They tend to be flakey so I just restarted them 🤞

@danielbachhuber danielbachhuber self-requested a review December 20, 2022 12:40
Copy link
Member

@danielbachhuber danielbachhuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, the behavior works as expected on trunk and breaks for me on this PR.

I opened the demonstration link from the original issue, and then copied the link from my browser:

https://api.payaconnect.com/hostedpaymentpage?id=11eacdb9c9fc0d2c989d2057&data=U2FsdGVkX18xMWVhY2RiOa8xisZ0ulzB%2FzlaRnFN0dWAM1rLyZL%2BQpUc9oo2fp616ICqcykrs%2BjPzDfEPLd%2FZA%3D%3D

image

It ends up stored in the database as this:

<!-- wp:navigation-link {"label":"Sample Page","id":2,"url":"https://wordpress-develop.test/?page_id=2","kind":"post-type","isTopLevelLink":true} /-->

<!-- wp:navigation-link {"label":"Testing5","url":"https://api.payaconnect.com/hostedpaymentpage?id=11eacdb9c9fc0d2c989d2057\u0026data=U2FsdGVkX18xMWVhY2RiOa8xisZ0ulzB%2FzlaRnFN0dWAM1rLyZL%2BQpUc9oo2fp616ICqcykrs%2BjPzDfEPLd%2FZA%3D%3D","kind":"custom","isTopLevelLink":true} /-->

On this branch, the rendered HTML ends up as this:

<a class="wp-block-navigation-item__content"  href="https://api.payaconnect.com/hostedpaymentpage?id=11eacdb9c9fc0d2c989d2057&data=U2FsdGVkX18xMWVhY2RiOa8xisZ0ulzB/zlaRnFN0dWAM1rLyZL+QpUc9oo2fp616ICqcykrs+jPzDfEPLd/ZA=="><span class="wp-block-navigation-item__label">Testing5</span></a>

Clicking on the link produces this error:

image

On trunk, the rendered HTML ends up as this:

<a class="wp-block-navigation-item__content"  href="https://api.payaconnect.com/hostedpaymentpage?id=11eacdb9c9fc0d2c989d2057&#038;data=U2FsdGVkX18xMWVhY2RiOa8xisZ0ulzB%2FzlaRnFN0dWAM1rLyZL%2BQpUc9oo2fp616ICqcykrs%2BjPzDfEPLd%2FZA%3D%3D"><span class="wp-block-navigation-item__label">Testing5</span></a>

Clicking the link opens the page as expected.

I was curious why this data is escaped in the first place, given JSON serialization should be sufficient. It looks like it was added in #19679

Can you figure out these discrepancies? It'd be great to have more test cases too, covering the various different URLs we need to support.

@kozer
Copy link
Contributor Author

kozer commented Dec 21, 2022

@danielbachhuber , this should not happen now. It's fixed. The problem was that we were testing the same thing, differently on the same component.

More specifically, textControl was calling setAttributes to set the URL (and this wasn't encoding the URL), but adding the URL the way I did in the pr was calling updateAttributes (which was escaping the URL).

Given that I have to propose the following:

  • replace all direct calls to setAttributes to updateAttributes
  • update the function name to processAttributes or something, to make it more apparent that this has to be called every time we want to send some props outside of the component. By doing that, we will make sure that all the data that will "go out" from this component, will be processed as needed.

@danielbachhuber
Copy link
Member

@kozer Can you make sure there are test cases that cover the before and after of your changes, as well as the variety of links we want to support?

@kozer
Copy link
Contributor Author

kozer commented Dec 23, 2022

@adamziel @gziolo I update the pr trying to have a fallback for users that have stored URLs that are not encoded. I also updated the tests.
If you think this is enough, let's go ahead and merge it.
I'll be off the next week, however, @sejas is aware of the ticket if anything comes up.

@kozer kozer requested review from danielbachhuber and removed request for tellthemachines January 3, 2023 11:03
return rawurldecode($url);
}
return $url;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we consolidate these two functions into one function called block_core_navigation_link_maybe_urldecode()?

For better or for worse, maybe_* functions are a common pattern in WordPress core: https://developer.wordpress.org/?s=maybe

It seems like there are some PHPCS issues that need to be fixed too.

Copy link
Contributor Author

@kozer kozer Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer having them as two separate functions, as it's clear what each one of them is doing, and we can reuse is_url_encoded for testing encoded URLs elsewhere if we need to instead of tying those two functions together and producing a "side effect" if the URL is encoded.

However, I'll rename the urldecode_once to maybe_urldecode to match the common pattern in WordPress core.

Thanks for pointing this out!

It seems like there are some PHPCS issues that need to be fixed too.

Can you elaborate a bit more on this? I don't see any PHPCS issues locally for those two functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kozer Given this will eventually be included in WordPress core, I'd like them consolidated to one function to match the pattern already established in WordPress core.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielbachhuber Doesn't the maybe_urldecode renaming establish that pattern?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kozer If it's only one block_core_navigation_link_maybe_urldecode() function, then yes 😊

In the context of a large open source project, we shouldn't introduce two functions if we only need one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielbachhuber I made the change as requested!
A side note:
I'm not sure if the phpcs errors you mentioned are related to tabs vs spaces. Neither vscode or neovim show any warnings/errors to me.
Are tabs the way to go in PHP-related files?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if the phpcs errors you mentioned are related to tabs vs spaces. Neither vscode or neovim show any warnings/errors to me.
Are tabs the way to go in PHP-related files?

@kozer Here's the failing test: https://github.com/WordPress/gutenberg/actions/runs/3830602130/jobs/6518686126

The failing test runs npm run lint:php:

image

You can run npm run lint:php in your local environment to see all of the same errors.

I use this VS Code extension to integrate PHPCS:

Name: PHP Sniffer & Beautifier
Id: valeryanm.vscode-phpsab
Description: PHP Sniffer & Beautifier for Visual Studio Code
Version: 0.0.15
Publisher: Samuel Hilson
VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=ValeryanM.vscode-phpsab

You can see all of WordPress' PHP coding standards here: https://developer.wordpress.org/coding-standards/wordpress-coding-standards/php/

@@ -89,7 +89,7 @@ export const updateAttributes = (

setAttributes( {
// Passed `url` may already be encoded. To prevent double encoding, decodeURI is executed to revert to the original string.
...( newUrl && { url: encodeURI( safeDecodeURI( newUrl ) ) } ),
...{ url: newUrl ? encodeURI( safeDecodeURI( newUrl ) ) : newUrl },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not super clear why it was necessary to encodeURI() here in the first place. It seems like https://example.com?s=<> is JSON-encoded just fine:

>> var test = {};
undefined
>> test.url = 'https://example.com?s=<>';
"https://example.com/?s=<>"
>> JSON.stringify(test);
"{\"url\":\"https://example.com/?s=<>\"}"

Maybe because the <> gets stripped out by kses? Can you debug and document why #19679 was necessary in the first place? If it's not necessary, maybe we can remove it entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielbachhuber you were right. The reason that this was stripped out is because of esc_url function that uses kses under the hood.
So I think this is the way to go here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason that this was stripped out is because of esc_url function that uses kses under the hood.

@kozer For posterity, can you provide a full step-by-step documentation of what's going on?

Copy link
Contributor Author

@kozer kozer Jan 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure @danielbachhuber! So, the process is the following:

  • User enters a new URL via the editor.
  • Before encodeURI introduced the URL was stored as is.
  • After saving, when someone navigates to the page with the inserted URL, the URL is passed in the render_block_core_navigation_link function.
  • When rendered, the URL is passed through esc_url. esc_url uses the preg_replace function that seems to discard < and > characters:
	$url = preg_replace( '|[^a-z0-9-~+_.?#=!&;,/:%@$\|*\'()\[\]\\x80-\\xff]|i', '', $url );
  • Using encodeURI in a URL that has those characters, (eg: encodeURI('http:example.com?s=<>') encodes it, and produce an encoded URL (in the above example the URL is encoded to http:example.com?s=%3C%3E, and so, esc_url no longer strips out those characters.

This is the reason why encodeURI was introduced originally in #19679, and produce the bug we are now facing.

I also put those steps in the description of this pr. Given that, I assume this is ok to be merged.

@kozer kozer requested review from danielbachhuber and removed request for WunderBart January 5, 2023 08:54
@danielbachhuber danielbachhuber changed the title fix Navigation Block: Special characters in Links Are Improperly decoded Navigation Block: Properly decode URL-encoded links Jan 5, 2023
Copy link
Member

@danielbachhuber danielbachhuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your persistence on this, @kozer !

I appreciate that you updated the PR description with "Some context on why this bug was introduced" too.

@danielbachhuber danielbachhuber added this to the Gutenberg 15.0 milestone Jan 5, 2023
@danielbachhuber danielbachhuber merged commit 622b829 into WordPress:trunk Jan 5, 2023
@github-actions
Copy link

github-actions bot commented Jan 5, 2023

Congratulations on your first merged pull request, @kozer! We'd like to credit you for your contribution in the post announcing the next WordPress release, but we can't find a WordPress.org profile associated with your GitHub account. When you have a moment, visit the following URL and click "link your GitHub account" under "GitHub Username" to link your accounts:

https://profiles.wordpress.org/me/profile/edit/

And if you don't have a WordPress.org account, you can create one on this page:

https://login.wordpress.org/register

Kudos!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Block] Navigation Link Affects the Navigation Link Block First-time Contributor Pull request opened by a first-time contributor to Gutenberg repository [Type] Bug An existing feature does not function as intended
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants