Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add allowedCharactersInPath whitelist #39

Open
wants to merge 8 commits into
base: develop
Choose a base branch
from

Conversation

zhongliang02
Copy link
Contributor

@zhongliang02 zhongliang02 commented Jan 14, 2025

Context

In anticipation of future open redirects, we want to increase strictness of URL validation.

Approach

We implement a default pathname character whitelist where the pathname can only consist of [A-Za-z0-9.-_/].
Users are then given the option to adjust or remove this whitelist via an option allowedCharactersInPath.

We also ensure that the pathname can only start with a single slash to prevent schema-relative URLs.

Risks

This may break existing links that contain special characters. Especially mailto links since they contain @.
However, a quick scan of the repos indicate that no one will be affected by this change.
Regardless, the upgrade advisory will include a warning on breaking links with special characters.

Additional changes

The behaviour of parsing invalid URLs with the schema has been changed to throw a ZodError instead of a URLValidationError.
This is in line with expected behaviour from Zod schemas, allowing the usage of .catch seamlessly instead of having to wrap the parse in a try-catch.

On the other hand, the behaviour of the URLValidator class has not been changed, but the error message has been changed.

This PR also refactors some of the code to be more organised.

Additionally some dependencies have been updated with pnpm audit --fix

Testing

Tests have been updated to reflect the new behaviours.
Tests have also been added for the pathname whitelist.
Additionally, the care360 open redirect PoC has been added as a test case.

@zhongliang02 zhongliang02 requested a review from a team January 14, 2025 23:22
@zhongliang02 zhongliang02 marked this pull request as ready for review January 14, 2025 23:45
Copy link

@spaceraccoon spaceraccoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits and a quick Q on the allowedchars design!

packages/validators/src/url/options.ts Outdated Show resolved Hide resolved
packages/validators/src/url/options.ts Outdated Show resolved Hide resolved
Comment on lines 25 to 37
const schema = z.string().refine(
(str: string) => {
for (const char of str) {
if (!allowed.has(char)) {
return false
}
}
return true
},
{
message: `Every character must be in ${allowedChars}`,
},
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're doing this, would it make sense to just use .regex instead? The downside is that we will need to have allowedRegex instead and accept a Regex arg instead of string. Not the worst thing in the word, but I'd rather have a regex in than construct it dynamically from a string. Would be a bit more conformant IMO. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mulling over this as well. We did initially discuss doing a z.regex() but I thought the interface felt too weird for the user.

allowedPathCharsRegex: /[a-zA-Z0-9]/

What would be the intended behaviour if the user supplies

allowedPathCharsRegex: /[a-zA-Z0-9]/g
allowedPathCharsRegex: /abcdef/
allowedPathCharsRegex: /[a-zA-Z0-9]*/g

I think it is not intuitive for the user that each character will be matched against the regex.

Copy link
Contributor Author

@zhongliang02 zhongliang02 Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more usable interface could be like

validPathRegex: /^[a-zA-Z0-9.-_/]*$/

@zhongliang02
Copy link
Contributor Author

preemptively added validPathRegex so we can visually compare between the two

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants