Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query strings break redirects #23

Open
fredrikekelund opened this issue Sep 18, 2019 · 7 comments
Open

Query strings break redirects #23

fredrikekelund opened this issue Sep 18, 2019 · 7 comments

Comments

@fredrikekelund
Copy link

I don't know whether this is intentional or not, but we discovered that redirects aren't applied when URL's are requested with a query string. This caused issues for a client of ours when visitors were clicking a link on Facebook and Facebook automatically appended ?fbclid=foobar to the URL.

If there was a way to somehow specify whether query strings should be disregarded or not, that could be very useful!

@boutell
Copy link
Contributor

boutell commented Sep 23, 2019

This is intentional — a URL with a query string isn't the same URL. I agree it would be a nice feature to be able to specify whether the query string is part of the match. I've tagged this as "contributions welcome" for now.

@shaunhurley
Copy link
Contributor

Hi Tom, I just ran across this issue/topic as well, thought I'd drop a note here.

Your point on a URL with a query string not being the same URL is well taken, but in reality I'm hard pressed to come up with a scenario where you would want to include the querystring element as part of a redirect.

After playing with this I can make a querystring exclusion work by doing a split('?') on the req.url and passing the first array element into the self.find() call.

Unfortunately this is an all-or-nothing approach, since it will act on every req invocation, which presumably is not desirable since it is the inverse of the current standard i.e. you could never match on a redirect that does includes a querystring.

As it stands it appears impossible (used loosely :) to configure a toggle on a per-redirect-definition basis since that becomes a chicken-and-egg problem, where the redirect config can't be accessed until a matching redirect is found, but a matching redirect won't ever be found since the querystring is still in play.

Given this, the only thing that seems relatively logical would be to include a 'top' configuration toggle, to exclude querystrings from matching, so it will apply to all or no redirects.

Is that something worth pursing i.e. as a contribution, or it is unlikely to be accepted?

Cheers,

Shaun

@boutell
Copy link
Contributor

boutell commented Feb 8, 2021 via email

@shaunhurley
Copy link
Contributor

Hmmm. How about a query that matches one or more redirects because it matches on the part before the ?, but then giving precedence to a full match (with query string) if one was present, before looking at the other results to see if any of them have the "ignore the query string" flag? It seems like that would be the best of both worlds.

The basic premise sounds good! I spent a little time poking around and, of course, ran into a roadblock :) Assuming that we want to keep all of this within the apostrophe-redirects module, the only way I can currently see to achieve this would be to have the return self.find(req, { slug: 'redirect-' + slug }, {title: 1, slug: 1, urlType: 1, pageId: 1, type: 1, externalUrl: 1, redirectSlug: 1, statusCode: 1, _newPage: 1}) invocation passed with a regex instead of the existing generated { slug: 'redirect-' + slug } slug.

I spent a little time spelunking trying to find the origin of find() in order figure out if the criteria passed in the call would accept a RegEx, but got lost, so went empirical and tried it instead (based on a few other things I'd seen in some of the code base).

Fortunately it appears that a RegEx is a legitimate criteria :) That gave me the following:

let pathOnly = req.url.split('?')[0];
let redirectRegEx = new RegExp(`redirect-\/${pathOnly}.*`).toString();
return self.find(req, { slug: redirectRegex }, {title: 1, slug: 1, urlType: 1, pageId: 1, type: 1, externalUrl: 1, redirectSlug: 1,   statusCode: 1, _newPage: 1})

That successfully finds both a bare matching redirect, as well as a matching redirect that has a querystring, however it appears that the find() invocation will only return a single redirect object (seeming to be the most recently edited one) not all matching redirects as an array or similar. Obviously that makes it hard to do any further logic.

Are you aware if there is an options/means to get multiple results from the find() function? Or am I completely on the wrong path here?

The other alternative that I can see would be to do an initial find with the full querystring object and, if there is no result, segue into a second invocation of find using just the base URL path, rather than trying to get a set of results to iterate through...

@boutell
Copy link
Contributor

boutell commented Feb 9, 2021 via email

@shaunhurley
Copy link
Contributor

Ahh, that was what I was missing, contextual knowledge gap on what find was actually returning! Sweet! Problem solved, working (I think... initial tests look good, I need to do more extensive testing) solution implemented.

The only remaining item I have is your comment on using regexpQuote. I can't find any documentation on/for apos.utils.regexpQuote in the apostrophe-utils documentation though digging into the code base did send me to the npm package regexp-quote however that page is also... less than informative :) I'll tinker with it and figure it out...

@boutell
Copy link
Contributor

boutell commented Feb 9, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants