Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support incoming webmentions #841

Merged
merged 9 commits into from
Aug 5, 2024
Merged

Support incoming webmentions #841

merged 9 commits into from
Aug 5, 2024

Conversation

onli
Copy link
Member

@onli onli commented Jun 5, 2024

This PR adds webmention support. It enables a webmention endpoint in our comment.php, links to it in 2k11's index.tpl and then stores the webmention as either a pingback or a trackback in the DB.

Webmentions are a kind of linkback, but there are different kind of webmentions in the the spec. The two that seem relevant to me are: First simple mentions, which I think match Pingbacks, where the blog owner is simply informed that someone elsewhere linked to a blog article. The second type are replies, which have additional markup and are like a comment just made on a different site, those I think match trackbacks with their excerpts quite well.

Note that this code does not follow https://indieweb.org/comments#How_to_display completely. Webmentions rely on a bunch of microformat2 markup on the source site, that s9y then has to fetch and interpret. There are some testcases at https://github.com/aaronpk/webmention.io/tree/main/test/data/source.example.org, the only scenario where I enabled trackback mode so far is https://github.com/aaronpk/webmention.io/blob/main/test/data/source.example.org/in-reply-to.html, where the author data is presented as a complete h-card block. All other scenarios should still result in a pingback though.

Additional remarks:

  1. This seems equally attackable as the trackback and pingback scenario from way back, where attackers could use blogs as a traffic intensifier.
  2. By mapping webmentions to trackbacks and pingbacks, we avoid the need to add an additional linkback type to our code, apart from at the endpoint. This also lets the spamblock plugin continue to work.
  3. It's argueable that this implementation relies too much on the spamblock plugin, especially it could be wanted to check $target to be a link to the article for the which the webmention endpoint was triggered. On the other hand, it seems to me like the trackback implementation also does not do that - but otoh again, trackbacks do not necessarily have to fetch the origin, the spamblock plugin does that and the check can be disabled.
  4. I followed the indieweb wiki recommendation to use a MF2 parser, https://github.com/microformats/php-mf2, given that its requirements are just PHP 5.6. But https://github.com/TomboFry/microlight/blob/master/webmention/index.php works with DOMDocument and xpaths to implement webmentions. That could be also an alternative for us (without copying the code though, the license does not match).
  5. If we add this we should also add this to the other index.tpls. I can do that, just wanted to have this PR open first.

What do you think? I'm not a fan of how webmentions could have just been a backwards compatible trackback enhancement but aren't, but with this match to pingbacks and trackbacks for me it seemed reasonable to provide support. If it turns out it is used a lot we could look into sending webmentions ourselves later, if not we can let it rest and there should be no harm done.

onli added 6 commits June 2, 2024 23:49
Adds a codepath that would store them as trackback or pingback, depending on the metadata available on the sending page.
Separate the MF2 parsing into its own function. Since the origin url is fetched only there, we can easier make that optional later.
Only covers one case: A webmention with an in-reply-to that contains the author's h-card.
The spamblock plugin will check for it later
@onli onli requested a review from garvinhicking June 5, 2024 13:55
Copy link
Member

@garvinhicking garvinhicking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to be careful here because our "bundled-libs" (sadly) is within the docroot. Bundling files in "bin" that could be executed is quite dangerous; normal class files cant be executed, but we need to remove those binaries.

Probably configure our composer.json bindir to something like /tmp outside the project?

Also, there are bin files in extensions like mastermind. We should not bundle those...?

@onli
Copy link
Member Author

onli commented Jun 7, 2024

Hi @garvinhicking, what exactly is the dangerous scenario here? If I open bundled-libs/bin/fetch-mf2 for example it is just output as text in the browser. Is the scenario something like "attacker uploads a PHP file that then uses files in bundled_libs/bin/ to do something"? But wouldn't the attacker just upload the problematic code directly, why would he have to rely on the files in bundled_libs/bin/? Those are just some PHP code anyway.

@garvinhicking
Copy link
Member

I thought I saw .php files in the commot too thst contained non-class code. Need to verify then again.

@onli
Copy link
Member Author

onli commented Jun 8, 2024

.php files would be blockable by the .htaccess :) At least for direct browser access.

But I don't understand yet how this is different for these bundled libs than with all the others?

@onli
Copy link
Member Author

onli commented Aug 5, 2024

@garvinhicking Researching this a bit more, I found no reliable solution for avoiding these files via composer (completely possible I missed something). Instead I added a new .htaccess to bundled-libs, that now blocks all .php files. In my test this worked.

Manual htaccess for now, since I assume the checks of the mechanism for the regular htaccess generation is not needed here.

@onli onli requested a review from garvinhicking August 5, 2024 12:06
@garvinhicking
Copy link
Member

I'm unsure if there are php files in bundled-libs that could be callable via http. I hope not, and am 99% certain... so maybe just try it and see if there come complaints in some plugin?!

@onli
Copy link
Member Author

onli commented Aug 5, 2024

I grepped through additional_plugins/ for 'bundled-libs' and saw no code that not very much looked like a require, or the path creation for one. I think we are good on that side :)

onli added 2 commits August 5, 2024 23:09
Also adds the trackack link-rel, that was so far only in 2k11
@onli onli merged commit e648571 into master Aug 5, 2024
4 checks passed
@onli onli deleted the feature/webmentions branch August 5, 2024 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants