Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FR: Run urlchecker::url_update() to fix link rot? #447

Closed
IndrajeetPatil opened this issue Nov 10, 2022 · 5 comments
Closed

FR: Run urlchecker::url_update() to fix link rot? #447

IndrajeetPatil opened this issue Nov 10, 2022 · 5 comments

Comments

@IndrajeetPatil
Copy link
Contributor

This is one of the most common sources of a NOTE in CRAN's incoming checks. It'd be nice if {precommit} can detect and fix it using {urlchecker}.

That said, this comes at the cost of two additional dependencies, and I am not sure if that's a dealbreaker for you:

setdiff(
  tools::package_dependencies("urlchecker", recursive = TRUE)[[1]],
  tools::package_dependencies("precommit", recursive = TRUE)[[1]]
)
#> [1] "curl" "xml2"

Created on 2022-11-10 with reprex v2.0.2

@lorenzwalthert
Copy link
Owner

The dependencies: I don't know. But by eliminating {tibble} from {styler}, we recently reduced recurse dependencies quite a bit. Also, most dependencies in {precommit} are not dependencies to install the package, but to run the hooks. And these are mostly in Suggest: so I guess the diff is only {curl}.

But I think more about additional check time during commit. I think URL checking is something best off-loaded to the CRAN submission process (or CI, but it seems pre-commit hook can be skipped on CI, but not ran on CI and skipped locally, according to the docs if I am not mistaken).

@IndrajeetPatil
Copy link
Contributor Author

I think URL checking is something best off-loaded to the CRAN submission process

But note that this is also something that we should be attending to even if the package is not on CRAN.

For example, if we update the package once a year, and if certain URLs rot after a release, it will take a whole year before CRAN machines catch this. In other words, unless and until you update the package on CRAN and CRAN machines detect the link rot, you will never discover that users have been experiencing problem accessing certain links. And not everyone is comfortable enough or cares enough to create issues asking about dead URLs.

So this check can improve UX while reading the docs. Makes sense?

@lorenzwalthert
Copy link
Owner

For example, if we update the package once a year, and if certain URLs rot after a release, it will take a whole year before CRAN machines catch this.

I don't see why that's big enough problem. Pypi or other hosting plattforms don't even offer link rot checks. Also, the link rot is not something that a new commit introduces, it's an external change. A pre-commit hook would incentivise people to start fixing broken links in the middle of doing something else and convolute the git history.

You have the following options (in my order of preference):

  • offloaded to CI. E.g. an action that runs once a month on the default branch, not triggered on push. Also see Documentation check r-lib/actions#267.
  • implement a local hook with either language: r or language: system which is not hard. Let me know if you need guidance.
  • implement your own hook repo that hosts such a hook.

But I don't want to implement that check in this repo.

@IndrajeetPatil
Copy link
Contributor Author

you will never discover that users have been experiencing problem accessing certain links. And not everyone is comfortable enough or cares enough to create issues asking about dead URLs.

So this check can improve UX while reading the docs.

You left out the most important part of my comment.

But I don't want to implement that check in this repo.

Okay, this was just a suggestion, since I thought it'd improve the framework.

And this was not supposed to be for me - I already have cron jobs set up for checking lint rot via GHA workflows. So, yes, this is possible to do outside precommit.

@IndrajeetPatil IndrajeetPatil closed this as not planned Won't fix, can't repro, duplicate, stale Dec 6, 2022
@lorenzwalthert
Copy link
Owner

lorenzwalthert commented Dec 6, 2022

Okay, this was just a suggestion, since I thought it'd improve the framework.

Yes. I am always happy to discuss suggestions, so thanks for bringing it up. Thanks for understanding that I don't want to include it after thinking it through with you.

Also, I did not address your UX comment because for me, that is implicit to the issue, as it is the end goal of link checking. I approve of link checking, I just don't think pre-commit hooks are the right tool for it, mostly because it slows down the commit flow, introduces additional dependencies and is unrelated to the changes that are staged for commit.

I already have cron jobs set up for checking lint rot via GHA workflows

ok, great to see that. I'd be happy to include this workflow in {styler} and {precommit} (maybe once a month is enough to run it). If you want, you can send a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants