-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Git fetch now robust to tree order #342
Conversation
As it stands, code for unpacking of git repo packfiles is sensitive to the order of the trees. This doesn't matter for servers that provide it in the order of deepest roots first, but for those that don't do that, unpacking will provide incorrect results. This problem is the case for repos hosted on Azure DevOps, for example. This change uses the frequency of the hashes throught the packfile to sort the trees from deepest (the root tree) to least deep. This is the most straightforward way I could think of for achieving this. Practically, the function for unpacking the packfile (`unpack_packfile_repo()`) now includes a call to a new function `sort_trees()`.
Start from the root tree, which might not be the first one. Closes r-lib#339.
Thank you, this helped me a lot in fixing the issue! I think your solution was correct, but also a bit more complicated than needed. I also added some tests for this. Thank you again! I will comment here later today, when the nightly build of pak is updated to include this fix. |
OK, this should be in the nightly devel build of pak now. The nightly build is not very smooth nowadays, so don't hesitate to let me know if it still does not work for you. Thanks again for the PR! |
Thanks Gábor! I really appreciate how quickly you were able to give it a look! |
# pak 0.7.1 * pak can now handle the case when `Config/Needs/*` dependencies are requested for package from a repository. * pak uses safer `*printf()` format strings now. # pak 0.7.0 * pak now correctly handles the latest GitHub release with the `@*release` notation (@pawelru, r-lib/pkgdepends#321) * pak now correctly handles having multiple instances of the same package in the metadata, with different R version requirements (#534, #538, r-lib/pkgdepends#331). * `git::` package references work better now for Azure DevOps (@jameslairdsmith, r-lib/pkgdepends#333, r-lib/pkgdepends#342). * pak now does a better job at accepting installed packages, and avoids reinstalling more packages than needed when using a lock file (r-lib/actions#759, r-lib/pkgdepends#338). # pak 0.6.0 * pak now requires R >= 3.5.0. * Many improvements in system requirements support: - New functions: - `pkg_sysreqs()`: calculate system requirements of packages. - `sysreqs_db_list()`, `sysreqs_db_match()`, `sysreqs_db_update()`: query the system requirements database. - `sysreqs_list_system_packages()`, `sysreqs_check_installed()`, `sysreqs_fix_installed()`: query and install missing system packages. - `sysreqs_platforms()`: list supported platforms. - The installation proposal, printed before installation, now includes required and missing system packages, on supported platforms. - New `sysreqs_platform` configuration option to override the auto-detected platform. - Faster, asynchronous system requirements lookup. - pak now does not reinstall system requirements by default, if they are already installed. (You can force a reinstall/upgrade with the `sysreqs_update` configuration option.) * New `gitlab::` package source to install packages from GitLab (r-lib/pkgdepends#315). * pak now correctly parses multiple `git::` packages at once (r-lib/pkgdepends#318). * `git::` package sources now support version 1 of the git protocol. E.g. the Bioconductor git repositories now work: `git::https://git.bioconductor.org/packages/limma` (r-lib/pkgdepends#314). * The `platforms` config parameter now works correctly with `deps::` package sources (#522). * New `include_linkingto` config parameter to always include `LinkingTo` packages in the solution, even for binaries (https://github.com/r-lib/pkgdepends/issues/485). * `pkg_name_check()` now does not include Acromine results, because the web site was unstable. * In `repo_add()` and `repo_resolve()` the `MRAN@` prefix is now deprecated and resolves to PPM, because MRAN will be retired soon. See more at <https://posit.co/blog/migrating-from-mran-to-posit-package-manager/>. * The metadata cache now has `SystemRequirements` information for Bioconductor packages.
As it stands, the pkgdepends code for unpacking git repo packfiles is sensitive to the order of the trees (#339). This doesn't seem to matter for git hosts that provide the trees in the order of deepest roots first but for those that don't the current unpacking code will provide incorrect results. This problem occurs for repos hosted on Azure DevOps, for example, where I've been experiencing it at the Bank of England's self-hosted version of Azure DevOps Server.
I previously couldn't produce a reprex, but I've now set up a publicly accessible project (
https://dev.azure.com/jameslairdsmith/Test
) on Azure DevOps Cloud for testing. I've also cloned the Github version of dplyr, which anyone should be able to clone fromhttps://[email protected]/jameslairdsmith/Test/_git/dplyr
. And so:Created on 2023-10-21 with reprex v2.0.2
Standard output and standard error
Session info
When I try it interactively, I get the following extra output from the child process:
This PR introduces a new
sort_trees()
step intounpack_packfile_repo()
.sort_trees()
uses the frequency of the hashes throughout the packfile to sort the trees from deepest (the root tree) to the least deep. It's the simplest way I could think of to get the desired result with minimal code change. I'm of course open to an easier approach if there is one.When I run the above reprex with my "sort-trees" branch it works as expected.
Created on 2023-10-21 with reprex v2.0.2
Standard output and standard error
Session info