Tracking Issue: Full R Support #1543
Comments
This looks good! Thanks for the issue :) |
Would be awesome! Any chance it could also support installing R packages from (public) GitHub repos (i.e. not on CRAN or Bioconductor)?
At least initially, no. But it would include everything on R-Universe, where you can find the vast majority of packages - and creating a universe for your packages is quite simple. I guess @wolfv would know how much more complicated it is to install directly from a GitHub source, and whether it's feasible. Currently it uses the JSON from the r-universe API (e.g. https://stan-dev.r-universe.dev/api/packages/). I think, to support GH packages, a new parser would need to be written for parsing I hope to make a write-up of all the R goodies quite soon!
Yeah, there is no technical limitation that would stop it from working directly from a GitHub package. Right now, we just use the easy-to-parse JSON, but you could also write a little parser for the R native file - or write the recipe yourself! :)
Just to offer some insight, the DESCRIPTION file is just a Debian Control File, except it's encoded in ASCII and does not support comments. In theory it should be just as easy to use an existing parser for these, such as debian-control. All info about the file (and R packages) can be found in the manual.
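To make the "Debian Control File" point concrete, here is a minimal sketch of parsing such a file in Python. It assumes the simplest form of the format: one `Field: value` pair per line, with continuation lines starting with whitespace. The function name `parse_dcf` and the example package are made up for illustration; this is not how pixi, rattler-build or debian-control do it.

```python
def parse_dcf(text: str) -> dict[str, str]:
    """Parse a single DCF record (e.g. an R DESCRIPTION file) into a dict."""
    fields: dict[str, str] = {}
    current = None  # name of the field currently being read
    for line in text.splitlines():
        if not line.strip():
            continue  # a DESCRIPTION file holds one record, so blank lines are skipped
        if line[0] in " \t":
            # Continuation line: it belongs to the previous field.
            if current is None:
                raise ValueError(f"continuation line before any field: {line!r}")
            fields[current] = (fields[current] + " " + line.strip()).strip()
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"expected 'Field: value', got {line!r}")
            current = name.strip()
            fields[current] = value.strip()
    return fields


# Hypothetical DESCRIPTION content, invented for this example.
example = """\
Package: examplepkg
Version: 0.1.0
Imports: methods,
    utils
"""
parsed = parse_dcf(example)
print(parsed["Imports"])  # methods, utils
```

A real parser would also need to handle the multi-line `Authors@R` field and encoding declarations, which this sketch ignores.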
This is awesome. Thanks for all the hard work to support R users. A few questions:
What is Also, heads up that the name could cause some potential confusion in the R community. R-Forge has already existed for years (it's an R-specific source control system). It's not as popular now that GitHub exists, but saying "download the package from r-forge" could be ambiguous. For the mass packaging, have you coordinated with the conda-forge R maintainers (@conda-forge/r)? They do a lot of work to maintain thousands of R packages. It's a lot of work to keep up with all the conda-forge migrations. |
@jdblischak Thanks! Really good question, thanks for asking! Most of this was implicit knowledge, but seeing as this issue is getting a bit of traction, I've updated that section now. I haven't been in touch with the conda-forge R maintainers except when trying to create recipes for packages - all my attempts failed; they were super helpful, but overall I agree, it's hard work. It also seems that the packages are not updated very regularly (https://anaconda.org/r/repo). For reference, see the edit of the initial post. With rattler-build and the R-universe API, it's been super easy to create package recipes, and I think it'll be feasible to have automated packaging without too many edge cases (but @wolfv will know much better than me whether that's the case - I'm a dreamer 😉). We also talked about automatic I didn't know about the other R-Forge, good to know about. I think @wolfv just made the name as a way to have language-specific forges (e.g. also see rust-forge).
Hey @jdblischak indeed thanks! I was working on more automatic recipe generation for R recipes from inside I think for certain ecosystems (such as potentially R) it makes sense to maintain them in a more centralized manner than The "forges" use the As far as recipe-generation goes I would like to come up with a generalized "patching" functionality so that the bulk of the recipe is generated, and then enhanced with patches (e.g. to add system-dependencies). |
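As a rough illustration of the "generate, then patch" idea, here is a minimal Python sketch. It assumes a recipe can be represented as nested dicts and lists, and that a patch is a partial recipe merged on top. The function `apply_patch`, the merge rules and the example version number are assumptions made for this example, not an existing rattler-build feature.

```python
import copy


def apply_patch(recipe: dict, patch: dict) -> dict:
    """Return a copy of `recipe` with `patch` merged in.

    Dicts are merged recursively, lists are extended (skipping duplicates),
    and any other patch value overrides the generated one.
    """
    result = copy.deepcopy(recipe)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = apply_patch(result[key], value)
        elif isinstance(value, list) and isinstance(result.get(key), list):
            result[key] = result[key] + [v for v in value if v not in result[key]]
        else:
            result[key] = copy.deepcopy(value)
    return result


# Auto-generated part (the version number is a placeholder).
generated = {
    "package": {"name": "r-xml2", "version": "1.0.0"},
    "requirements": {"host": ["r-base"], "run": ["r-base"]},
}
# Hand-written patch that adds a system dependency.
patch = {"requirements": {"host": ["libxml2"], "run": ["libxml2"]}}

patched = apply_patch(generated, patch)
print(patched["requirements"]["host"])  # ['r-base', 'libxml2']
```

Keeping patches as small separate files would mean the generated part can be refreshed on every upstream release while the manual additions survive.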
@wolfv I also think it might be worth getting in touch with the conda-forge R maintainers at some point, but I reckon it's probably better you than me - I simply don't know enough about the packaging process. |
@roaldarbol Thanks! It is much clearer now.
That is the "r" channel, parts of the "defaults" channel provided by the Anaconda developers. It has nothing to do with the community channel conda-forge.
@wolfv I agree there would be advantages to a centralized mono-repo approach. This has been discussed before, e.g. in bgruening/conda_r_skeleton_helper#48. But this would be a huge change, both technically and socially. On the technical side, we'd have to figure out how to apply the conda-forge migrations to this mono-repo. On the social side, we'd have to document that R users no longer submit new recipes to staged-recipes or open an issue on an individual feedstock, but instead must direct all their activities to the new mono-repo.
I also worry about duplication of effort. In addition to the existing CRAN skeleton for conda-build, grayskull now also supports R recipes. With the addition of Here's an old PR to the CRAN skeleton that attempted to directly parse the
That's how NixOS builds its R packages. The recipes are auto-generated from a script, and then system requirements and other patches are added afterwards: https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/r-modules/default.nix |
Ah, my bad! No problem on that front then. 😊
@wolfv Is the plan that |
@wolfv I've been thinking about this more, and I think the key is your use of most. Some R packages require much more maintenance. I'm thinking of packages like r-arrow and r-tiledb that require careful pinning to the correct version of their corresponding C++ library. Maintainers will want to be notified when their package has an update PR, and they will not want to give up their write-access (full disclosure: I am a maintainer of r-tiledb). But with the mono-repo approach, there is no way (that I am aware of) to only receive notifications when certain files are touched by a PR, or only grant write-access to specific files within a PR. As a first pass, I would recommend starting the mono-repo with all the conda-forge R feedstocks that only have |
I'll just try to parse out which separate issues I see in this conversation so we can create separate issues for them in the appropriate location:
Can I just say what I'm realising: Packaging is hard. Y'all are doing an amazing job. |
my 2 cents as a daily |
Correct. It is purely to inform end users. It is completely optional/voluntary, and R itself never parses it.
Correct. They used to maintain an explicit database with the system requirements mappings, sysreqsdb, but from the README of r-system-requirements, apparently that manual approach was too cumbersome.
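For readers unfamiliar with the pattern-based approach, here is a minimal Python sketch of how free-text `SystemRequirements` could be matched against regex rules. Only the `\bchrome\b` pattern comes from the pak output quoted in this thread; the other rules, the `RULES` layout and the `match_sysreqs` function are invented for illustration and do not reproduce the actual r-system-requirements database.

```python
import re

# Hypothetical rules: only the chrome pattern is taken from the thread,
# the libxml2/libcurl entries are made up for this example.
RULES = [
    {"name": "chrome", "patterns": [r"\bchrome\b"], "packages": []},
    {"name": "libxml2", "patterns": [r"\blibxml2\b"], "packages": ["libxml2"]},
    {"name": "libcurl", "patterns": [r"\blibcurl\b"], "packages": ["libcurl"]},
]


def match_sysreqs(system_requirements: str, rules=RULES) -> list[str]:
    """Return the names of all rules with a pattern matching the free-text field."""
    matched = []
    for rule in rules:
        if any(
            re.search(pattern, system_requirements, re.IGNORECASE)
            for pattern in rule["patterns"]
        ):
            matched.append(rule["name"])
    return matched


print(match_sysreqs("libxml2 (>= 2.6.3), libcurl: libcurl-devel (rpm)"))
# ['libxml2', 'libcurl']
```

Because the field is free text written by package authors, any such matching is best-effort and will miss requirements that are phrased unusually.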
Hard to say, especially build-time versus run-time. Looking at the manual for the R package {pak}, which uses r-system-requirements, it explicitly states that it doesn't attempt to distinguish between build-time and run-time. I attempted to do a quick analysis with their function
sysreqs <- pak::sysreqs_db_list(sysreqs_platform = "ubuntu-22.04")
str(subset(sysreqs, name == "curl"))
## Classes ‘tbl’ and 'data.frame': 0 obs. of 5 variables:
## $ name : chr
## $ patterns : list()
## $ packages : list()
## $ pre_install : list()
## $ post_install: list()
str(subset(sysreqs, name == "xml2"))
## Classes ‘tbl’ and 'data.frame': 0 obs. of 5 variables:
## $ name : chr
## $ patterns : list()
## $ packages : list()
## $ pre_install : list()
## $ post_install: list()
str(subset(sysreqs, name == "rmarkdown"))
## Classes ‘tbl’ and 'data.frame': 0 obs. of 5 variables:
## $ name : chr
## $ patterns : list()
## $ packages : list()
## $ pre_install : list()
## $ post_install: list()
str(subset(sysreqs, name == "chrome"))
## Classes ‘tbl’ and 'data.frame': 1 obs. of 5 variables:
## $ name : chr "chrome"
## $ patterns :List of 1
## ..$ : chr "\\bchrome\\b"
## $ packages :List of 1
## ..$ : NULL
## $ pre_install :List of 1
## ..$ : chr "[ $(which google-chrome) ] || apt-get install -y gnupg curl" "[ $(which google-chrome) ] || curl -fsSL -o /tmp/google-chrome.deb https://dl.google.com/linux/direct/google-ch"| __truncated__ "[ $(which google-chrome) ] || DEBIAN_FRONTEND='noninteractive' apt-get install -y /tmp/google-chrome.deb"
## $ post_install:List of 1
## ..$ : chr "rm -f /tmp/google-chrome.deb"
Anyways, one useful metric is how many packages require compilation. This will give you a sense of how many are trivial to build binaries for. You'll also want to investigate any packages with restrictive licenses.
x <- as.data.frame(available.packages())
table(x$NeedsCompilation)
##
## no yes
## 16267 4760
table(x$License_restricts_use == "yes")
##
## FALSE TRUE
## 9 3
You can also look at how many R packages that nixOS patches for a rough estimate of the number of packages that have system requirements, as well as those that they have marked as broken:
Just wanna say that It is solving all python package dependency issues that were a nightmare for developers. I really hope I'd like to thank all the contributors of this project, |
Not sure it'll be useful, but hopefully it can serve at least as a reference: I wrote a DESCRIPTION file parser at andystopia/cran-work. A crate in the repo tests the parser against the most recent version of every package's DESCRIPTION file (21821 files). Result: 0 errors.
I've worked on this a little more, and my repo, andystopia/cran-work, can now generate rattler-build files from CRAN & Bioconductor DESCRIPTION files directly, including historical versions of packages! The following command will generate an r-matrix directory with a build yaml contained within, and should be sufficient to build the latest Matrix package from CRAN.
cargo run --release -p description-to-rattler -- cran recipe Matrix --export
You can leave off --export if you just want the definition printed to stdout.
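To give a feel for what such a conversion involves, here is a minimal Python sketch that turns already-parsed DESCRIPTION fields into a recipe-like dict. It is not the output or logic of cran-work; the function names `split_deps` and `to_recipe`, the example package, and the choice to list the same packages under both `host` and `run` are assumptions for illustration. The `r-` prefix follows the conda-forge naming mentioned in this issue.

```python
import re

# Packages that ship with R itself, so they need no separate dependency.
BASE_PACKAGES = {
    "base", "compiler", "datasets", "graphics", "grDevices", "grid", "methods",
    "parallel", "splines", "stats", "stats4", "tcltk", "tools", "utils",
}


def split_deps(field: str) -> list[str]:
    """Split an Imports/Depends field into bare package names."""
    names = []
    for item in field.split(","):
        # Drop version constraints such as "(>= 1.0)".
        name = re.sub(r"\(.*?\)", "", item).strip()
        if name:
            names.append(name)
    return names


def to_recipe(desc: dict[str, str]) -> dict:
    """Build a recipe-like dict from parsed DESCRIPTION fields (sketch only)."""
    deps = split_deps(desc.get("Depends", "")) + split_deps(desc.get("Imports", ""))
    r_deps = sorted(
        {"r-" + d.lower() for d in deps if d != "R" and d not in BASE_PACKAGES}
    )
    requirements = ["r-base"] + r_deps
    return {
        "package": {"name": "r-" + desc["Package"].lower(), "version": desc["Version"]},
        "requirements": {"host": list(requirements), "run": list(requirements)},
    }


# Hypothetical package, invented for this example.
recipe = to_recipe({
    "Package": "ExamplePkg",
    "Version": "1.2.3",
    "Depends": "R (>= 3.5.0)",
    "Imports": "Matrix (>= 1.5), methods, dplyr",
})
print(recipe["package"]["name"])     # r-examplepkg
print(recipe["requirements"]["run"])  # ['r-base', 'r-dplyr', 'r-matrix']
```

The sketch drops version constraints entirely and leaves out the source URL, build script, compilers and tests, all of which a usable recipe would need.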
Problem description
This is a list of issues which need to be solved to have complete R support - not all are directly issues within pixi, but rather obstacles to a smooth pixi-based workflow.
Packaging
conda-forge or r-forge
Packages are available on conda-forge with the r- prefix (e.g. r-dplyr), so it's already ready to be used. However, not all packages are there, and the recipes are not super easy to create or maintain. rattler-build is the new alternative to conda-build, based on a new recipe format, and quite a bit of work has been put into covering edge cases. Packages uploaded with rattler-build are currently placed in r-forge. R-Universe offers a really nice API that's easy to parse, so the hope here is to have automated packaging of all the packages that are available on R-Universe. Once/if this can be done, I think the idea is to create it as an actual conda channel.
GUI
conda-forge
Workflow
conda-forge or r-forge
roaldarbol/rpix#2. Upload the rpix package to r-forge. Afterwards I'll implement features as needed.
pixi init to start with a template. #786
copier) - done, here's the template
Docs