CI improvements - Pyodide/WASM #18

Open
trallard opened this issue Dec 16, 2022 · 9 comments

@trallard
Member

trallard commented Dec 16, 2022

📝 Summary

Expand the CI support for cross-compiling to Pyodide/WebAssembly to at least five projects.

🚀 Tasks / Deliverables

TBD

📅 Estimated completion

24 months milestone

📋 Additional information

Status

Tip

This table has been brought over from pyodide/pyodide#3049 (comment)

| Package name | Out-of-tree WASM builds | Anaconda.org scheduled uploads |
|---|---|---|
| NumPy | numpy/numpy#25894, numpy/numpy#26564, numpy/numpy#26570 | numpy/numpy#26134, numpy/numpy#27353 |
| PyWavelets | PyWavelets/pywt#701, PyWavelets/pywt#744 | PyWavelets/pywt#710 |
| pandas | pandas-dev/pandas#57896 | pandas-dev/pandas#58647 |
| awkward and awkward-cpp | scikit-hep/awkward#2062 (not by me) | In progress at scikit-hep/awkward#3270 |
| scikit-learn | ✅ (improvement via scikit-learn/scikit-learn#29791 in progress) | Planned |
| scikit-image | ✅ (setup: scikit-image/scikit-image#7350, improvement: scikit-image/scikit-image#7525) | In progress at scikit-image/scikit-image#7440 |
| statsmodels | ✅ (setup: statsmodels/statsmodels#9270, improvement: statsmodels/statsmodels#9343) | MacPython/statsmodels-wheels#161 |
| Zarr | zarr-developers/zarr-python#1903, needs pyodide/pyodide#4817 to be released | Planned |
| numcodecs | zarr-developers/numcodecs#529, ready for review | Planned |
| SciPy | Planned | Planned |
| SymPy | sympy/sympy#27183 | sympy/sympy#27186 (implemented by a maintainer), python-flint (dependency of SymPy) WASM builds left – discussion underway in flintlib/python-flint#234 |
| Matplotlib | matplotlib/matplotlib#27870, being tracked in matplotlib/matplotlib#29093 (not implemented by me) | Planned in matplotlib/matplotlib#29093 |
| h5py and libhdf5 | h5py/h5py#2397 | Planned |
| PyTables | Planned | Planned |

@rgommers
Member

Aiming to meet this deliverable within the next 2-4 weeks. Several projects have support (NumPy, PyWavelets, pandas, scikit-learn), and others are in the pipeline (scikit-image, Zarr, Awkward, and hopefully also Matplotlib at least). A few others were started but are on hold due to higher-priority items.

Meeting the deliverable won't be the end of it, but we should switch to deploying working interactive docs for a few more projects first, to accelerate the feedback cycle.

@rgommers
Member

We're getting there! Thanks for adding the detailed issue tracker, @agriyakhetarpal.

@agriyakhetarpal
Member

Pyodide's alpha releases for 0.27 are now up, @rgommers – should we now look at zarr-developers/zarr-python#1903 again or wait a bit until we have the stable release a short while after?

@rgommers
Member

> should we now look at zarr-developers/zarr-python#1903 again or wait a bit until we have the stable release a short while after?

The action there is to make async tests for Zarr v3 work, which doesn't depend directly on that PR but (if I understand correctly) is infra work within Pyodide. If there's nothing higher on your prio list, trying to understand that in more detail and moving it forward would be useful I think.

@agriyakhetarpal
Member

Initially, this was slightly difficult back when I started with the Pyodide ecosystem, but we've now got statsmodels' support backported via statsmodels/statsmodels#9365 so that it could be fast-tracked for inclusion in a new v0.14.4 release, with no other changes, today :) Last month involved, and this month will involve, a bit of travel and conferences, so we should be able to close the "official" target of five projects down in early November (including Zarr, from the above discussion).

Here is a bit of extra context for any other potential readers besides Ralf and me:

  • SciPy's in-tree updates have resolved some downstream issues across scikit-learn, scikit-image, and statsmodels; they are always worth doing to stay close to upstream SciPy. SciPy could get out-of-tree CI support as its upstream FORTRAN 77 rewrites proceed and once the patch set shrinks in both size (number of SLOC) and intrusiveness (the two metrics can be considered related in this context). Hence, while the table mentions that out-of-tree CI builds are planned, in-tree updates coupled with occasional PRs whose patches go into SciPy upstream are a reasonable way to add support.
  • In a prior discussion, we decided against including h5py's out-of-tree builds for in-browser usability reasons.

Two questions on the above:

  1. Is it worth spending time occasionally backporting SciPy's upstream rewrites in Pyodide downstream and un-skipping WASM tests as a result? I feel the answer should be "yes", since it gives us a reasonably good picture of how well SciPy works and helps reduce turnaround time for in-tree updates (which follow SciPy's PyPI releases, twice a year). I don't have a set target in principle, but "occasionally" could mean "anything more than twice a year". These backports would be similar to how id_dist was Cythonised (patched in Pyodide now) and how LBFGSB was rewritten (not yet patched). One way to decide which rewrites to backport would be to see which and how many tests a particular rewrite allows us to un-skip, since the rewrites would be included in the next SciPy release anyway.
  2. If the emscripten-forge ecosystem is able to build updated versions of libhdf5 and h5py sometime down the line (they have something in progress right now), we can look into including the Emscripten-compiled libhdf5.a in the cross-build environment to make it available for out-of-tree linkage, similar to how NumPy includes libnpymath.a and the relevant header files in xbuildenv/site-packages-extras/? And when we unvendor packages' (and libraries') recipes, their updates will become faster because they will be decoupled from Pyodide.
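
As a rough sketch of the evaluation idea in the first question (ranking candidate rewrites by how many tests they would let us un-skip), something like the following could work. The `Rewrite` record, the `rank_backport_candidates` helper, and the numbers are all made up for illustration; none of them come from SciPy or Pyodide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rewrite:
    """A candidate upstream rewrite (hypothetical record, not a SciPy/Pyodide type)."""
    name: str
    unskipped_tests: int  # tests that could be un-skipped under WASM after backporting


def rank_backport_candidates(rewrites, min_unskipped=1):
    """Keep rewrites that un-skip at least `min_unskipped` tests, most impactful first."""
    useful = [r for r in rewrites if r.unskipped_tests >= min_unskipped]
    # Sort by descending test count, then by name so that ties are deterministic.
    return sorted(useful, key=lambda r: (-r.unskipped_tests, r.name))


# Placeholder numbers for illustration only; they are not measurements.
candidates = [
    Rewrite("rewrite_a", 12),
    Rewrite("rewrite_b", 0),
    Rewrite("rewrite_c", 40),
]
ranked = rank_backport_candidates(candidates)
print([r.name for r in ranked])
```

A rewrite that un-skips no tests drops out of the list entirely, which matches the reasoning that it would land with the next SciPy release anyway.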

Decoupling recipes in the medium term would also mean we have to bother a bit less with the first question. The rewrites get included in subsequent SciPy releases, which are not in sync with Pyodide releases, since the timelines have always been and will continue to be different. So a PR that is going to benefit, say, SciPy v1.16 users would be nice to backport to SciPy v1.15 in Pyodide if Pyodide has a release coming up before SciPy v1.16 is released. That said, I believe there are other reasons besides the difference in release timelines for why porting these rewrites is useful, which are covered in the question.
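
To make the timing argument concrete, here is a minimal sketch of the check it implies: a backport only reaches users earlier when a Pyodide release lands before the SciPy release that already contains the rewrite. The function name and the dates are hypothetical and are not actual release dates.

```python
from datetime import date


def backport_reaches_users_earlier(next_pyodide_release: date,
                                   next_scipy_release: date) -> bool:
    """True if Pyodide ships before the SciPy release that contains the rewrite."""
    return next_pyodide_release < next_scipy_release


# Hypothetical dates, for illustration only.
pyodide_date = date(2025, 1, 15)
scipy_date = date(2025, 6, 1)
print(backport_reaches_users_earlier(pyodide_date, scipy_date))
```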

@agriyakhetarpal
Member

SymPy added as a potential target as discussed on 11/10/2024.

@rgommers
Member

> we should be able to close the "official" target of five projects down in early November (including Zarr, from the above discussion).

Great to see that!

> Is it worth spending time occasionally backporting SciPy's upstream rewrites in Pyodide downstream and un-skipping WASM tests as a result? I feel the answer should be "yes",

I'd say probably not, since this is mostly extra work (and not just an hour or less) that is anyway going to land in Pyodide. I'd prefer to see time spent on more structural improvements.

> 2. we can look into including the Emscripten-compiled libhdf5.a in the cross-build environment to make it available for out-of-tree linkage, similar to how NumPy includes libnpymath.a and the relevant header files in xbuildenv/site-packages-extras/?

I really want to get rid of libnpymath.a - shared libraries that cross package boundaries are a really bad idea in Python wheels, and we've had a lot of trouble with it over the years. So my first inclination here is to say that this probably isn't a step in the right direction.

@rgommers
Member

It seems like we've met the deliverables here. There are a few more PRs that look close (e.g., the scikit-image wheels PR looks like the code is written and it's just waiting on Pyodide 0.27) and more improvements are always nice, but for the record let's declare victory here:) The issue can stay open for tracking purposes.

@agriyakhetarpal
Member

> more improvements are always nice, but for the record let's declare victory here:)

Yay! Here's to victory! Yes, I'd keep the issue open, too, since there are a few niceties that I'd like to clean up, such as pyodide/pyodide-actions#12, which is a nice-to-have but not urgent at all.
