Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build wheels on CI #619

Merged
merged 69 commits into from
Nov 6, 2023

Conversation

grlee77
Copy link
Contributor

@grlee77 grlee77 commented Oct 26, 2023

This MR includes the commits from #617, so that one should be merged first.

The purpose of this MR is to build and test wheels on RAPIDS CI infrastructure. Previously, cuCIM was using a custom manual wheel building process based on running:

./run build_python_package

which used dockcross/cibuildwheel. We would then manually upload the resulting wheel to PyPI using twine.

The changes in this MR are to enable automated wheel builds for future RAPIDS releases in a similar way to other RAPIDS projects.

Misc issues/questions for discussion

  • The build_wheels.sh script installs the YASM and the openslide libraries via system package managers. This is currently hard-coded in that script rather than using the dependencies.yaml mechanism being used elsewhere. Is this reasonable or is there a better way?
  • Following cuML's example, I added a separate ci/release/apply_wheel_modifications.sh script that updates the pyproject.toml as needed (to fixup version string and the CUDA version strings in the cucim package name and CuPy dependency)
  • I updated the cuCIM test suite to make a handful of tests that relied on imagecodecs to be skipped if it is not installed. This is because I did not find a pre-compiled wheel for it on aarch64 and when it tried to compile from source instead, it failed due to various missing system packages needed for the compilation.
  • Unlike most RAPIDS projects, which seem to use scikit-build, this project performs a separate CMake build via run build_local commands and copies the C++ libraries into the Python source tree. The pyproject.toml then just does a standard pure-python build, but will package the previously compiled libraries along with the other files.
  • The wheel tests are currently taking approximately 35-50 minutes to run. It should be possible to reduce that time by an order of magnitude if the CuPy kernels are cached across runs. Does any RAPIDS projects cache the compiled CuPy kernels in this way? (cached kernels end up in ~/.cupy/kernel_cache/ by default, although an alternative location can be specified via CUPY_CACHE_DIR)

@grlee77 grlee77 added feature request New feature or request non-breaking Introduces a non-breaking change labels Oct 26, 2023
@grlee77 grlee77 added this to the v23.12.00 milestone Oct 26, 2023
@grlee77 grlee77 requested review from a team as code owners October 26, 2023 17:43
@grlee77 grlee77 force-pushed the grelee/build-wheel-on-ci branch from 190b779 to 28fd183 Compare October 27, 2023 01:24
@grlee77 grlee77 requested a review from a team as a code owner October 27, 2023 02:15
@grlee77 grlee77 force-pushed the grelee/build-wheel-on-ci branch 2 times, most recently from 1b7b409 to 6f7c6ef Compare October 27, 2023 03:04
@grlee77 grlee77 changed the title [draft] build wheels on CI build wheels on CI Oct 27, 2023
@grlee77 grlee77 self-assigned this Oct 27, 2023
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving. I have a few comments but this is the last pass of review I think I need to give.

CONTRIBUTING.md Outdated Show resolved Hide resolved
ci/wheel_smoke_test.py Outdated Show resolved Hide resolved
dependencies.yaml Show resolved Hide resolved
- output_types: [conda]
packages:
- imagecodecs>=2021.6.8
- matplotlib-base
- openslide-python>=1.1.2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like imagecodecs above, it seems like openslide-python is pinned the same way for conda and requirements. Do you want to list that above in the [conda, requirements, pyproject] section?

Note that in most cases you want the same dependencies in requirements and pyproject because they both install from PyPI, so I'm questioning why we have both [requirements] and [requirements, pyproject] sections. I expect we really only need three sections: [conda, requirements, pyproject] for dependencies pinned the same way in conda/pip, [conda] for conda-only dependencies, and [requirements, pyproject] for pip-only dependencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed duplicate definitions. Install of imagecodecs and openslide-python is currently disabled at wheel test time, but we intend to fix that after this PR is merged (opened issue #626 to track)

packages:
- imagecodecs>=2021.6.8
- openslide-python>=1.1.2
# temporarily remove imagecodecs / openslide-python from wheel tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this code comment affects the review comments I left above. I don't see manylinux ARM builds of imagecodecs. I'll leave it to you to figure out the best course of action here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I need to open an issue to fix running tests when imagecodecs is installed (there is a single test case that fails on CI, but I couldn't reproduce using a wheel built on my local system). There was also an issue on arm64 where it tries to compile from source, but fails (only x86_64 wheels were available)

I went ahead and merged the requirements and pyproject sections and just comment it out in for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opened issue #626 to track

Copy link
Member

@divyegala divyegala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VERSION_CPP is an interesting solution. It does seem to be more concise than what we are doing elsewhere

ci/release/update-version.sh Outdated Show resolved Hide resolved
.github/workflows/build.yaml Outdated Show resolved Hide resolved
ci/build_wheel.sh Outdated Show resolved Hide resolved
ci/build_wheel.sh Outdated Show resolved Hide resolved
@grlee77 grlee77 force-pushed the grelee/build-wheel-on-ci branch from a041436 to f65b052 Compare November 3, 2023 22:41
Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Generally looks good! All the key changes are in place, there's just a little bit of cleanup to be done I think.

note: Looks like some formatting changes snuck in probably due to the change in the pre-commit config, but they're small and easily identified so not a big deal.

Comment on lines 16 to 17
# for CMake VERSION need to truncate any trailing 'a'
version_cpp=${version%a*}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (non-blocking): The alternative would be maintaining a single file and then stripping out the alpha in CMake. Now that it's done this way I don't know if it's worth changing, but it would let you carry around one less file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I did it this way because I am not familiar with the proper CMake syntax to do it. Happy to update it if you have suggestions on how the CMake code would look.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would probably do it (after the existing file(STRING...) command):

string(REGEX REPLACE "a.*$" "" VERSION ${VERSION})                                                                                                                                                        

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @vyasr, that seems to work! I pushed a commit to remove VERSION_CPP as well as simplify the CI scripts a bit (there was still writing of some unused VERSION file).

There should now be only a single top-level VERSION file (as well as symlink to it from python/cucim/src/cucim/VERSION)

package_dir="python/cucim"
package_src_dir="${package_dir}/src/${package_name}"

CMAKE_BUILD_TYPE="release"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: I was going to ask if this was really necessary since it's CMake's default, but now I see that this is for the build_local script. It's out of scope for this PR, but maybe that script should support a default since CMake does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a default, but it is "debug" rather than release! Could potentially change it if there is no objection from @gigony

    local subcommand="${1:-all}"
    local build_type="${2:-debug}"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave that up to you two decide. It does seem odd to me that the default would be a debug build though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raised as issue: #629

Comment on lines 39 to 40
echo "The package name and/or version was modified in the package source. The git diff is:"
git diff
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: JFYI we've removed this from every other repo at this point but you're welcome to keep it if you find it useful for debugging. It's been a long time since we got much use out of it.

Copy link
Contributor Author

@grlee77 grlee77 Nov 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now removed it

echo "The package name and/or version was modified in the package source. The git diff is:"
git diff

pip install --upgrade pip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (blocking): This shouldn't be necessary. Were you running into specific problems? If so, do we need to address them somewhere?

Copy link
Contributor Author

@grlee77 grlee77 Nov 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I was originally also installing a newer CMake etc here and was updating pip before doing that. I have now removed this line.

Comment on lines +44 to +49
# Install pip build dependencies (not yet using pyproject.toml)
rapids-dependency-file-generator \
--file_key "py_build" \
--output "requirements" \
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" | tee build_requirements.txt
pip install -r build_requirements.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (blocking): Is this still true? Now that #617 is merged you should be using pyproject.toml. That said, I don't know how the C++ build works when you do build_local. Are these dependencies somehow used there? That would be strange and require some gymnastics for library paths, but I suppose it's possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it still needs to be this way as from the pyproject.toml perspective this is a pure Python package. The pyproject build relies on an external build step (build_local script in this file) having already built and copied the corresponding C++ .so files into the python/cucim/src/cucim/clara folder.

I agree that this is a bit non-standard and we can potentially look into moving to a scikit-build based approach in the future. I don't have time to focus on that at the moment, though.

One benefit of the approach as it is now is that even Windows users can build the pure Python package using pip without having to have CMake and compile the C++ library (which supports linux only). They would be able to use all of cucim.skimage and part of cucim.core, but just would not have the I/O functions in cucim.clara available.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that there's a separate build_local run, what I'm confused about is what dependencies are being installed by pip that are used in that step? Is it just for CMake/ninja? If so, are these really "py_build" requirements?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what I see on CI:

# To make changes, edit dependencies.yaml and run `rapids-dependency-file-generator`.
cmake>=3.23.1,!=3.25.0
ninja
setuptools>=24.2.0
wheel

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying deleting these lines in PR: #628

Comment on lines +25 to +26
elseif (EXISTS /usr/lib/aarch64-linux-gnu/libopenslide.so)
set(OPENSLIDE_LIB_PATH /usr/lib/aarch64-linux-gnu/libopenslide.so)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: I can see why this change was needed in general, but it doesn't seem relevant to this PR. Was it just snuck in out of convenience?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is needed here or the arm64 wheel builds fail to find libopenslide and fail during the run build_local call. The use of libopenslide is only needed for C++ lib benchmarks and test cases and it isn't directly used by the Python wheel, but it has to be found to run the C++ build stage (There is not currently a CMake flag to disable building the benchmark binary or tests).

@jakirkham jakirkham requested a review from vyasr November 6, 2023 20:13
@jakirkham jakirkham removed the request for review from vyasr November 6, 2023 21:58
@jakirkham
Copy link
Member

In discussion with Vyas offline, we've agreed to go ahead and merge. We can follow up on anything else in subsequent PRs/issues

@jakirkham
Copy link
Member

Thanks Greg for the PR and everyone for the reviews! 🙏

@jakirkham
Copy link
Member

/merge

@rapids-bot rapids-bot bot merged commit 7d56b5e into rapidsai:branch-23.12 Nov 6, 2023
36 checks passed
rapids-bot bot pushed a commit that referenced this pull request Dec 4, 2023
…ests will run (#634)

This MR may initially fail during wheel testing due to #626, as a few test cases relying on these packages were not being run at the time #619 was merged.

Authors:
  - Gregory Lee (https://github.com/grlee77)
  - https://github.com/jakirkham

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - Gigon Bae (https://github.com/gigony)
  - https://github.com/jakirkham

URL: #634
@jakirkham jakirkham mentioned this pull request Dec 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request non-breaking Introduces a non-breaking change
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

7 participants