Commit
Merge branch 'main' into flip-launch-args
ksimpson-work authored Jan 1, 2025
2 parents 07311af + 856662f commit 7323cab
Showing 10 changed files with 49 additions and 43 deletions.
9 changes: 3 additions & 6 deletions .github/workflows/gh-build-and-test.yml
@@ -300,9 +300,9 @@ jobs:
- name: Run cuda.core tests
shell: bash --noprofile --norc -xeuo pipefail {0}
run: |
if [[ $SKIP_CUDA_BINDINGS_TEST == 1 ]]; then
if [[ ${{ matrix.python-version }} == "3.13" ]]; then
# TODO: remove this hack once cuda-python has a cp313 build
if [[ ${{ matrix.python-version }} == "3.13" ]]; then
if [[ $SKIP_CUDA_BINDINGS_TEST == 1 ]]; then
echo "Python 3.13 + cuda-python ${{ matrix.cuda-version }} is not supported, skipping the test..."
exit 0
fi
@@ -316,9 +316,6 @@ jobs:
popd
pushd ./cuda_core
# TODO: add requirements.txt for test deps?
pip install pytest
# TODO: add CuPy to test deps (which would require cuRAND)
# pip install "cupy-cuda${TEST_CUDA_MAJOR}x"
pip install -r "tests/requirements-cu${TEST_CUDA_MAJOR}.txt"
pytest -rxXs tests/
popd
32 changes: 13 additions & 19 deletions .github/workflows/triagelabel.yml
@@ -2,27 +2,21 @@ name: Add Triage Label

on:
issues:
types: [opened]
types:
- reopened
- opened

jobs:
triage:
runs-on: ubuntu-latest
permissions:
issues: write
steps:
- name: Check for existing labels
id: check_labels
uses: actions/github-script@v6
with:
script: |
const labels = await github.issues.listLabelsOnIssue({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number
});
return labels.data.length > 0;
- name: Add Triage Label
if: steps.check_labels.outputs.result == 'false'
uses: actions-ecosystem/action-add-labels@v1
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
labels: triage
- name: Add or check for existing labels
# add the triage label only if no label has been added
if: ${{ github.event.issue.labels[0] == null }}
run: gh issue edit "$NUMBER" --add-label "triage"
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GH_REPO: ${{ github.repository }}
NUMBER: ${{ github.event.issue.number }}
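For illustration only, the same guard — add the `triage` label only when an issue has no labels yet — can be written in Python against the GitHub REST API. This is a hedged sketch, not what the workflow runs (it uses the `gh` CLI above); the environment variables mirror the ones in the step.

```python
# Illustrative sketch only: the workflow above uses `gh issue edit`.
# This replays the same "label only if unlabeled" guard via the GitHub REST API.
import os
import requests

token = os.environ["GH_TOKEN"]    # token with permission to write issues
repo = os.environ["GH_REPO"]      # "owner/name", as in the workflow env
number = os.environ["NUMBER"]     # issue number from the event payload

headers = {
    "Authorization": f"Bearer {token}",
    "Accept": "application/vnd.github+json",
}
base = f"https://api.github.com/repos/{repo}/issues/{number}"

# Add the triage label only if no label has been added yet.
existing = requests.get(f"{base}/labels", headers=headers, timeout=30)
existing.raise_for_status()
if not existing.json():
    added = requests.post(
        f"{base}/labels", headers=headers, json={"labels": ["triage"]}, timeout=30
    )
    added.raise_for_status()
```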
2 changes: 2 additions & 0 deletions cuda_core/docs/source/install.md
@@ -12,6 +12,8 @@ dependencies are as follows:

[^1]: Including `cuda-python`.

`cuda.core` supports Python 3.9 - 3.12, on Linux (x86-64, arm64) and Windows (x86-64).

## Installing from PyPI

`cuda.core` works with `cuda.bindings` (part of `cuda-python`) 11 or 12. For example with CUDA 12:
2 changes: 1 addition & 1 deletion cuda_core/docs/source/release/0.1.0-notes.md
@@ -1,4 +1,4 @@
# `cuda.core` Release notes
# `cuda.core` v0.1.0 Release notes

Released on Nov 8, 2024

25 changes: 15 additions & 10 deletions cuda_core/docs/source/release/0.1.1-notes.md
@@ -1,25 +1,28 @@
# `cuda.core` Release notes
# `cuda.core` v0.1.1 Release notes

Released on Dec XX, 2024
Released on Dec 20, 2024

## Highlights

- Add `StridedMemoryView` and `@args_viewable_as_strided_memory` that provide a concrete
implementation of DLPack & CUDA Array Interface support.
- Add `Linker` that can link one or multiple `ObjectCode` instances generated by `Program`s. Under
the hood, it uses either the nvJitLink or cuLink APIs depending on the CUDA version detected
in the current environment.
- Add a `cuda.core.experimental.system` module for querying system- or process-wide information.
- Support TCC devices with a default synchronous memory resource to avoid the use of memory pools
- Add `Linker` that can link one or multiple `ObjectCode` instances generated by `Program`. Under
the hood, it uses either the nvJitLink or driver (`cuLink*`) APIs depending on the CUDA version
detected in the current environment.
- Support `pip install cuda-core`. Please see the Installation Guide for further details.

## New features

- Add a `cuda.core.experimental.system` module for querying system- or process-wide information.
- Add `LaunchConfig.cluster` to support thread block clusters on Hopper GPUs.

## Enhancements

- Ensure "ltoir" is a valid code type to `ObjectCode`.
- Improve test coverage.
- The internal handle held by `ObjectCode` is now lazily initialized upon first touch.
- Support TCC devices with a default synchronous memory resource to avoid the use of memory pools.
- Ensure `"ltoir"` is a valid code type to `ObjectCode`.
- Document the `__cuda_stream__` protocol.
- Improve test coverage & documentation cross-references.
- Enforce code formatting.

## Bug fixes
@@ -35,4 +35,6 @@ Released on Dec XX, 2024
not supported. This will be fixed in a future release.
- Some `LinkerOptions` are only available when using a modern version of CUDA. When using CUDA <12,
the backend is the cuLink api which supports only a subset of the options that nvjitlink does.
Further, some options aren't available on CUDA versions <12.6
Further, some options aren't available on CUDA versions <12.6.
- To use `cuda.core` with Python 3.13, it currently requires building `cuda-python` from source
prior to `pip install`. This extra step will be fixed soon.
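For orientation, a rough sketch of the compile-and-link flow these notes describe is below. Only the names come from the notes; the exact `Linker`/`LinkerOptions` signatures and compile arguments are assumptions, not something this diff confirms.

```python
# Rough sketch of the Program -> ObjectCode -> Linker flow named in the notes.
# The Linker/LinkerOptions signatures are assumed here, not taken from this diff.
from cuda.core.experimental import Device, Linker, LinkerOptions, Program

dev = Device(0)
dev.set_current()
arch = "".join(f"{i}" for i in dev.compute_capability)  # e.g. "80"

code = r'extern "C" __global__ void noop() {}'
prog = Program(code, code_type="c++")
obj = prog.compile(target_type="ptx")  # an ObjectCode instance

# Link one or more ObjectCode instances; per the notes, either nvJitLink or the
# driver cuLink* APIs are used under the hood depending on the detected CUDA version.
linker = Linker(obj, options=LinkerOptions(arch=f"sm_{arch}"))
linked = linker.link("cubin")
```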
9 changes: 5 additions & 4 deletions cuda_core/examples/saxpy.py
@@ -47,8 +47,9 @@
# prepare input/output
size = cp.uint64(64)
a = dtype(10)
x = cp.random.random(size, dtype=dtype)
y = cp.random.random(size, dtype=dtype)
rng = cp.random.default_rng()
x = rng.random(size, dtype=dtype)
y = rng.random(size, dtype=dtype)
out = cp.empty_like(x)
dev.sync() # cupy runs on a different stream from s, so sync before accessing

@@ -73,8 +74,8 @@
# prepare input
size = cp.uint64(128)
a = dtype(42)
x = cp.random.random(size, dtype=dtype)
y = cp.random.random(size, dtype=dtype)
x = rng.random(size, dtype=dtype)
y = rng.random(size, dtype=dtype)
dev.sync()

# prepare output
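The example now draws its random inputs from CuPy's Generator API (`cp.random.default_rng()`) instead of the legacy `cp.random.random`. A minimal standalone sketch of that pattern; the fixed seed is added here only for illustration:

```python
# Minimal sketch of the Generator-based API used above; the seed is illustrative.
import cupy as cp

dtype = cp.float32
rng = cp.random.default_rng(42)      # one Generator, reused for every draw
x = rng.random(64, dtype=dtype)      # uniform samples in [0, 1) on the GPU
y = rng.random(64, dtype=dtype)
```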
2 changes: 1 addition & 1 deletion cuda_core/examples/strided_memory_view.py
@@ -91,6 +91,7 @@
gpu_prog = Program(gpu_code, code_type="c++")
# To know the GPU's compute capability, we need to identify which GPU to use.
dev = Device(0)
dev.set_current()
arch = "".join(f"{i}" for i in dev.compute_capability)
mod = gpu_prog.compile(
target_type="cubin",
@@ -156,7 +157,6 @@ def my_func(arr, work_stream):

# This takes the GPU path
if cp:
dev.set_current()
s = dev.create_stream()
# Create input array on GPU
arr_gpu = cp.ones(1024, dtype=cp.int32)
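The fix above moves `dev.set_current()` up next to `Device(0)`, so the device is current before its compute capability is queried for compilation and before any stream is created on it. A minimal sketch of the resulting ordering, using only calls already shown in this example:

```python
# Sketch of the ordering enforced by the fix: make the device current first,
# then query it and create streams on it.
from cuda.core.experimental import Device

dev = Device(0)
dev.set_current()                     # must happen before the device is used
arch = "".join(f"{i}" for i in dev.compute_capability)  # e.g. "80"
s = dev.create_stream()               # created on the now-current device
```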
5 changes: 3 additions & 2 deletions cuda_core/examples/vector_add.py
@@ -42,8 +42,9 @@

# prepare input/output
size = 50000
a = cp.random.random(size, dtype=dtype)
b = cp.random.random(size, dtype=dtype)
rng = cp.random.default_rng()
a = rng.random(size, dtype=dtype)
b = rng.random(size, dtype=dtype)
c = cp.empty_like(a)

# cupy runs on a different stream from s, so sync before accessing
3 changes: 3 additions & 0 deletions cuda_core/tests/requirements-cu11.txt
@@ -0,0 +1,3 @@
pytest
# TODO: remove this hack once cupy has a cp313 build
cupy-cuda11x; python_version < "3.13"
3 changes: 3 additions & 0 deletions cuda_core/tests/requirements-cu12.txt
@@ -0,0 +1,3 @@
pytest
# TODO: remove this hack once cupy has a cp313 build
cupy-cuda12x; python_version < "3.13"
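The `; python_version < "3.13"` suffix in both requirements files is a standard environment marker (PEP 508): pip installs the CuPy wheel only on interpreters older than 3.13, which is the cp313 hack the TODO refers to. A small sketch of how such a marker evaluates, using the `packaging` library:

```python
# Evaluate the environment marker used in the requirements files above.
from packaging.markers import Marker

marker = Marker('python_version < "3.13"')
print(marker.evaluate())  # True on Python 3.9-3.12, False on 3.13 and newer
```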
