Releases: nv-legate/cupynumeric

v22.10.00

13 Oct 23:53
81ad156

The biggest change in Release 22.10 is a new build infrastructure using CMake and scikit-build. The new build system brings several benefits including robust build dependency tracking and compliance with Python site-packages. This release includes several new search and indexing operators, fixes for several performance and correctness bugs, and provenance tracking for top-level and ndarray routines in execution profiles.

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🚀 New Features

  • Argwhere and flatnonzero by @mfoerste4 in #525
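
For readers unfamiliar with the two new search operators, here is a minimal sketch of what they compute. It uses plain NumPy so that it runs anywhere; the assumption (not verified here) is that replacing the import with `import cunumeric as np` gives the same results under cuNumeric.

```python
import numpy as np  # assumption: `import cunumeric as np` is the drop-in swap

a = np.array([[0, 3, 0],
              [4, 0, 5]])

# argwhere returns one row of N-d indices per nonzero element.
coords = np.argwhere(a)

# flatnonzero returns positions of nonzero elements in the flattened array.
flat = np.flatnonzero(a)

print(coords.tolist())  # [[0, 1], [1, 0], [1, 2]]
print(flat.tolist())    # [1, 3, 5]
```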

🛠️ Improvements

  • adding support for array shape () passed as an index argument in advanced indexing by @ipdemes in #486
  • Refactor test driver for cpu/gpu sharding by @bryevdv in #451
  • Collate test output to allow workers > 1 with verbose output by @bryevdv in #507
  • Ensure test.py --use flag fully overrides USE_* envvars by @manopapad in #524
  • Enhance two integration tests by @robinw0928 in #511
  • Add typing to array.py by @bryevdv in #478
  • Update test runner for osx by @bryevdv in #529
  • Don't blindly trust user-supplied bincount.minlength by @manopapad in #523
  • Make reduced-precision cuBLAS mode opt-in by @manopapad in #519
  • Fix reciprocal tests for zero values and improve test value customization (#467) by @marcinz in #537
  • Refactor test runner to support more pinning options by @bryevdv in #535
  • Remove dead code in bincount by @magnatelee in #546
  • Make the validation condition for random distributions lenient by @magnatelee in #550
  • src/cunumeric: handle high number of bins in GPU bincount by @rohany in #526
  • Construct NumPy arrays correctly from 0D deferred arrays backed by region fields by @magnatelee in #551
  • Collect test failure details at the end by @bryevdv in #556
  • Simplify some thunk conversion helpers by @manopapad in #553
  • Fix a compiler warning by @magnatelee in #555
  • Add option to disable CPU pinning in tests by @bryevdv in #558
  • Use the new mapper registration to enable detailed mapper logging by @magnatelee in #570
  • src/cunumeric/search: make nonzero not always allocate SYS_MEM buffers by @rohany in #572
  • add negative test case in test_array_split.py by @xialu00 in #545
  • add some test cases for test_arg_reduce.py by @xialu00 in #575
  • Testcase-add test cases for test_flip and test_indices by @xialu00 in #579
  • Refactor scalar reductions to use common execution policy by @jjwilke in #573
  • Sanitize k for the eye operator by @magnatelee in #586
  • Add CMake build for C++ and scikit-build infrastructure for Python package installation by @jjwilke in #514
  • Enhance test_block.py and test_eye.py by @robinw0928 in #578
  • Testcase add test cases for test_fill.py and test_ndim.py by @xialu00 in #588
  • Remove run dependency on curand by @marcinz in #520
  • Use Legion Fills when possible by @manopapad in #604
  • Support building with GASNet-Ex and MPI backends by @manopapad in #610
  • Provenance tracking for cuNumeric operators by @magnatelee in #596
  • Fix tests utils to make --directory work correctly. by @robinw0928 in #592
  • Fix a compiler warning by @magnatelee in #594
  • Enhance test_diag_indices.py and test_flatten.py. by @robinw0928 in #609
  • cuNumeric doesn't need nested provenance tracking by @magnatelee in #617
  • Add RuntimeError exception to legate.time by @robinw0928 in #618
  • Stop instantiating min and max reduction ops for complex types by @magnatelee in #621
  • Mark temporary conversion outputs as linear for eager storage recycling by @magnatelee in #608
  • Make the negative test on fill robust across Python versions by @magnatelee in #619
  • Enhance mask_indices and move_axis by @robinw0928 in #622
  • src/cunumeric/matrix: stop including coll.h in solve_template.inl by @rohany in #620

🐛 Bug Fixes

  • Fix performance bugs in scalar reductions by @magnatelee in #509
  • Don't use internal LAPACK function names by @manopapad in #522
  • Bug fixes for advanced indexing by @magnatelee in #532
  • Handle the case where LAPACK_*potrf is a macro, not a function by @manopapad in #527
  • fix mypy issue w/ np methods by @bryevdv in #542
  • Fix buggy complex-to-bool conversions and add correctness tests for astype by @magnatelee in #549
  • fixing advanced indexing operation for empty arrays by @ipdemes in #504
  • Do not link curand by @marcinz in #541
  • Fixing issues with advanced_indexing_kernel by @ipdemes in #557
  • fixing another corner case for advanced indexing by @ipdemes in #554
  • Fix OSX test shard generation by @bryevdv in #563
  • fix error print in test_unary_ufunc by @jjwilke in #566
  • Add NAN handling to convert() needed for some prefix routines with integer outputs. by @rkarim2 in #502
  • Fixing logic for slicing by @ipdemes in #574
  • Fix linalg.solve when inputs are scalars by @magnatelee in #585
  • Allow casting in cn.dot, to match numpy's behavior by @manopapad in #598
  • Add linalg.solve to the cmake build by @magnatelee in #603
  • Invoke eye with read-write privilege, not write-discard by @manopapad in #616
  • Fix a bug in scalar reduction launching kernels with empty domains by @magnatelee in #606

📖 Documentation

  • Added note to prefix documentation for corner cases where cunumeric results can diverge from numpy by @rkarim2 in #528
  • updating documentation by @ipdemes in #614
  • Add missing docs symlink by @bryevdv in #635

v22.08.00

09 Aug 03:38
ece6585

Release 22.08.00 features a variety of random distribution implementations (backed by cuRAND), distributed prefix scan operators, and a complete implementation of sorting for multi-node multi-CPU execution. This release also includes several quality-of-life changes and bug fixes, including type annotations for all but one Python module, improvements to the parallel test driver, fixes for several operators when inputs are empty, and proper handling of ndarrays passed as array sizes or indices.
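
As an illustration of the prefix scan and sorting operators mentioned above, the following sketch uses the NumPy names for these operations. It is written against plain NumPy so it is runnable as-is; swapping in `import cunumeric as np` is an assumption about how a cuNumeric user would run it.

```python
import numpy as np  # assumption: `import cunumeric as np` is the drop-in swap

x = np.array([3, 1, 4, 1, 5])

# Inclusive prefix scans: element i combines x[0..i].
running_sum = np.cumsum(x)
running_prod = np.cumprod(x)

# Sorting returns a new, ascending copy.
ordered = np.sort(x)

print(running_sum.tolist())   # [3, 4, 8, 9, 14]
print(running_prod.tolist())  # [3, 3, 12, 12, 60]
print(ordered.tolist())       # [1, 1, 3, 4, 5]
```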

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

Full Changelog: v22.05.02...v22.08.00

v22.05.02

21 Jun 10:52
8b163e6

This hotfix release fixes issues in conda recipes.

What's Changed

  • Cherry pick: Update conda requirements (#383) by @marcinz in #406
  • Cherry pick: Set cuda virtual package as hard run requirement for conda gpu package (#398) by @marcinz in #407
  • Cherry pick: Fix nargs for report:dump-csv (#400) by @marcinz in #408
  • Re-freezing conda compiler versions by @m3vaz in #415

Full Changelog: v22.05.01...v22.05.02

v22.05.01

16 Jun 20:44
7fcbf60

This hotfix release updates the conda build recipe to make the cuNumeric package depend on the right version of NumPy and also fixes a bug in the command-line argument parser.

Full Changelog: v22.05.00...v22.05.01

v22.05.00

07 Jun 03:36
0a642e8

Release 22.05 features complete support for advanced indexing and related indexing routines (compress and take), a multi-node multi-GPU sorting implementation for multi-dimensional ndarrays, window functions, several matrix/tensor operations (trace, matrix_power, multi_dot, and einsum_path) and primitive support for FFT on a single GPU using cuFFT.
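
The indexing and matrix routines named above can be sketched as follows. The example uses plain NumPy, on the assumption that the same calls work after switching to `import cunumeric as np`; the array values are made up for illustration.

```python
import numpy as np  # assumption: `import cunumeric as np` is the drop-in swap

m = np.arange(12).reshape(3, 4)  # rows: [0..3], [4..7], [8..11]

# Advanced indexing with integer arrays and with a boolean mask.
picked = m[[0, 2], [1, 3]]   # elements at (0, 1) and (2, 3)
masked = m[m % 5 == 0]       # all multiples of 5, flattened

# take and compress select along an axis by index or by boolean condition.
taken = np.take(m, [0, 2], axis=1)
kept = np.compress([True, False, True], m, axis=0)

# Matrix routines on a small square matrix.
sq = np.array([[1, 1], [0, 1]])
tr = np.trace(sq)
powered = np.linalg.matrix_power(sq, 3)
chained = np.linalg.multi_dot([sq, sq, sq])

print(picked.tolist())   # [1, 11]
print(masked.tolist())   # [0, 5, 10]
print(powered.tolist())  # [[1, 3], [0, 1]]
```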

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

Full Changelog: v22.03.00...v22.05.00

v22.03.00

05 Apr 00:35
5e0e6b3

Release 22.03 adds several new features, including np.repeat, np.unique, np.inner, np.outer, and 35 new universal functions (ufuncs). In this release, we have also significantly revised and refactored tensor operations to make them comprehensive. Preliminary support for 1D array sorting for multi-GPU execution is available. (CPU and OpenMP paths are still single-processor only.) We have also made performance improvements for np.convolve and np.tril/triu for GPU execution. Finally, we have added a tool that reports cuNumeric’s API coverage for a given NumPy program execution. (For usage, please refer to “Measuring API coverage” in the cuNumeric documentation.)
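
A short sketch of the four array routines named above, using plain NumPy. Running it under cuNumeric by changing the import to `import cunumeric as np` is an assumption, and the input values are illustrative only.

```python
import numpy as np  # assumption: `import cunumeric as np` is the drop-in swap

v = np.array([2, 1, 2])

repeated = np.repeat(v, 2)              # each element duplicated in place
distinct = np.unique(v)                 # sorted unique values
dot = np.inner(v, v)                    # sum of elementwise products
table = np.outer(v, np.array([1, 10]))  # all pairwise products

print(repeated.tolist())  # [2, 2, 1, 1, 2, 2]
print(distinct.tolist())  # [1, 2]
print(int(dot))           # 9
print(table.tolist())     # [[2, 20], [1, 10], [2, 20]]
```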

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

Documentation

  • Add docstrings to ndarray methods by @bryevdv in #205
  • Clean up Sphinx warnings by @bryevdv in #202
  • adding versions to the documentation by @ipdemes in #198
  • adding script for comparing API coverage + table at the documentation page by @ipdemes in #193
  • User facing documentation for API usage tool by @bryevdv in #262

Full Changelog: v22.01.00...v22.03.00

v22.01.00

10 Feb 02:26
27a3248

Release 22.01 adds support for einsum expressions, logic functions and a subset of indexing and array manipulation routines.
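
To show what an einsum expression looks like in practice, here is a minimal sketch in plain NumPy. The assumption is that the same subscript strings are accepted after switching to `import cunumeric as np`; the logic function `allclose` is used only to cross-check the result.

```python
import numpy as np  # assumption: `import cunumeric as np` is the drop-in swap

a = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
b = np.arange(6).reshape(3, 2)  # [[0, 1], [2, 3], [4, 5]]

# "ij,jk->ik" contracts over the shared index j: a matrix product.
product = np.einsum("ij,jk->ik", a, b)

# "ij->ji" permutes axes; "ij->i" sums over the dropped index j.
transposed = np.einsum("ij->ji", a)
row_sums = np.einsum("ij->i", a)

# A logic function confirms the einsum result matches the @ operator.
same = bool(np.allclose(product, a @ b))

print(product.tolist())   # [[10, 13], [28, 40]]
print(row_sums.tolist())  # [3, 12]
```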

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

Full Changelog: v21.11.00...v22.01.00

v21.11.00

09 Nov 02:33
1270b3c

This is the initial public alpha release of cuNumeric, an aspiring drop-in replacement for NumPy at scale.

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

Full Changelog: https://github.com/nv-legate/cunumeric/commits/v21.11.00