v22.08.00
Release 22.08.00 features a variety of random distribution implementations (backed by cuRAND), distributed prefix scan operators, and a complete implementation of sorting for multi-node multi-CPU execution. This release also includes several quality-of-life changes and bug fixes, including type annotations for all but one Python module, improvements to the parallel test driver, fixes for several operators when inputs are empty, and proper handling of ndarrays passed as array sizes or indices.
Conda packages for this release are available at https://anaconda.org/legate/cunumeric.
New Features
- Adding support for ND output regions in Advanced Indexing task by @ipdemes in #370
- added support for 'searchsorted' by @mfoerste4 in #414
- np.packbits and np.unpackbits by @magnatelee in #427
- Implementation of atleast_{1,2,3}d by @sbak5 in #404
- Implementing cunumeric.random.BitGenerator by @fduguet-nv in #254
- Adding support for some simple _indices routines by @ipdemes in #417
- adding mask_indices routine by @ipdemes in #426
- Random advanced distributions by @fduguet-nv in #470
- Distributed nd sort for cpu/omp by @mfoerste4 in #437
- Initial implementation of scan routines by @rkarim2 in #425
- Adding support for take_along_axis and put_along_axis by @ipdemes in #436
- cunumeric.ndim by @magnatelee in #495
- Add support for curand conda package build (cherry pick #510) by @marcinz in #512
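A brief, hedged sketch of how a few of the routines listed above might be called. It assumes the new functions follow the signatures of their NumPy counterparts, which these notes do not spell out. The fallback import is a convenience added for this example so the snippet also runs where cuNumeric is not installed; it is not part of the release.

```python
# Assumption: cuNumeric mirrors the NumPy signatures used below.
# The NumPy fallback is only for environments without cuNumeric.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

# searchsorted: leftmost insertion points into a sorted array
sorted_vals = np.array([1, 3, 5, 7])
pos = np.searchsorted(sorted_vals, np.array([4, 7]))

# packbits / unpackbits: round-trip eight bits through one uint8 byte
bits = np.array([1, 0, 0, 0, 0, 0, 0, 1], dtype=np.uint8)
packed = np.packbits(bits)
unpacked = np.unpackbits(packed)

# atleast_2d: promote a 1-D array to a single-row 2-D array
row = np.atleast_2d(np.array([1, 2, 3]))

# scan routine: inclusive prefix sum
running = np.cumsum(np.array([1, 2, 3, 4]))

# take_along_axis: pick one element per row using an index array
grid = np.array([[10, 30, 20], [60, 40, 50]])
idx = np.array([[1], [0]])
picked = np.take_along_axis(grid, idx, axis=1)

# ndim: number of dimensions
rank = np.ndim(grid)
```

Because the calls use only NumPy-style names, the same script can be checked against NumPy first and then run under cuNumeric unchanged.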
Improvements
- Don't run the resolution logic if the arrays have the same dtype by @magnatelee in #389
- Set cuda virtual package as hard run requirement for gpu conda package by @m3vaz in #398
- First pass mypy typing by @bryevdv in #387
- Generalize Dict to Mapping for newer versions of mypy by @jjwilke in #405
- Add support for using cupy in sort.py by @robinw0928 in #395
- Refactor test.py by @bryevdv in #378
- Use Numpy axis normalizations where possible by @bryevdv in #419
- More mypy by @bryevdv in #413
- adding bounds check for advanced indexing by @ipdemes in #397
- Report Elapsed Time in cholesky's output by @SeyedMir in #423
- Support -vv for more verbose test output by @bryevdv in #432
- Add typing to runtime.py by @bryevdv in #428
- Update compress/take tests for pytest by @bryevdv in #435
- Project down to a 1D store for the scalar reduction output by @magnatelee in #455
- Fallback to self = np.ndarray when necessary by @bryevdv in #431
- Add types to thunk modules by @bryevdv in #438
- allclose detail + misc tests improvements by @bryevdv in #457
- cunumeric.random - Adding Module-scoped functions by @fduguet-nv in #481
- Activate the NumPy fallback for cunumeric.random in CPU build by @magnatelee in #485
- Legacy generators for cpu build by @magnatelee in #487
- Allow CPU build to optionally use cuRAND by @magnatelee in #498
- Sanitize shapes in ndarray's constructor by @magnatelee in #496
- src/cunumeric/sort: stop using std::{inclusive, exclusive}_scan by @rohany in #499
- Update conda requirements by @manopapad in #383
- Handle dtype/casting/out properly in contractions by @manopapad in #402
- Missing / overzealous check_eager_args calls by @manopapad in #465
- Strengthen some types by @manopapad in #468
Bug Fixes
- Add missing includes to aid intellisense providers by @trxcllnt in #382
- Proper exception handling for cholesky by @magnatelee in #391
- Fixes for building with setup.py outside conda, primarily Mac by @jjwilke in #394
- Use the right API to check if the store is unbound by @magnatelee in #399
- Fix nargs for report:dump-csv by @bryevdv in #400
- Handle empty outputs correctly in advanced indexing task by @magnatelee in #396
- Fall back to NumPy in array_function and array_ufunc by @magnatelee in #424
- Fix for legate data interface by @magnatelee in #429
- Fix test_floating.py test to call sys.exit by @marcinz in #433
- Make missing pynvml an error for GPU tests by @bryevdv in #441
- Make the NumPy fallback work correctly in randint by @magnatelee in #450
- Squeeze fix by @magnatelee in #448
- Correctly prune out empty tasks in binary reduction by @magnatelee in #453
- Minor fix for indexing routines by @magnatelee in #452
- Make DeferredArray.reshape always return a deferred array by @magnatelee in #454
- Re-freezing conda compiler versions (#415) by @m3vaz in #449
- Fix for floating point predicates by @magnatelee in #466
- markdown version fix by @ipdemes in #459
- Fixup typing regressions by @bryevdv in #471
- Remove ill-defined advanced indexing test case by @magnatelee in #484
- Handle empty inputs correctly in local scan tasks by @magnatelee in #491
- Handle an unknown in a tuple correctly in reshape by @magnatelee in #490
- fix mismatched size_t/uint64_t types by @jjwilke in #475
- Allow scalar cunumeric ndarrays as array indices by @manopapad in #479
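Two of the fixes above (empty inputs in scan tasks, and scalar ndarrays as indices) can be exercised with a short script. This is a minimal sketch that assumes the fixed behavior matches NumPy's; the arrays and variable names are illustrative only, and the NumPy fallback import is added here purely so the snippet runs without cuNumeric.

```python
# Assumption: after these fixes cuNumeric matches NumPy for the cases below.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

# A scan over an empty input should yield an empty result, not an error
empty_scan = np.cumsum(np.array([], dtype=np.int64))

# A zero-dimensional (scalar) ndarray used as an index selects one element
values = np.array([10, 20, 30, 40])
scalar_index = np.array(2)
element = values[scalar_index]
```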
Documentation
- adding new version for documentation by @ipdemes in #447
- Updates to api_compare.py by @bryevdv in #456
- Be stricter applying CuWrapperMetadata by @bryevdv in #463
- Add custom nitpicky ref checks for cunumeric APIs by @bryevdv in #462
- Docs coverage check by @bryevdv in #469
- Fix the API reference for random functions and scan operators by @magnatelee in #497
New Contributors
- @jjwilke made their first contribution in #394
- @SeyedMir made their first contribution in #423
- @fduguet-nv made their first contribution in #254
- @rkarim2 made their first contribution in #425
- @rohany made their first contribution in #499
Full Changelog: v22.05.02...v22.08.00