v23.01.00
This release introduces support for the `put` and `putmask` operations, adds an optimized implementation for the common case of advanced indexing with a single (possibly broadcasted) boolean array, includes more information in the tags of unary/binary operations on profiles (for easier cross-referencing with the source script), and adds some small improvements to OpenMP execution.
Conda packages for this release are available at https://anaconda.org/legate/cunumeric.
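As a rough illustration of the operations named above, here is a minimal sketch of `put`, `putmask`, and indexing with a single boolean array. It assumes cuNumeric mirrors the NumPy signatures for these calls and falls back to NumPy itself when cuNumeric is not installed; the array names and values are made up for the example.

```python
try:
    import cunumeric as np  # assumption: cuNumeric v23.01.00 or later is installed
except ImportError:
    import numpy as np  # fallback so the sketch still runs without cuNumeric

# put: overwrite the elements at the given flat indices.
a = np.arange(6)
np.put(a, [0, 2], [-1, -2])

# putmask: overwrite the elements where the mask is True.
b = np.arange(6)
np.putmask(b, b > 3, 0)

# Advanced indexing with a single boolean array: select rows 0 and 2.
c = np.arange(12).reshape(3, 4)
row_mask = np.array([True, False, True])
selected = c[row_mask]

print(a)         # [-1  1 -2  3  4  5]
print(b)         # [0 1 2 3 0 0]
print(selected)  # rows 0 and 2 of c
```

Running the script under the `legate` launcher instead of plain `python` should be the only change needed to exercise the cuNumeric code paths, assuming a working installation.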
What's Changed
🐛 Bug Fixes
- Make the code compile with bounds checks by @magnatelee in #648
- MatVec & MatVecMul use reduction stores, not outputs by @manopapad in #646
- Set default generator based on whether ninja is available by @jjwilke in #602
- Allow args to be passed by position and name in auto_convert by @manopapad in #640
- Force positive values for log and sqrt tests by @jjwilke in #580
- Eliminate empty kernel launch in `cunumeric.unique` by @magnatelee in #675
- Make `install.py` reconfigure editable installs when build type changes by @trxcllnt in #670
- Fix for #684 by @magnatelee in #686
- Follow up on PR #671 by @ipdemes in #677
- More argument checks for `bincount` by @magnatelee in #711
- Fix a typo in unique.cu indexing by @manopapad in #713
- guard all2all from empty transfer by @mfoerste4 in #727
- src/cunumeric/item: add openmp variants for write/read tasks by @rohany in #740
- Fix CI failures due to numpy 1.24 upgrade by @manopapad in #745
- Fix timing for CuPy tests by @manopapad in #747
- Don't turn on cuNumeric debug checks on debug-rel builds by @manopapad in #753
- Move `pip uninstall` step before CMake is run instead of after. by @trxcllnt in #760
- Force conda version of cutensor by @marcinz in #765
- handle numpy 'builtins' properly for coverage by @bryevdv in #766
🚀 New Features
🛠️ Improvements
- Move test driver code to legate.core by @bryevdv in #627
- Remove --install-dir option by @bryevdv in #656
- Updates for new script-based conda env generation by @manopapad in #651
- Log operator names of unary and binary operations using annotations by @magnatelee in #679
- Regenerate `install_info.py` on every build by @trxcllnt in #705
- Fixes for buffer allocations by @magnatelee in #706
- Clean up the basic build instructions by @manopapad in #741
- Refactor benchmarks by @manopapad in #567
- Improving performance for some special cases of advanced indexing by @ipdemes in #731
- Pass `CMAKE_GENERATOR` to scikit-build by @trxcllnt in #750
- Change the default CPU architecture to haswell by @marcinz in #762
Full Changelog: v22.10.00...v23.01.00