Expected behavior
Both iadd and add(x,y,out=x) should reuse the space occupied by a and not make a new copy. In the stack trace below, it appears the code is doing this:
t = copy(a)
t = task(a + b)
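For reference, the in-place behavior being asked for can be checked against plain NumPy. This is only a sketch: it uses NumPy rather than cuNumeric, the helper `data_pointer` is a name made up for this illustration, and it says nothing about how cuNumeric manages its stores internally.

```python
import numpy as np


def data_pointer(arr):
    """Return the address of the first element of arr's buffer (illustrative helper)."""
    return arr.__array_interface__["data"][0]


a = np.arange(6, dtype=np.float64)
b = np.ones(6, dtype=np.float64)

before = data_pointer(a)

a += b                        # in-place add via ndarray.__iadd__
after_iadd = data_pointer(a)

np.add(a, b, out=a)           # explicit out= form, writing back into a
after_out = data_pointer(a)

# In NumPy, both forms write into the existing buffer of a.
print(before == after_iadd == after_out)
```

In NumPy neither form allocates a new buffer for `a`, which is the behavior the report expects from the equivalent cuNumeric calls.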
Observed behavior
This code gets to DeferredArray.binary_op. Here's a code snippet:
lhs = self.base
src1 = src1._copy_if_overlapping(self)  # Is this line right?
rhs1 = src1._broadcast(lhs.shape)
src2 = src2._copy_if_overlapping(self)
rhs2 = src2._broadcast(lhs.shape)
The 2nd line compares a to a, confirms they are overlapping and copies the array into a new thunk.
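One way to picture the distinction is to separate "source and destination are the same view" from "source and destination partially overlap". The sketch below is a toy model, not cuNumeric code: `View`, `overlaps`, and `needs_copy` are hypothetical names, and views are reduced to half-open ranges of a buffer. It assumes an elementwise operation, where output element i depends only on input element i, so an identical view can be updated in place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    """Hypothetical stand-in for an array view: a half-open range [start, stop) of a buffer."""
    buffer_id: int
    start: int
    stop: int


def overlaps(src: View, dst: View) -> bool:
    """True if both views touch at least one common element of the same buffer."""
    return (
        src.buffer_id == dst.buffer_id
        and src.start < dst.stop
        and dst.start < src.stop
    )


def needs_copy(src: View, dst: View) -> bool:
    """Decide whether src must be copied before an elementwise write into dst."""
    # Identical views: element i is read before element i is written,
    # so the elementwise update is safe without a temporary.
    if src == dst:
        return False
    # Shifted or partial overlap: a write may clobber an element that
    # is still to be read, so a temporary copy is required.
    return overlaps(src, dst)


a = View(buffer_id=0, start=0, stop=8)

print(needs_copy(a, a))                          # False: a compared with itself
print(needs_copy(View(0, 1, 8), View(0, 0, 7)))  # True: shifted views of one buffer
print(needs_copy(View(1, 0, 8), a))              # False: different buffers
```

Under this model, comparing `a` with `a` would skip the copy, while shifted views of the same buffer would still get one. Whether the same rule is appropriate inside DeferredArray.binary_op is an open question for the maintainers.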
Software versions
On Sapling2 at Stanford
$ legate-issue
Python : 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
Platform : Linux-5.4.0-169-generic-x86_64-with-glibc2.31
Legion : legion-24.03.0
Legate : 24.01.00.dev+38.g90944d7
Cunumeric : 24.01.00.dev+32.g364e95dc.dirty
Numpy : 1.26.4
Scipy : 1.13.1
Numba : 0.59.1
CTK package : cuda-version-11.7-h67201e3_3 (conda-forge)
GPU driver : 535.54.03
GPU devices :
GPU 0: Tesla P100-SXM2-16GB
GPU 1: Tesla P100-SXM2-16GB
GPU 2: Tesla P100-SXM2-16GB
GPU 3: Tesla P100-SXM2-16GB
Jupyter notebook / Jupyter Lab version
No response
Example code or instructions
Stack traceback or browser console output
It's the create_empty_thunk that looks strange to me, and the copy.copy is just more unneeded work.