Why does `Duplicated(x, dx)` assume `x` and `dx` have the same type? #1329
Basically, Enzyme is treating [...] Very long-winded discussion here: #1334 |
We also assume congruency (#636), i.e. that the shadow is structurally identical to the primal value. This is not a requirement on the Julia types but on the data layout of the objects. In practice, though, requiring the types to be equal is an easier guardrail than determining full congruency (#637). My favorite example of this is a sparse array. What should its shadow be? It is not enough to require that the shadow is of the same type; it must additionally be congruent ("structurally identical"), so it needs to have the same structural non-zero entries (even though the values stored there may be zero). |
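To make the sparse case concrete, here is a minimal sketch using only the `SparseArrays` standard library. The helper `same_structure` is a hypothetical name for illustration and is not part of Enzyme; it only compares the stored-entry pattern of two matrices.

```julia
using SparseArrays

# Primal: a 3x3 sparse matrix with two stored entries.
A = sparse([1, 3], [1, 2], [2.0, 5.0], 3, 3)

# Same Julia type as A, but a different sparsity pattern: not congruent with A.
B = sparse([2], [3], [0.0], 3, 3)

# A congruent shadow: copy A's structure, then zero the stored values.
dA = copy(A)
fill!(nonzeros(dA), 0.0)

# Hypothetical helper (not an Enzyme function): compare the stored-entry layout.
same_structure(X, Y) =
    size(X) == size(Y) && X.colptr == Y.colptr && rowvals(X) == rowvals(Y)

@show typeof(A) == typeof(B)   # equal types ...
@show same_structure(A, B)     # ... but different structure
@show same_structure(A, dA)    # dA matches A's layout entry for entry
```

Note that `dA` still has two stored entries even though both are numerically zero, which is the "non-zero values that may be zero" situation described above.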
So as a technical detail, the reason the equal-type requirement exists isn't that it is necessarily required internally, but that it is a conservative approximation that prevents a lot of user error. Like @MasonProtter said, the way differentiation works is that for every variable at either register [...]

The precise memory locations accessed will depend on the function being differentiated. For example, if a primal only reads from an array at index 47, the derivative will only read/write to index 47 (and no other indices). Since we can assume that all memory accesses on the primal are valid for the primal input, using an equivalent data structure for the shadow will always be valid (since at most we will access the same memory locations as the primal).

Since a shadow object of the same type as the primal is guaranteed to have the same data layout as the primal, requiring the shadow to be the same Julia data type as the primal is a sufficient (but not necessary) condition for this. However, one can construct inputs with differing data layouts, but these come with safety issues that must be taken more seriously, as well as different semantic meanings.

```julia
function f(ptr)
    x = unsafe_load(ptr, 47)
    x * x
end

ptr = Base.reinterpret(Ptr{Float64}, Libc.malloc(100*sizeof(Float64)))
unsafe_store!(ptr, 3.14, 47)
@show f(ptr)

using Enzyme

# Shadow with the same layout as the primal: 100 zero-initialized Float64s.
dptr = Base.reinterpret(Ptr{Float64}, Libc.calloc(100*sizeof(Float64), 1))
autodiff(Reverse, f, Duplicated(ptr, dptr))
@show unsafe_load(dptr, 47)
# 6.28

# Shadow with a different layout: a single zero-initialized Float64.
dptr = Base.reinterpret(Ptr{Float64}, Libc.calloc(sizeof(Float64), 1))
# offset the pointer so that loading index 47 from it accesses the 0th byte of dptr
# since julia one indexes we subtract 46 * sizeof(Float64) here
autodiff(Reverse, f, Duplicated(ptr, dptr - 46 * sizeof(Float64)))
# element 1 of dptr now represents the derivative of the 47th element of ptr
@show unsafe_load(dptr, 1)
# 6.28
```
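The offset arithmetic in the second half of that example can be checked without Enzyme. The following is a small sketch using only Base pointer operations; the names `buf`, `p`, and `shifted` are illustrative. It relies on the fact that `unsafe_load(p, i)` reads from `p + (i - 1) * sizeof(T)`.

```julia
# A one-element buffer standing in for the single-element shadow.
buf = [6.28]

val = GC.@preserve buf begin
    p = pointer(buf)                      # address of buf[1]
    shifted = p - 46 * sizeof(Float64)    # move back by 46 elements
    # Index 47 from the shifted pointer is (47 - 1) elements forward,
    # which lands exactly on buf[1].
    unsafe_load(shifted, 47)
end

@show val
```

The shifted pointer itself points outside the buffer; only index 47 is valid to access through it, which is why this kind of layout mismatch needs extra care.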
Relatedly, a long discussion of this would make excellent docs. @gdalle, do you want to test your understanding (and our explanation) and open a docs PR on the subject? Obviously we'll help you make sure it is complete and accurate, but having a voice who isn't already deep in the weeds is also useful for making sure it is accessible. |
This is similarly why we presently enforce that views have the same offsets/indices in shadow and primal:
We'll otherwise store derivative data into the same/corresponding byte offset of the shadow (as computed by the primal's offset/indices). If the offsets are different, a user may get derivatives at an unexpected offset. Of course someone who knows what they're doing and really wants that behavior may dislike that check, but in that case they should probably just set the check flag to false. |
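As a rough illustration of the offset issue in plain Julia (no Enzyme involved): the snippet below emulates "store the derivative at the primal's offset" by hand. The helper `same_window` is a hypothetical name, not the check Enzyme actually performs.

```julia
x  = collect(1.0:10.0)
dx = zeros(10)

vx      = view(x, 3:5)    # primal view: elements 3..5 of x
dvx_ok  = view(dx, 3:5)   # shadow view with the same indices
dvx_bad = view(dx, 1:3)   # same type, but a different offset into dx

# Hypothetical helper: do two views cover the same indices of their parents?
same_window(a, b) = parentindices(a) == parentindices(b)

# Emulate a derivative for the first element of vx being stored
# at the offset computed from the primal view's indices.
k = 1
dx[parentindices(vx)[1][k]] += 1.0

@show same_window(vx, dvx_ok) same_window(vx, dvx_bad)
@show dvx_ok[1] dvx_bad[1]
```

Read through `dvx_bad`, the derivative shows up at position 3 rather than position 1, which is the "unexpected offset" described above.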
bump @gdalle would you be interested in summarizing this into docs? |
yeah I'll give it a shot |
Done |
A typical example where this is painful is writing a JVP/VJP into a row/column of a Jacobian. Then `dx` or `x` can be reshaped arrays or views.

cc @adrhill
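For concreteness, a small sketch of the type mismatch in plain Julia, plus one possible workaround that stays within the same-type rule (accumulate into a plain buffer, then copy). The values written into `dx` are placeholders standing in for whatever a VJP would produce.

```julia
J = zeros(3, 4)              # Jacobian with 3 outputs and 4 inputs
x = [1.0, 2.0, 3.0, 4.0]

# Writing a VJP directly into row 2 would use a view as the shadow ...
dx_row = view(J, 2, :)
@show typeof(x)
@show typeof(dx_row)         # a SubArray, not a Vector{Float64}

# ... so a same-type alternative is a plain buffer that is copied afterwards.
dx = zero(x)                 # same type and layout as x
dx .= [0.1, 0.2, 0.3, 0.4]   # placeholder for the values a VJP would accumulate
J[2, :] .= dx
```

The extra copy is the cost being complained about here; it is avoidable only if the shadow is allowed to be a view.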