
Why does Duplicated(x, dx) assume x and dx have the same type? #1329

Closed
gdalle opened this issue Mar 6, 2024 · 8 comments · Fixed by #1343

Comments

gdalle (Contributor) commented Mar 6, 2024

A typical example where this is painful is writing a JVP/VJP into a row/column of a Jacobian. In that case, x or dx may be a reshaped array or a view rather than a plain array.
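A minimal sketch of the pain point (the in-place function `f!` here is hypothetical, not from this thread): to fill a Jacobian column-by-column, one would like the shadow of the output to be a view into the Jacobian, but a view has a different type than the primal output.

```julia
using Enzyme

# Hypothetical in-place function for illustration
f!(y, x) = (y .= x .^ 2; nothing)

x = [1.0, 2.0, 3.0]
y = zeros(3)
J = zeros(3, 3)

dx = [1.0, 0.0, 0.0]   # seed: first basis vector
dy = view(J, :, 1)     # we'd like the JVP written directly into column 1

# Under a strict same-type rule this pairing is rejected, because
# typeof(dy) is a SubArray while typeof(y) is a Vector{Float64}:
# autodiff(Forward, f!, Duplicated(y, dy), Duplicated(x, dx))
```

The workaround is to use a plain `Vector` shadow and copy it into the Jacobian afterwards, at the cost of an extra allocation and copy per column.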

cc @adrhill

MasonProtter commented

Basically, Enzyme treats dx as a Cartesian vector where the i-th field of dx is given by $\mathrm{d}x_{i} = {\partial f / \partial x_{i}}$, so it's useful for dx to be the exact same type. The problem is that the interpretation of this object is not necessarily the same as the interpretation of a derivative.

A very long-winded discussion here: #1334

vchuravy (Member) commented Mar 7, 2024

We also assume congruency (#636), i.e. that the shadow is structurally identical to the primal value. This is not a requirement on the Julia types but on the data layout of the objects; in practice, requiring the types to be equal is an easier guardrail than determining full congruency (#637).

My favorite example of this is a sparse array. What should its shadow be? It is not enough to require that the shadow is of the same type; it must additionally be congruent/"structurally identical", so it needs to have the same stored (non-zero) entries, even though the stored values themselves may be zero.
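A sketch of the distinction, using only the SparseArrays standard library (the `dx_bad` variable name is mine, for illustration):

```julia
using SparseArrays

x = sparse([1.0, 0.0, 2.0])   # stored entries at indices 1 and 3

# A congruent shadow: same sparsity structure as x, stored values zeroed.
dx = copy(x)
fill!(nonzeros(dx), 0.0)

# Same type as x, but NOT congruent: spzeros stores no entries at all,
# so there is no slot in which to accumulate a derivative for x[1] or x[3].
dx_bad = spzeros(3)
```

A same-type check alone would accept `dx_bad`, even though writing a derivative into it would require mutating its sparsity structure mid-differentiation.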

wsmoses (Member) commented Mar 8, 2024

So, as a technical detail, the reason for the equal-type requirement isn't that it is necessarily required internally, but that it is a conservative approximation that prevents a lot of user error.

Like @MasonProtter said, the way differentiation works is that for every variable at register i (for any i) or byte offset i (for any byte offset, or pointer indirection), Enzyme will use the shadow variable at the same register/byte offset as storage for the corresponding derivative.

The precise memory locations accessed will depend on the function being differentiated. For example, if a primal only reads from an array at index 47, the derivative will only read/write to index 47 (and no other indices).

Since we can assume that all memory accesses in the primal are valid for the primal input, using an equivalent data structure for the shadow will always be valid (since at most we will access the same memory locations as the primal). A shadow object of the same type as the primal is guaranteed to have the same data layout, so requiring the shadow to have the same Julia type as the primal is a sufficient (but not necessary) constraint.

However, one can construct inputs with differing data layouts, but these come with safety issues that must be taken more seriously, as well as different semantic meanings.

function f(ptr)
	x = unsafe_load(ptr, 47)
	x * x
end

ptr = Base.reinterpret(Ptr{Float64}, Libc.malloc(100*sizeof(Float64)))
unsafe_store!(ptr, 3.14, 47)

@show f(ptr)

using Enzyme


dptr = Base.reinterpret(Ptr{Float64}, Libc.calloc(100*sizeof(Float64), 1))

autodiff(Reverse, f, Duplicated(ptr, dptr))

@show unsafe_load(dptr, 47)
# 6.28


dptr = Base.reinterpret(Ptr{Float64}, Libc.calloc(sizeof(Float64), 1))

# offset the pointer to have unsafe_load(dptr, 47) access the 0th byte of dptr
# since julia one indexes we subtract 46 * sizeof(Float64) here
autodiff(Reverse, f, Duplicated(ptr, dptr - 46 * sizeof(Float64)))

# represents the derivative of the 47th element of ptr
@show unsafe_load(dptr, 1)

# 6.28

wsmoses (Member) commented Mar 8, 2024

Relatedly, a long discussion of this would make excellent docs. @gdalle, do you want to test your understanding (and our explanation) and open a docs PR on the subject? Obviously we'll help you make sure it is complete/accurate, but having a voice who isn't already knowledgeable about the weeds is useful for making sure it is accessible.

wsmoses (Member) commented Mar 8, 2024

This is similarly why we presently enforce that views have the same offsets/indices in shadow and primal:

@inline function Duplicated(x::T1, dx::T1, check::Bool=true) where {T1 <: SubArray}

We'll otherwise store derivative data into the corresponding byte offset of the shadow (as computed from the primal's offset/indices). If the offsets differ, a user may get derivatives at an unexpected offset. Of course, someone who knows what they're doing and really wants that behavior may dislike the check; in that case they should probably just set the check flag to false.
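A sketch of what the check guards against, using plain Julia views (variable names are mine, for illustration):

```julia
A  = rand(5)
dA = zeros(5)

# Matching offsets: the shadow view covers the same indices of its parent
# as the primal view does, so derivative writes land where expected.
x  = view(A, 2:4)
dx = view(dA, 2:4)

# Mismatched offsets: derivatives for x[i] would land at dA[i], i.e. one
# slot earlier in the parent array than the corresponding primal element.
# Duplicated(view(A, 2:4), view(dA, 1:3))  # rejected unless check=false
```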

wsmoses (Member) commented Mar 13, 2024

Bump: @gdalle, would you be interested in summarizing this into docs?

gdalle (Contributor, Author) commented Mar 13, 2024

yeah I'll give it a shot

gdalle (Contributor, Author) commented Mar 15, 2024

Done
