diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index 25ff11b328..28cb5f8e95 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-06T08:54:14","documenter_version":"1.8.0"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-06T21:09:44","documenter_version":"1.8.0"}}
\ No newline at end of file
diff --git a/dev/api/index.html b/dev/api/index.html
index 9a16a73515..ba9fdfad79 100644
--- a/dev/api/index.html
+++ b/dev/api/index.html
@@ -14,7 +14,7 @@
(var"1" = (var"1" = [0.3, 0.1, 0.2], var"2" = [0.03, 0.01, 0.02]),)
(var"1" = [0.3, 0.1, 0.2],)
(var"1" = [0.0, 1.0, 2.0], var"2" = [0.3, 0.1, 0.2])
-sourceEnzyme.@import_rrule — Macro
import_rrule(::fn, tys...)
Automatically import a ChainRules.rrule as a custom reverse mode EnzymeRule. When called in batch mode, this will end up calling the primal multiple times which results in slower code. This macro assumes that the underlying function to be imported is read-only, and returns a Duplicated or Const object. This macro also assumes that the inputs permit a .+= operation and that the output has a valid Enzyme.make_zero function defined. It also assumes that overwritten(x) accurately describes if there is any non-preserved data from forward to reverse, not just the outermost data structure being overwritten as provided by the specification.
Finally, this macro falls back to almost always caching all of the inputs, even if it may not be needed for the derivative computation.
As a result, this auto importer is also likely to be slower than writing your own rule, and may also be slower than not having a rule at all.
Compute the gradient of an array-input function f using reverse mode, storing the derivative result in an existing array dx. Both x and dx must be Arrays of the same type.
Automatically import a ChainRules.rrule as a custom reverse mode EnzymeRule. When called in batch mode, this will end up calling the primal multiple times which results in slower code. This macro assumes that the underlying function to be imported is read-only, and returns a Duplicated or Const object. This macro also assumes that the inputs permit a .+= operation and that the output has a valid Enzyme.make_zero function defined. It also assumes that overwritten(x) accurately describes if there is any non-preserved data from forward to reverse, not just the outermost data structure being overwritten as provided by the specification.
Finally, this macro falls back to almost always caching all of the inputs, even if it may not be needed for the derivative computation.
As a result, this auto importer is also likely to be slower than writing your own rule, and may also be slower than not having a rule at all.
Compute the gradient of an array-input function f using reverse mode, storing the derivative result in an existing array dx. Both x and dx must be Arrays of the same type.
Compute the gradient of an array-input function f using forward mode. The optional keyword argument shadow is a vector of one-hot vectors of type x which are used to forward-propagate into the return. For performance reasons, this should be computed once, outside the call to gradient, rather than within this call.
Example:
f(x) = x[1]*x[2]
+(derivs = ([3.0, 2.0],), val = 6.0)
Compute the gradient of an array-input function f using forward mode. The optional keyword argument shadow is a vector of one-hot vectors of type x which are used to forward-propagate into the return. For performance reasons, this should be computed once, outside the call to gradient, rather than within this call.
Compute an in-place Hessian-vector product of an array-input scalar-output function f, as evaluated at x times the vector v. The result will be stored into res. The function still allocates and zeros a buffer to store the intermediate gradient, which is not returned to the user.
In other words, compute res .= hessian(f)(x) * v
See hvp_and_gradient! for a function to compute both the hvp and the gradient in a single call.
Compute an in-place Hessian-vector product of an array-input scalar-output function f, as evaluated at x times the vector v. The result will be stored into res. The function still allocates and zeros a buffer to store the intermediate gradient, which is not returned to the user.
In other words, compute res .= hessian(f)(x) * v
See hvp_and_gradient! for a function to compute both the hvp and the gradient in a single call.
Compute the Hessian-vector product of an array-input scalar-output function f, as evaluated at x times the vector v.
In other words, compute hessian(f)(x) * v
See hvp! for a version which stores the result in an existing buffer and also hvp_and_gradient! for a function to compute both the hvp and the gradient in a single call.
Compute the Hessian-vector product of an array-input scalar-output function f, as evaluated at x times the vector v.
In other words, compute hessian(f)(x) * v
See hvp! for a version which stores the result in an existing buffer and also hvp_and_gradient! for a function to compute both the hvp and the gradient in a single call.
hvp_and_gradient!(res::X, grad::X, f::F, x::X, v::X) where {F, X}
Compute an in-place Hessian-vector product of an array-input scalar-output function f, as evaluated at x times the vector v, as well as the gradient, storing the gradient into grad. Both the Hessian-vector product and the gradient can be computed together more efficiently than computing them separately.
The result will be stored into res. The gradient will be stored into grad.
In other words, compute res .= hessian(f)(x) * v and grad .= gradient(Reverse, f)(x)
hvp_and_gradient!(res::X, grad::X, f::F, x::X, v::X) where {F, X}
Compute an in-place Hessian-vector product of an array-input scalar-output function f, as evaluated at x times the vector v, as well as the gradient, storing the gradient into grad. Both the Hessian-vector product and the gradient can be computed together more efficiently than computing them separately.
The result will be stored into res. The gradient will be stored into grad.
In other words, compute res .= hessian(f)(x) * v and grad .= gradient(Reverse, f)(x)
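As an illustration of the three Hessian-vector product entry points described above, here is a minimal sketch. The test function `f`, the inputs `x` and `v`, and the hand-derived Hessian `H` are illustrative choices, not part of the API; the comparison against `H * v` is only a sanity check.

```julia
using Enzyme

# Illustrative scalar-output function of an array input.
f(x) = sin(x[1] * x[2])

x = [2.0, 3.0]
v = [5.0, 2.7]

# Allocating version: returns hessian(f)(x) * v.
Hv = hvp(f, x, v)

# In-place version: writes the product into an existing buffer.
res = similar(x)
hvp!(res, f, x, v)

# Combined version: fills both the product and the gradient in one call.
res2 = similar(x)
grad = similar(x)
hvp_and_gradient!(res2, grad, f, x, v)

# Hand-derived Hessian and gradient of sin(x1 * x2), used only for checking.
p = x[1] * x[2]
H = [-sin(p)*x[2]^2            cos(p) - sin(p)*p;
     cos(p) - sin(p)*p         -sin(p)*x[1]^2]
g = cos(p) .* [x[2], x[1]]

println(Hv ≈ H * v, " ", res ≈ H * v, " ", res2 ≈ H * v, " ", grad ≈ g)
```

The combined call is the one to reach for when an optimizer needs both quantities at the same point, since the description above notes they can be computed together more efficiently than separately.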
Compute the jacobian of an array-output function f using (potentially vector) reverse mode. The chunk argument optionally denotes the chunk size to use and n_outs optionally denotes the shape of the array returned by f (e.g. size(f(x))).
This function will return an AbstractArray whose shape is (size(output)..., size(input)...). No guarantees are presently made about the type of the AbstractArray returned by this function (which may or may not be the same as the input AbstractArray if provided).
In the future, when this function is extended to handle non-array return types, this function will return an AbstractArray of shape size(output) of values of the input type. ```
Auto-differentiate function f at arguments args using forward mode.
args may be numbers, arrays, structs of numbers, structs of arrays and so on. Enzyme will only differentiate with respect to arguments that are wrapped in a Duplicated or similar argument. Unlike reverse mode in autodiff, Active arguments are not allowed here, since all derivative results of immutable objects will be returned; such arguments should instead use Duplicated or variants like DuplicatedNoNeed.
Activity is the Activity of the return value, it may be:
Const if the return is not to be differentiated with respect to
Duplicated, if the return is being differentiated with respect to
BatchDuplicated, like Duplicated, but computing multiple derivatives at once. All batch sizes must be the same for all arguments.
Example returning both original return and derivative:
This function will return an AbstractArray whose shape is (size(output)..., size(input)...). No guarantees are presently made about the type of the AbstractArray returned by this function (which may or may not be the same as the input AbstractArray if provided).
In the future, when this function is extended to handle non-array return types, this function will return an AbstractArray of shape size(output) of values of the input type. ```
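To make the shape rule `(size(output)..., size(input)...)` concrete, here is a small sketch. The function `f` and the input are illustrative. It assumes a version of Enzyme in which `jacobian` returns a tuple with one entry per differentiated argument, so the matrix is taken with `[1]`; if your version returns the array directly, drop the indexing.

```julia
using Enzyme

# Illustrative function: 3 inputs, 2 outputs.
f(x) = [x[1] * x[2], x[2] + x[3]]

x = [2.0, 3.0, 4.0]

# Assumption: the result is a tuple holding one Jacobian per argument.
J = jacobian(Reverse, f, x)[1]

# Output has size (2,), input has size (3,), so J is expected to be 2×3.
println(size(J))
```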
Auto-differentiate function f at arguments args using forward mode.
args may be numbers, arrays, structs of numbers, structs of arrays and so on. Enzyme will only differentiate with respect to arguments that are wrapped in a Duplicated or similar argument. Unlike reverse mode in autodiff, Active arguments are not allowed here, since all derivative results of immutable objects will be returned; such arguments should instead use Duplicated or variants like DuplicatedNoNeed.
Activity is the Activity of the return value, it may be:
Const if the return is not to be differentiated with respect to
Duplicated, if the return is being differentiated with respect to
BatchDuplicated, like Duplicated, but computing multiple derivatives at once. All batch sizes must be the same for all arguments.
Example returning both original return and derivative:
Auto-differentiate function f at arguments args using reverse mode.
Limitations:
f may only return a Real (of a built-in/primitive type) or nothing, not an array, struct, BigFloat, etc. To handle vector-valued return types, use a mutating f! that returns nothing and stores its return value in one of the arguments, which must be wrapped in a Duplicated.
args may be numbers, arrays, structs of numbers, structs of arrays and so on. Enzyme will only differentiate with respect to arguments that are wrapped in an Active (for arguments whose derivative result must be returned rather than mutated in place, such as primitive types and structs thereof) or Duplicated (for mutable arguments like arrays, Refs and structs thereof).
Activity is the Activity of the return value, it may be Const or Active.
Auto-differentiate function f at arguments args using reverse mode.
Limitations:
f may only return a Real (of a built-in/primitive type) or nothing, not an array, struct, BigFloat, etc. To handle vector-valued return types, use a mutating f! that returns nothing and stores its return value in one of the arguments, which must be wrapped in a Duplicated.
args may be numbers, arrays, structs of numbers, structs of arrays and so on. Enzyme will only differentiate with respect to arguments that are wrapped in an Active (for arguments whose derivative result must be returned rather than mutated in place, such as primitive types and structs thereof) or Duplicated (for mutable arguments like arrays, Refs and structs thereof).
Activity is the Activity of the return value, it may be Const or Active.
Example:
a = 4.2
b = [2.2, 3.3]; ∂f_∂b = zero(b)
c = 55; d = 9
@@ -126,13 +126,13 @@
# output
-((6.0,), 9.0)
Note
Enzyme gradients with respect to integer values are zero. Active will automatically convert plain integers to floating point values, but cannot do so for integer values in tuples and structs.
Specialization of autodiff to handle do argument closures.
+((6.0,), 9.0)
Note
Enzyme gradients with respect to integer values are zero. Active will automatically convert plain integers to floating point values, but cannot do so for integer values in tuples and structs.
Same as autodiff(::ForwardMode, f, Activity, args...) but uses deferred compilation to support usage in GPU code, as well as high-order differentiation.
Provide the split forward and reverse pass functions for annotated function type ftype when called with args of type argtypes when using reverse mode.
Activity is the Activity of the return value, it may be Const, Active, or Duplicated (or its variants DuplicatedNoNeed, BatchDuplicated, and BatchDuplicatedNoNeed).
The forward function will return a tape, the primal (or nothing if not requested), and the shadow (or nothing if not a Duplicated variant), and takes the corresponding type arguments provided.
The reverse function will return the derivative of Active arguments, updating the Duplicated arguments in place. The same arguments to the forward pass should be provided, followed by the adjoint of the return (if the return is active), and finally the tape from the forward pass.
Same as autodiff(::ForwardMode, f, Activity, args...) but uses deferred compilation to support usage in GPU code, as well as high-order differentiation.
Provide the split forward and reverse pass functions for annotated function type ftype when called with args of type argtypes when using reverse mode.
Activity is the Activity of the return value, it may be Const, Active, or Duplicated (or its variants DuplicatedNoNeed, BatchDuplicated, and BatchDuplicatedNoNeed).
The forward function will return a tape, the primal (or nothing if not requested), and the shadow (or nothing if not a Duplicated variant), and takes the corresponding type arguments provided.
The reverse function will return the derivative of Active arguments, updating the Duplicated arguments in place. The same arguments to the forward pass should be provided, followed by the adjoint of the return (if the return is active), and finally the tape from the forward pass.
Example:
A = [2.2]; ∂A = zero(A)
v = 3.3
@@ -152,7 +152,7 @@
# output
-(7.26, 2.2, [3.3])
Provide the thunk forward mode function for annotated function type ftype when called with args of type argtypes.
Activity is the Activity of the return value, it may be Const or Duplicated (or its variants DuplicatedNoNeed, BatchDuplicated, and BatchDuplicatedNoNeed).
The forward function will return the primal (if requested) and the shadow (or nothing if not a Duplicated variant).
Example returning both the return derivative and original return:
Provide the thunk forward mode function for annotated function type ftype when called with args of type argtypes.
Activity is the Activity of the return value, it may be Const or Duplicated (or its variants DuplicatedNoNeed, BatchDuplicated, and BatchDuplicatedNoNeed).
The forward function will return the primal (if requested) and the shadow (or nothing if not a Duplicated variant).
Example returning both the return derivative and original return:
a = 4.2
b = [2.2, 3.3]; ∂f_∂b = zero(b)
c = 55; d = 9
@@ -172,7 +172,7 @@
# output
-(6.28,)
Provide the split forward and reverse pass functions for annotated function type ftype when called with args of type argtypes when using reverse mode.
Activity is the Activity of the return value, it may be Const, Active, or Duplicated (or its variants DuplicatedNoNeed, BatchDuplicated, and BatchDuplicatedNoNeed).
The forward function will return a tape, the primal (or nothing if not requested), and the shadow (or nothing if not a Duplicated variant), and takes the corresponding type arguments provided.
The reverse function will return the derivative of Active arguments, updating the Duplicated arguments in place. The same arguments to the forward pass should be provided, followed by the adjoint of the return (if the return is active), and finally the tape from the forward pass.
Provide the split forward and reverse pass functions for annotated function type ftype when called with args of type argtypes when using reverse mode.
Activity is the Activity of the return value, it may be Const, Active, or Duplicated (or its variants DuplicatedNoNeed, BatchDuplicated, and BatchDuplicatedNoNeed).
The forward function will return a tape, the primal (or nothing if not requested), and the shadow (or nothing if not a Duplicated variant), and takes the corresponding type arguments provided.
The reverse function will return the derivative of Active arguments, updating the Duplicated arguments in place. The same arguments to the forward pass should be provided, followed by the adjoint of the return (if the return is active), and finally the tape from the forward pass.
Example:
A = [2.2]; ∂A = zero(A)
v = 3.3
@@ -191,7 +191,7 @@
# output
-(7.26, 2.2, [3.3])
Mark a function argument x of autodiff as active. Enzyme will auto-differentiate with respect to Active arguments.
Note
Enzyme gradients with respect to integer values are zero. Active will automatically convert plain integers to floating point values, but cannot do so for integer values in tuples and structs.
Like Duplicated, except contains several shadows to compute derivatives for all at once. Argument ∂f_∂xs should be a tuple of the several values of type x.
Like DuplicatedNoNeed, except contains several shadows to compute derivatives for all at once. Argument ∂f_∂xs should be a tuple of the several values of type x.
Mark a function argument x of autodiff as duplicated. Enzyme will auto-differentiate with respect to such arguments, with ∂f_∂x acting as an accumulator for gradients (so $\partial f / \partial x$ will be added to ∂f_∂x).
Like Duplicated, except also specifies that Enzyme may avoid computing the original result and only compute the derivative values. This creates opportunities for improved performance.
Mark a function argument x of autodiff as active. Enzyme will auto-differentiate with respect to Active arguments.
Note
Enzyme gradients with respect to integer values are zero. Active will automatically convert plain integers to floating point values, but cannot do so for integer values in tuples and structs.
Like Duplicated, except contains several shadows to compute derivatives for all at once. Argument ∂f_∂xs should be a tuple of the several values of type x.
Like DuplicatedNoNeed, except contains several shadows to compute derivatives for all at once. Argument ∂f_∂xs should be a tuple of the several values of type x.
Mark a function argument x of autodiff as duplicated. Enzyme will auto-differentiate with respect to such arguments, with ∂f_∂x acting as an accumulator for gradients (so $\partial f / \partial x$ will be added to ∂f_∂x).
Like Duplicated, except also specifies that Enzyme may avoid computing the original result and only compute the derivative values. This creates opportunities for improved performance.
function square_byref(out, v)
out[] = v * v
nothing
@@ -203,18 +203,18 @@
dout[]
# output
-0.0
For example, marking the out variable as DuplicatedNoNeed instead of Duplicated allows Enzyme to avoid computing v * v (while still computing its derivative).
This should only be used if x is a write-only variable. Otherwise, if the differentiated function stores values in x and reads them back in subsequent computations, using DuplicatedNoNeed may result in incorrect derivatives. In particular, DuplicatedNoNeed should not be used for preallocated workspace, even if the user might not care about its final value, as marking a variable as NoNeed means that reads from the variable are now undefined.
For example, marking the out variable as DuplicatedNoNeed instead of Duplicated allows Enzyme to avoid computing v * v (while still computing its derivative).
This should only be used if x is a write-only variable. Otherwise, if the differentiated function stores values in x and reads them back in subsequent computations, using DuplicatedNoNeed may result in incorrect derivatives. In particular, DuplicatedNoNeed should not be used for preallocated workspace, even if the user might not care about its final value, as marking a variable as NoNeed means that reads from the variable are now undefined.
The type parameters of ForwardMode are not part of the public API and can change without notice. Please use one of the following concrete instantiations instead:
The type parameters of ForwardMode are not part of the public API and can change without notice. Please use one of the following concrete instantiations instead:
The type parameters of ReverseMode are not part of the public API and can change without notice. Please use one of the following concrete instantiations instead:
The type parameters of ReverseMode are not part of the public API and can change without notice. Please use one of the following concrete instantiations instead:
The type parameters of ReverseModeSplit are not part of the public API and can change without notice. Please use one of the following concrete instantiations instead:
The type parameters of ReverseModeSplit are not part of the public API and can change without notice. Please use one of the following concrete instantiations instead:
Turn a ReverseMode object into a ReverseModeSplit object while preserving as many of the settings as possible. The rest of the settings can be configured with optional positional arguments of Val type.
Returns a GPUCompiler CompilerJob from a backend as specified by the first argument to the function.
For example, in CUDA one would do:
function EnzymeCore.compiler_job_from_backend(::CUDABackend, @nospecialize(F::Type), @nospecialize(TT::Type))
+)
Turn a ReverseMode object into a ReverseModeSplit object while preserving as many of the settings as possible. The rest of the settings can be configured with optional positional arguments of Val type.
Recursively make a zeroed copy of the value prev of type T. The argument copy_if_inactive specifies what to do if the type T is guaranteed to be inactive: use the primal (the default) or still copy the value.
Recursively make a zeroed copy of the value prev of type T. The argument copy_if_inactive specifies what to do if the type T is guaranteed to be inactive: use the primal (the default) or still copy the value.
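A common use of `make_zero` is to build the shadow for a `Duplicated` argument. The sketch below is illustrative: the function `f` and the input `x` are made up for the example, and the shadow is created with `Enzyme.make_zero` before a reverse-mode call accumulates the gradient into it.

```julia
using Enzyme

# Illustrative scalar-output function of an array.
f(x) = sum(abs2, x)

x = [1.0, 2.0, 3.0]

# Zeroed copy with the same structure as x; x itself is left untouched.
dx = Enzyme.make_zero(x)

# Reverse mode accumulates the gradient of f into the shadow dx.
autodiff(Reverse, f, Active, Duplicated(x, dx))

println(dx ≈ 2 .* x)
```

Because `Duplicated` shadows act as accumulators, starting from a freshly zeroed shadow avoids adding new gradients on top of stale values from an earlier call.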
Augment the primal return value of a function with its shadow, as well as any additional information needed to correctly compute the reverse pass, stored in tape.
Unless specified by the config that a variable is not overwritten, rules must assume any arrays/data structures/etc are overwritten between the forward and the reverse pass. Any floats or variables passed by value are always preserved as is (as are the arrays themselves, just not necessarily the values in the array).
Configuration type to dispatch on in custom reverse rules (see augmented_primal and reverse).
NeedsPrimal and NeedsShadow: boolean values specifying whether the primal and shadow (resp.) should be returned.
Width: an integer that specifies the number of adjoints/shadows simultaneously being propagated.
Overwritten: a tuple of booleans of whether each argument (including the function itself) is modified between the forward and reverse pass (true if potentially modified between).
RuntimeActivity: whether runtime activity is enabled.
Getters for the type parameters are provided by needs_primal, needs_shadow, width, overwritten, and runtime_activity.
The primal must be the same type as the original return if needs_primal(config), otherwise nothing.
The shadow must be nothing if needs_shadow(config) is false. If width is 1, the shadow should be the same type as the original return. If the width is greater than 1, the shadow should be NTuple{width, original return}.
The tape can be any type (including Nothing) and is preserved for the reverse call.
Calculate the forward derivative. The first argument is a FwdConfig object describing parameters of the differentiation. The second argument func is the callable to which the rule applies, either wrapped in a Const or, if it is a closure, in a Duplicated. The third argument is the return type annotation, and all other arguments are the annotated function arguments.
Mark a particular function as always being inactive in both its return result and the function call itself, but do not prevent inlining of the function.
Takes gradient of derivative, activity annotation, and tape. If there is an active return dret is passed as Active{T} with the derivative of the active return val. Otherwise dret is passed as Type{Duplicated{T}}, etc.
Augment the primal return value of a function with its shadow, as well as any additional information needed to correctly compute the reverse pass, stored in tape.
Unless specified by the config that a variable is not overwritten, rules must assume any arrays/data structures/etc are overwritten between the forward and the reverse pass. Any floats or variables passed by value are always preserved as is (as are the arrays themselves, just not necessarily the values in the array).
Configuration type to dispatch on in custom reverse rules (see augmented_primal and reverse).
NeedsPrimal and NeedsShadow: boolean values specifying whether the primal and shadow (resp.) should be returned.
Width: an integer that specifies the number of adjoints/shadows simultaneously being propagated.
Overwritten: a tuple of booleans of whether each argument (including the function itself) is modified between the forward and reverse pass (true if potentially modified between).
RuntimeActivity: whether runtime activity is enabled.
Getters for the type parameters are provided by needs_primal, needs_shadow, width, overwritten, and runtime_activity.
The primal must be the same type as the original return if needs_primal(config), otherwise nothing.
The shadow must be nothing if needs_shadow(config) is false. If width is 1, the shadow should be the same type as the original return. If the width is greater than 1, the shadow should be NTuple{width, original return}.
The tape can be any type (including Nothing) and is preserved for the reverse call.
Calculate the forward derivative. The first argument is a FwdConfig object describing parameters of the differentiation. The second argument func is the callable to which the rule applies, either wrapped in a Const or, if it is a closure, in a Duplicated. The third argument is the return type annotation, and all other arguments are the annotated function arguments.
Mark a particular function as always being inactive in both its return result and the function call itself, but do not prevent inlining of the function.
Takes gradient of derivative, activity annotation, and tape. If there is an active return dret is passed as Active{T} with the derivative of the active return val. Otherwise dret is passed as Type{Duplicated{T}}, etc.
This is per Test.@test condition kws... except that if it fails it also prints the msg. If msg == "", then this is just like @test and nothing is printed.
Examples
julia> @test_msg "It is required that the total is under 10" sum(1:1000) < 10;
Test Failed at REPL[1]:1
Expression: sum(1:1000) < 10
Problem: It is required that the total is under 10
@@ -260,7 +260,7 @@
Test Failed at REPL[153]:1
Expression: sum(1:1000) < 10
Evaluated: 500500 < 10
- ERROR: There was an error during testing
Test Enzyme.autodiff of f in Forward-mode against finite differences.
f has all constraints of the same argument passed to Enzyme.autodiff, with additional constraints:
If it mutates one of its arguments, it must return that argument.
Arguments
Activity: the activity of the return value of f
args: Each entry is either an argument to f, an activity type accepted by autodiff, or a tuple of the form (arg, Activity), where Activity is the activity type of arg. If the activity type specified requires a tangent, a random tangent will be automatically generated.
Keywords
rng::AbstractRNG: The random number generator to use for generating random tangents.
fdm=FiniteDifferences.central_fdm(5, 1): The finite differences method to use.
fkwargs: Keyword arguments to pass to f.
rtol: Relative tolerance for isapprox.
atol: Absolute tolerance for isapprox.
testset_name: Name to use for a testset in which all tests are evaluated.
Examples
Here we test a rule for a function of scalars. Because we don't provide an activity annotation for y, it is assumed to be Const.
using Enzyme, EnzymeTestUtils
+ ERROR: There was an error during testing
Test Enzyme.autodiff of f in Forward-mode against finite differences.
f has all constraints of the same argument passed to Enzyme.autodiff, with additional constraints:
If it mutates one of its arguments, it must return that argument.
Arguments
Activity: the activity of the return value of f
args: Each entry is either an argument to f, an activity type accepted by autodiff, or a tuple of the form (arg, Activity), where Activity is the activity type of arg. If the activity type specified requires a tangent, a random tangent will be automatically generated.
Keywords
rng::AbstractRNG: The random number generator to use for generating random tangents.
fdm=FiniteDifferences.central_fdm(5, 1): The finite differences method to use.
fkwargs: Keyword arguments to pass to f.
rtol: Relative tolerance for isapprox.
atol: Absolute tolerance for isapprox.
testset_name: Name to use for a testset in which all tests are evaluated.
Examples
Here we test a rule for a function of scalars. Because we don't provide an activity annotation for y, it is assumed to be Const.
using Enzyme, EnzymeTestUtils
x, y = randn(2)
for Tret in (Const, Duplicated, DuplicatedNoNeed), Tx in (Const, Duplicated)
@@ -272,7 +272,7 @@
Ty in (Const, BatchDuplicated)
test_forward(*, Tret, (x, Tx), (y, Ty))
-end
Test Enzyme.autodiff_thunk of f in ReverseSplitWithPrimal-mode against finite differences.
f has all constraints of the same argument passed to Enzyme.autodiff_thunk, with additional constraints:
If an Array{<:AbstractFloat} appears in the input/output, then a reshaped version of it may not also appear in the input/output.
Arguments
Activity: the activity of the return value of f.
args: Each entry is either an argument to f, an activity type accepted by autodiff, or a tuple of the form (arg, Activity), where Activity is the activity type of arg. If the activity type specified requires a shadow, one will be automatically generated.
Keywords
rng::AbstractRNG: The random number generator to use for generating random tangents.
fdm=FiniteDifferences.central_fdm(5, 1): The finite differences method to use.
fkwargs: Keyword arguments to pass to f.
rtol: Relative tolerance for isapprox.
atol: Absolute tolerance for isapprox.
testset_name: Name to use for a testset in which all tests are evaluated.
Examples
Here we test a rule for a function of scalars. Because we don't provide an activity annotation for y, it is assumed to be Const.
Test Enzyme.autodiff_thunk of f in ReverseSplitWithPrimal-mode against finite differences.
f has all constraints of the same argument passed to Enzyme.autodiff_thunk, with additional constraints:
If an Array{<:AbstractFloat} appears in the input/output, then a reshaped version of it may not also appear in the input/output.
Arguments
Activity: the activity of the return value of f.
args: Each entry is either an argument to f, an activity type accepted by autodiff, or a tuple of the form (arg, Activity), where Activity is the activity type of arg. If the activity type specified requires a shadow, one will be automatically generated.
Keywords
rng::AbstractRNG: The random number generator to use for generating random tangents.
fdm=FiniteDifferences.central_fdm(5, 1): The finite differences method to use.
fkwargs: Keyword arguments to pass to f.
rtol: Relative tolerance for isapprox.
atol: Absolute tolerance for isapprox.
testset_name: Name to use for a testset in which all tests are evaluated.
Examples
Here we test a rule for a function of scalars. Because we don't provide an activity annotation for y, it is assumed to be Const.
using Enzyme, EnzymeTestUtils
x = randn()
y = randn()
@@ -281,4 +281,4 @@
end
Here we test a rule for a function of an array in batch reverse-mode:
x = randn(3)
for Tret in (Const, Active), Tx in (Const, BatchDuplicated)
test_reverse(prod, Tret, (x, Tx))
-end
Whether to inline all (non-recursive) functions generated by Julia within a single compilation unit. This may improve Enzyme's ability to successfully differentiate code and improve performance of the original and generated derivative program. It often, however, comes with an increase in compile time. This is off by default.
Enzyme runs a type analysis to deduce the corresponding types of all values being differentiated. This is necessary to compute correct derivatives of various values. For example, a copy of Float32's requires a different derivative than a memcpy of Float64's, Ptr's, etc. In some cases Enzyme may not be able to deduce all the types necessary and will throw an unknown type error. If this is the case, open an issue. One can silence these errors by setting looseTypeAnalysis!(true), which tells Enzyme to make its best guess. This will remove the error and allow differentiation to continue; however, it may produce incorrect results. Alternatively, one can increase the space of the evaluated type lattice, which gives Enzyme more time to run a more thorough analysis, through the use of maxtypeoffset!.
Enzyme runs a type analysis to deduce the corresponding types of all values being differentiated. This is necessary to compute correct derivatives of various values. To ensure this analysis terminates, it operates on a finite lattice of possible states. This function sets the maximum depth into a type that Enzyme will consider. A smaller value will cause type analysis to run faster, but may result in some necessary types not being found and result in unknown type errors. A larger value may result in unknown type errors being resolved by searching a larger space, but may run longer. The default setting is 6.
Enzyme runs a type analysis to deduce the corresponding types of all values being differentiated. This is necessary to compute correct derivatives of various values. To ensure this analysis terminates, it operates on a finite lattice of possible states. This function sets the maximum offset into a type that Enzyme will consider. A smaller value will cause type analysis to run faster, but may result in some necessary types not being found and result in unknown type errors. A larger value may result in unknown type errors being resolved by searching a larger space, but may run longer. The default setting is 512.
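As a rough illustration of the two workarounds described above, the calls below assume that these settings live in the Enzyme.API module under the names used in the text (looseTypeAnalysis! and maxtypeoffset!) and that each takes the new value as its only argument. The value 1024 is an arbitrary choice for illustration, not a recommendation.

```julia
using Enzyme

# Assumption: the settings are exposed as Enzyme.API.<name>!(value).

# First option: widen the searched type lattice. Per the text above the
# default offset is 512; 1024 is an arbitrary larger value.
Enzyme.API.maxtypeoffset!(1024)

# Last resort if unknown type errors persist: let Enzyme make its best guess.
# As noted above, this may produce incorrect derivatives.
Enzyme.API.looseTypeAnalysis!(true)
```

Changing these settings before the first differentiation of the affected function is presumably the safest ordering, since already compiled derivatives may not be regenerated.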
A debugging option for developers of Enzyme. If one sets this flag prior to the first differentiation of a function, Enzyme will print (to stderr) a log of all decisions made during Activity Analysis (the analysis which determines what values/instructions are differentiated). This may be useful for debugging MixedActivity errors, correctness, and performance errors. Off by default.
A debugging option for developers of Enzyme. If one sets this flag prior to the first differentiation of a function, Enzyme will print (to stderr) the LLVM function being differentiated, as well as all generated derivatives immediately after running Enzyme (but prior to any other optimizations). Off by default.
A debugging option for developers of Enzyme. If one sets this flag prior to the first differentiation of a function, Enzyme will print (to stderr) information about each LLVM value – specifically whether it and its shadow are required for computing the derivative. In contrast to printunnecessary!, this flag prints a debug log for the analysis which determines, for each value and shadow value, whether it can find a user which would require it to be kept around (rather than being deleted). This log is produced prior to any cache optimizations and is a debug log of Differential Use Analysis. This may be helpful for debugging caching, phi node deletion, performance, and other errors. Off by default.
A debugging option for developers of Enzyme. If one sets this flag prior to the first differentiation of a function, Enzyme will print (to stderr) performance information about generated derivative programs. It will provide debug information that warns why particular values are cached for the reverse pass, and thus require additional computation/storage. This is particularly helpful for debugging derivatives which OOM or otherwise run slowly. Off by default.
A debugging option for developers of Enzyme. If one sets this flag prior to the first differentiation of a function, Enzyme will print (to stderr) a log of all decisions made during Type Analysis (the analysis by which Enzyme determines the type of all values in the program). This may be useful for debugging illegal type analysis errors, insufficient type information errors, correctness, and performance errors. Off by default.
A debugging option for developers of Enzyme. If one sets this flag prior to the first differentiation of a function, Enzyme will print (to stderr) information about each LLVM value – specifically whether it and its shadow are required for computing the derivative. In contrast to printdiffuse!, this flag prints the final results after running cache optimizations such as minCut (see Recompute vs Cache Heuristics from this paper and slides 31-33 from this presentation for a description of the caching algorithm). This may be helpful for debugging caching, phi node deletion, performance, and other errors. Off by default.
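The debugging flags above share one usage pattern: set the flag, then trigger the first differentiation. The sketch below assumes the flags are Bool setters in the Enzyme.API module and uses the two names mentioned in the text (printdiffuse! and printunnecessary!). The function square is a hypothetical stand-in used only to trigger differentiation.

```julia
using Enzyme

# Assumption: the debug flags are exposed as Enzyme.API.<name>!(::Bool).
Enzyme.API.printdiffuse!(true)       # log Differential Use Analysis (before cache optimizations)
Enzyme.API.printunnecessary!(true)   # log the final results (after cache optimizations)

# Hypothetical function, present only so that a first differentiation happens
# while the flags are set; the logs are printed to stderr.
square(x) = x * x
autodiff(Reverse, square, Active, Active(3.0))

# Switch the flags back off so later differentiations stay quiet.
Enzyme.API.printdiffuse!(false)
Enzyme.API.printunnecessary!(false)
```

Because the text says the flags must be set prior to the first differentiation of a function, enabling them after square has already been differentiated in the same session would likely print nothing for it.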
Whether Enzyme's type analysis will assume strict aliasing semantics. When strict aliasing semantics are on (the default), Enzyme can propagate type information up through conditional branches. This may lead to illegal type errors when analyzing code with unions. Disabling strict aliasing will enable these union types to be correctly analyzed. However, it may lead to errors stating that sufficient type information cannot be deduced. One can turn these insufficient type information errors into warnings by calling looseTypeAnalysis!(true), which tells Enzyme to use its best guess in such scenarios.
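A minimal sketch of the combination described above. The setter name Enzyme.API.strictAliasing! does not appear in the text and is an assumption here; looseTypeAnalysis! is the name the text itself uses, assumed to live in the same module.

```julia
using Enzyme

# Assumption: Enzyme.API.strictAliasing! is the setter this entry documents.
# Disable strict aliasing so that code with unions can be analyzed.
Enzyme.API.strictAliasing!(false)

# Optional, per the text above: turn the resulting insufficient type
# information errors into warnings by letting Enzyme use its best guess.
Enzyme.API.looseTypeAnalysis!(true)
```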
Whether to treat multiplication by zero as always producing a zero result, even when multiplying against a NaN or infinity. Necessary for some programs in which a value has a zero derivative because it is unused, even though it would otherwise have an infinite or NaN derivative.
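To see why this matters, recall that under ordinary IEEE 754 arithmetic zero times a non-finite value is NaN, so a zero adjoint multiplied by an infinite partial derivative would contaminate the result. The snippet below is plain Julia with no Enzyme calls; strong_zero_mul is a hypothetical helper that only illustrates the semantics described above and is not Enzyme's implementation.

```julia
# Ordinary floating-point multiplication: zero does not win against Inf or NaN.
plain_inf = 0.0 * Inf    # NaN
plain_nan = 0.0 * NaN    # NaN

# Hypothetical helper: a "strong zero" product that returns exactly zero
# whenever either factor is zero, regardless of the other factor.
function strong_zero_mul(a, b)
    T = promote_type(typeof(a), typeof(b))
    return (iszero(a) || iszero(b)) ? zero(T) : T(a * b)
end

strong_inf = strong_zero_mul(0.0, Inf)   # 0.0
strong_nan = strong_zero_mul(0.0, NaN)   # 0.0
regular    = strong_zero_mul(2.0, 3.0)   # 6.0, unchanged for nonzero factors
```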
Whether to print a warning when Type Analysis learns information about a value's type which cannot be represented in the current size of the lattice. See maxtypeoffset! for more information. Off by default.