From 09a963a17744891a8287aea95af7896cfe04e17f Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Thu, 5 Dec 2024 16:07:09 +0000 Subject: [PATCH] build based on 3abcee1 --- .../dev/.documenter-siteinfo.json | 2 +- DifferentiationInterface/dev/api/index.html | 66 +++++++++--------- .../dev/dev_guide/index.html | 2 +- .../dev/explanation/advanced/index.html | 2 +- .../dev/explanation/backends/index.html | 2 +- .../dev/explanation/faq/index.html | 2 +- .../dev/explanation/operators/index.html | 2 +- DifferentiationInterface/dev/index.html | 2 +- DifferentiationInterface/dev/objects.inv | Bin 2013 -> 2013 bytes .../dev/tutorials/advanced/index.html | 52 +++++++------- .../dev/tutorials/basic/index.html | 56 +++++++-------- 11 files changed, 94 insertions(+), 94 deletions(-) diff --git a/DifferentiationInterface/dev/.documenter-siteinfo.json b/DifferentiationInterface/dev/.documenter-siteinfo.json index fa9029086..985f62572 100644 --- a/DifferentiationInterface/dev/.documenter-siteinfo.json +++ b/DifferentiationInterface/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-05T13:23:14","documenter_version":"1.8.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-05T16:07:02","documenter_version":"1.8.0"}} \ No newline at end of file diff --git a/DifferentiationInterface/dev/api/index.html b/DifferentiationInterface/dev/api/index.html index cc1977fab..ba36b8464 100644 --- a/DifferentiationInterface/dev/api/index.html +++ b/DifferentiationInterface/dev/api/index.html @@ -1,5 +1,5 @@ -API · DifferentiationInterface.jl

API

Argument wrappers

DifferentiationInterface.ConstantType
Constant

Concrete type of Context argument which is kept constant during differentiation.

Note that an operator can be prepared with an arbitrary value of the constant. However, same-point preparation must occur with the exact value that will be reused later.

Example

julia> using DifferentiationInterface
+API · DifferentiationInterface.jl

API

Argument wrappers

DifferentiationInterface.ConstantType
Constant

Concrete type of Context argument which is kept constant during differentiation.

Note that an operator can be prepared with an arbitrary value of the constant. However, same-point preparation must occur with the exact value that will be reused later.

Example

julia> using DifferentiationInterface
 
 julia> import ForwardDiff
 
@@ -13,31 +13,31 @@
 julia> gradient(f, AutoForwardDiff(), [1.0, 2.0], Constant(100))
 2-element Vector{Float64}:
  200.0
- 400.0
source

First order

Pushforward

DifferentiationInterface.prepare_pushforwardFunction
prepare_pushforward(f,     backend, x, tx, [contexts...]) -> prep
-prepare_pushforward(f!, y, backend, x, tx, [contexts...]) -> prep

Create a prep object that can be given to pushforward and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.prepare_pushforward_same_pointFunction
prepare_pushforward_same_point(f,     backend, x, tx, [contexts...]) -> prep_same
-prepare_pushforward_same_point(f!, y, backend, x, tx, [contexts...]) -> prep_same

Create a prep_same object that can be given to pushforward and its variants if they are applied at the same point x and with the same contexts.

Warning

If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.pushforwardFunction
pushforward(f,     [prep,] backend, x, tx, [contexts...]) -> ty
-pushforward(f!, y, [prep,] backend, x, tx, [contexts...]) -> ty

Compute the pushforward of the function f at point x with a tuple of tangents tx.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.

Tip

Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named jvp.

source
DifferentiationInterface.pushforward!Function
pushforward!(f,     dy, [prep,] backend, x, tx, [contexts...]) -> ty
-pushforward!(f!, y, dy, [prep,] backend, x, tx, [contexts...]) -> ty

Compute the pushforward of the function f at point x with a tuple of tangents tx, overwriting ty.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.

Tip

Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named jvp!.

source
DifferentiationInterface.value_and_pushforwardFunction
value_and_pushforward(f,     [prep,] backend, x, tx, [contexts...]) -> (y, ty)
-value_and_pushforward(f!, y, [prep,] backend, x, tx, [contexts...]) -> (y, ty)

Compute the value and the pushforward of the function f at point x with a tuple of tangents tx.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.

Tip

Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named value_and_jvp.

Info

Required primitive for forward mode backends.

source
DifferentiationInterface.value_and_pushforward!Function
value_and_pushforward!(f,     dy, [prep,] backend, x, tx, [contexts...]) -> (y, ty)
-value_and_pushforward!(f!, y, dy, [prep,] backend, x, tx, [contexts...]) -> (y, ty)

Compute the value and the pushforward of the function f at point x with a tuple of tangents tx, overwriting ty.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.

Tip

Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named value_and_jvp!.

source

Pullback

DifferentiationInterface.prepare_pullbackFunction
prepare_pullback(f,     backend, x, ty, [contexts...]) -> prep
-prepare_pullback(f!, y, backend, x, ty, [contexts...]) -> prep

Create a prep object that can be given to pullback and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.prepare_pullback_same_pointFunction
prepare_pullback_same_point(f,     backend, x, ty, [contexts...]) -> prep_same
-prepare_pullback_same_point(f!, y, backend, x, ty, [contexts...]) -> prep_same

Create a prep_same object that can be given to pullback and its variants if they are applied at the same point x and with the same contexts.

Warning

If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.pullbackFunction
pullback(f,     [prep,] backend, x, ty, [contexts...]) -> tx
-pullback(f!, y, [prep,] backend, x, ty, [contexts...]) -> tx

Compute the pullback of the function f at point x with a tuple of tangents ty.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.

Tip

Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named vjp.

source
DifferentiationInterface.pullback!Function
pullback!(f,     dx, [prep,] backend, x, ty, [contexts...]) -> tx
-pullback!(f!, y, dx, [prep,] backend, x, ty, [contexts...]) -> tx

Compute the pullback of the function f at point x with a tuple of tangents ty, overwriting dx.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.

Tip

Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named vjp!.

source
DifferentiationInterface.value_and_pullbackFunction
value_and_pullback(f,     [prep,] backend, x, ty, [contexts...]) -> (y, tx)
-value_and_pullback(f!, y, [prep,] backend, x, ty, [contexts...]) -> (y, tx)

Compute the value and the pullback of the function f at point x with a tuple of tangents ty.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.

Tip

Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named value_and_vjp.

Info

Required primitive for reverse mode backends.

source
DifferentiationInterface.value_and_pullback!Function
value_and_pullback!(f,     dx, [prep,] backend, x, ty, [contexts...]) -> (y, tx)
-value_and_pullback!(f!, y, dx, [prep,] backend, x, ty, [contexts...]) -> (y, tx)

Compute the value and the pullback of the function f at point x with a tuple of tangents ty, overwriting dx.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.

Tip

Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named value_and_vjp!.

source

Derivative

DifferentiationInterface.prepare_derivativeFunction
prepare_derivative(f,     backend, x, [contexts...]) -> prep
-prepare_derivative(f!, y, backend, x, [contexts...]) -> prep

Create a prep object that can be given to derivative and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.derivativeFunction
derivative(f,     [prep,] backend, x, [contexts...]) -> der
-derivative(f!, y, [prep,] backend, x, [contexts...]) -> der

Compute the derivative of the function f at point x.

To improve performance via operator preparation, refer to prepare_derivative.

source
DifferentiationInterface.derivative!Function
derivative!(f,     der, [prep,] backend, x, [contexts...]) -> der
-derivative!(f!, y, der, [prep,] backend, x, [contexts...]) -> der

Compute the derivative of the function f at point x, overwriting der.

To improve performance via operator preparation, refer to prepare_derivative.

source
DifferentiationInterface.value_and_derivativeFunction
value_and_derivative(f,     [prep,] backend, x, [contexts...]) -> (y, der)
-value_and_derivative(f!, y, [prep,] backend, x, [contexts...]) -> (y, der)

Compute the value and the derivative of the function f at point x.

To improve performance via operator preparation, refer to prepare_derivative.

source
DifferentiationInterface.value_and_derivative!Function
value_and_derivative!(f,     der, [prep,] backend, x, [contexts...]) -> (y, der)
-value_and_derivative!(f!, y, der, [prep,] backend, x, [contexts...]) -> (y, der)

Compute the value and the derivative of the function f at point x, overwriting der.

To improve performance via operator preparation, refer to prepare_derivative.

source

Gradient

DifferentiationInterface.prepare_gradientFunction
prepare_gradient(f, backend, x, [contexts...]) -> prep

Create a prep object that can be given to gradient and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.

source

Jacobian

DifferentiationInterface.prepare_jacobianFunction
prepare_jacobian(f,     backend, x, [contexts...]) -> prep
-prepare_jacobian(f!, y, backend, x, [contexts...]) -> prep

Create a prep object that can be given to jacobian and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.jacobianFunction
jacobian(f,     [prep,] backend, x, [contexts...]) -> jac
-jacobian(f!, y, [prep,] backend, x, [contexts...]) -> jac

Compute the Jacobian matrix of the function f at point x.

To improve performance via operator preparation, refer to prepare_jacobian.

source
DifferentiationInterface.jacobian!Function
jacobian!(f,     jac, [prep,] backend, x, [contexts...]) -> jac
-jacobian!(f!, y, jac, [prep,] backend, x, [contexts...]) -> jac

Compute the Jacobian matrix of the function f at point x, overwriting jac.

To improve performance via operator preparation, refer to prepare_jacobian.

source
DifferentiationInterface.value_and_jacobianFunction
value_and_jacobian(f,     [prep,] backend, x, [contexts...]) -> (y, jac)
-value_and_jacobian(f!, y, [prep,] backend, x, [contexts...]) -> (y, jac)

Compute the value and the Jacobian matrix of the function f at point x.

To improve performance via operator preparation, refer to prepare_jacobian.

source
DifferentiationInterface.value_and_jacobian!Function
value_and_jacobian!(f,     jac, [prep,] backend, x, [contexts...]) -> (y, jac)
-value_and_jacobian!(f!, y, jac, [prep,] backend, x, [contexts...]) -> (y, jac)

Compute the value and the Jacobian matrix of the function f at point x, overwriting jac.

To improve performance via operator preparation, refer to prepare_jacobian.

source
DifferentiationInterface.MixedModeType
MixedMode

Combination of a forward and a reverse mode backend for mixed-mode Jacobian computation.

Danger

MixedMode backends only support jacobian and its variants.

Constructor

MixedMode(forward_backend, reverse_backend)
source

Second order

DifferentiationInterface.SecondOrderType
SecondOrder

Combination of two backends for second-order differentiation.

Danger

SecondOrder backends do not support first-order operators.

Constructor

SecondOrder(outer_backend, inner_backend)

Fields

  • outer::AbstractADType: backend for the outer differentiation
  • inner::AbstractADType: backend for the inner differentiation
source

Second derivative

Hessian-vector product

DifferentiationInterface.prepare_hvpFunction
prepare_hvp(f, backend, x, tx, [contexts...]) -> prep

Create a prep object that can be given to hvp and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.

source
DifferentiationInterface.prepare_hvp_same_pointFunction
prepare_hvp_same_point(f, backend, x, tx, [contexts...]) -> prep_same

Create a prep_same object that can be given to hvp and its variants if they are applied at the same point x and with the same contexts.

Warning

If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again.

source

Hessian

DifferentiationInterface.prepare_hessianFunction
prepare_hessian(f, backend, x, [contexts...]) -> prep

Create a prep object that can be given to hessian and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.

source

Utilities

Backend queries

DifferentiationInterface.outerFunction
outer(backend::SecondOrder)
-outer(backend::AbstractADType)

Return the outer backend of a SecondOrder object, tasked with differentiation at the second order.

For any other backend type, this function acts like the identity.

source
DifferentiationInterface.innerFunction
inner(backend::SecondOrder)
-inner(backend::AbstractADType)

Return the inner backend of a SecondOrder object, tasked with differentiation at the first order.

For any other backend type, this function acts like the identity.

source

Backend switch

DifferentiationInterface.DifferentiateWithType
DifferentiateWith

Function wrapper that enforces differentiation with a "substitute" AD backend, possibly different from the "true" AD backend that is called.

For instance, suppose a function f is not differentiable with Zygote because it involves mutation, but you know that it is differentiable with Enzyme. Then f2 = DifferentiateWith(f, AutoEnzyme()) is a new function that behaves like f, except that f2 is differentiable with Zygote (thanks to a chain rule which calls Enzyme under the hood). Moreover, any larger algorithm alg that calls f2 instead of f will also be differentiable with Zygote (as long as f was the only Zygote blocker).

Tip

This is mainly relevant for package developers who want to produce differentiable code at low cost, without writing the differentiation rules themselves. If you sprinkle a few DifferentiateWith in places where some AD backends may struggle, end users can pick from a wider variety of packages to differentiate your algorithms.

Warning

DifferentiateWith only supports out-of-place functions y = f(x) without additional context arguments. It only makes these functions differentiable if the true backend is either ForwardDiff or compatible with ChainRules. For any other true backend, the differentiation behavior is not altered by DifferentiateWith (it becomes a transparent wrapper).

Fields

  • f: the function in question, with signature f(x)
  • backend::AbstractADType: the substitute backend to use for differentiation
Note

For the substitute AD backend to be called under the hood, its package needs to be loaded in addition to the package of the true AD backend.

Constructor

DifferentiateWith(f, backend)

Example

julia> using DifferentiationInterface
+ 400.0
source

First order

Pushforward

DifferentiationInterface.prepare_pushforwardFunction
prepare_pushforward(f,     backend, x, tx, [contexts...]) -> prep
+prepare_pushforward(f!, y, backend, x, tx, [contexts...]) -> prep

Create a prep object that can be given to pushforward and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.prepare_pushforward_same_pointFunction
prepare_pushforward_same_point(f,     backend, x, tx, [contexts...]) -> prep_same
+prepare_pushforward_same_point(f!, y, backend, x, tx, [contexts...]) -> prep_same

Create a prep_same object that can be given to pushforward and its variants if they are applied at the same point x and with the same contexts.

Warning

If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.pushforwardFunction
pushforward(f,     [prep,] backend, x, tx, [contexts...]) -> ty
+pushforward(f!, y, [prep,] backend, x, tx, [contexts...]) -> ty

Compute the pushforward of the function f at point x with a tuple of tangents tx.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.

Tip

Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named jvp.

source
DifferentiationInterface.pushforward!Function
pushforward!(f,     dy, [prep,] backend, x, tx, [contexts...]) -> ty
+pushforward!(f!, y, dy, [prep,] backend, x, tx, [contexts...]) -> ty

Compute the pushforward of the function f at point x with a tuple of tangents tx, overwriting ty.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.

Tip

Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named jvp!.

source
DifferentiationInterface.value_and_pushforwardFunction
value_and_pushforward(f,     [prep,] backend, x, tx, [contexts...]) -> (y, ty)
+value_and_pushforward(f!, y, [prep,] backend, x, tx, [contexts...]) -> (y, ty)

Compute the value and the pushforward of the function f at point x with a tuple of tangents tx.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.

Tip

Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named value_and_jvp.

Info

Required primitive for forward mode backends.

source
DifferentiationInterface.value_and_pushforward!Function
value_and_pushforward!(f,     dy, [prep,] backend, x, tx, [contexts...]) -> (y, ty)
+value_and_pushforward!(f!, y, dy, [prep,] backend, x, tx, [contexts...]) -> (y, ty)

Compute the value and the pushforward of the function f at point x with a tuple of tangents tx, overwriting ty.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.

Tip

Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named value_and_jvp!.

source
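As a minimal sketch of the pushforward API above (the function g, the input x and the tangent tuple tx are invented for illustration, and ForwardDiff.jl is assumed as the backend):

using DifferentiationInterface
import ForwardDiff

g(x) = [sum(abs2, x), prod(x)]   # hypothetical test function
x = [1.0, 2.0, 3.0]
tx = ([0.0, 1.0, 0.0],)          # tuple containing a single tangent

prep = prepare_pushforward(g, AutoForwardDiff(), x, tx)
y, ty = value_and_pushforward(g, prep, AutoForwardDiff(), x, tx)
# ty[1] == [4.0, 3.0], the JVP of g at x along tx[1]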

Pullback

DifferentiationInterface.prepare_pullbackFunction
prepare_pullback(f,     backend, x, ty, [contexts...]) -> prep
+prepare_pullback(f!, y, backend, x, ty, [contexts...]) -> prep

Create a prep object that can be given to pullback and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.prepare_pullback_same_pointFunction
prepare_pullback_same_point(f,     backend, x, ty, [contexts...]) -> prep_same
+prepare_pullback_same_point(f!, y, backend, x, ty, [contexts...]) -> prep_same

Create a prep_same object that can be given to pullback and its variants if they are applied at the same point x and with the same contexts.

Warning

If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.pullbackFunction
pullback(f,     [prep,] backend, x, ty, [contexts...]) -> tx
+pullback(f!, y, [prep,] backend, x, ty, [contexts...]) -> tx

Compute the pullback of the function f at point x with a tuple of tangents ty.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.

Tip

Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named vjp.

source
DifferentiationInterface.pullback!Function
pullback!(f,     dx, [prep,] backend, x, ty, [contexts...]) -> tx
+pullback!(f!, y, dx, [prep,] backend, x, ty, [contexts...]) -> tx

Compute the pullback of the function f at point x with a tuple of tangents ty, overwriting dx.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.

Tip

Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named vjp!.

source
DifferentiationInterface.value_and_pullbackFunction
value_and_pullback(f,     [prep,] backend, x, ty, [contexts...]) -> (y, tx)
+value_and_pullback(f!, y, [prep,] backend, x, ty, [contexts...]) -> (y, tx)

Compute the value and the pullback of the function f at point x with a tuple of tangents ty.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.

Tip

Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named value_and_vjp.

Info

Required primitive for reverse mode backends.

source
DifferentiationInterface.value_and_pullback!Function
value_and_pullback!(f,     dx, [prep,] backend, x, ty, [contexts...]) -> (y, tx)
+value_and_pullback!(f!, y, dx, [prep,] backend, x, ty, [contexts...]) -> (y, tx)

Compute the value and the pullback of the function f at point x with a tuple of tangents ty, overwriting dx.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.

Tip

Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named value_and_vjp!.

source
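A similar hedged sketch for the pullback (VJP) API, this time assuming Zygote.jl as a reverse-mode backend; the function g, the input x and the cotangent tuple ty are again invented:

using DifferentiationInterface
import Zygote

g(x) = [sum(abs2, x), prod(x)]   # hypothetical test function
x = [1.0, 2.0, 3.0]
ty = ([1.0, 0.0],)               # tuple containing one output cotangent

prep = prepare_pullback_same_point(g, AutoZygote(), x, ty)
tx = pullback(g, prep, AutoZygote(), x, ty)
# tx[1] == [2.0, 4.0, 6.0], the VJP of g at x with cotangent ty[1]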

Derivative

DifferentiationInterface.prepare_derivativeFunction
prepare_derivative(f,     backend, x, [contexts...]) -> prep
+prepare_derivative(f!, y, backend, x, [contexts...]) -> prep

Create a prep object that can be given to derivative and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.derivativeFunction
derivative(f,     [prep,] backend, x, [contexts...]) -> der
+derivative(f!, y, [prep,] backend, x, [contexts...]) -> der

Compute the derivative of the function f at point x.

To improve performance via operator preparation, refer to prepare_derivative.

source
DifferentiationInterface.derivative!Function
derivative!(f,     der, [prep,] backend, x, [contexts...]) -> der
+derivative!(f!, y, der, [prep,] backend, x, [contexts...]) -> der

Compute the derivative of the function f at point x, overwriting der.

To improve performance via operator preparation, refer to prepare_derivative.

source
DifferentiationInterface.value_and_derivativeFunction
value_and_derivative(f,     [prep,] backend, x, [contexts...]) -> (y, der)
+value_and_derivative(f!, y, [prep,] backend, x, [contexts...]) -> (y, der)

Compute the value and the derivative of the function f at point x.

To improve performance via operator preparation, refer to prepare_derivative.

source
DifferentiationInterface.value_and_derivative!Function
value_and_derivative!(f,     der, [prep,] backend, x, [contexts...]) -> (y, der)
+value_and_derivative!(f!, y, der, [prep,] backend, x, [contexts...]) -> (y, der)

Compute the value and the derivative of the function f at point x, overwriting der.

To improve performance via operator preparation, refer to prepare_derivative.

source
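For scalar inputs, the derivative operators follow the same pattern. A minimal sketch, assuming ForwardDiff.jl and an invented scalar function h:

using DifferentiationInterface
import ForwardDiff

h(t) = sin(t) * exp(t)           # hypothetical scalar function
prep = prepare_derivative(h, AutoForwardDiff(), 0.5)
y, der = value_and_derivative(h, prep, AutoForwardDiff(), 0.5)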

Gradient

DifferentiationInterface.prepare_gradientFunction
prepare_gradient(f, backend, x, [contexts...]) -> prep

Create a prep object that can be given to gradient and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.

source
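The following sketch shows how preparation combines with a Constant context (see the argument wrappers above); the function fc and its constant are invented, and ForwardDiff.jl is assumed:

using DifferentiationInterface
import ForwardDiff

fc(x, c) = c * sum(abs2, x)      # hypothetical function with a constant
x = [1.0, 2.0]
prep = prepare_gradient(fc, AutoForwardDiff(), zero(x), Constant(1.0))
gradient(fc, prep, AutoForwardDiff(), x, Constant(100.0))  # the constant may change after (non-same-point) preparation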

Jacobian

DifferentiationInterface.prepare_jacobianFunction
prepare_jacobian(f,     backend, x, [contexts...]) -> prep
+prepare_jacobian(f!, y, backend, x, [contexts...]) -> prep

Create a prep object that can be given to jacobian and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. For in-place functions, y is mutated by f! during preparation.

source
DifferentiationInterface.jacobianFunction
jacobian(f,     [prep,] backend, x, [contexts...]) -> jac
+jacobian(f!, y, [prep,] backend, x, [contexts...]) -> jac

Compute the Jacobian matrix of the function f at point x.

To improve performance via operator preparation, refer to prepare_jacobian.

source
DifferentiationInterface.jacobian!Function
jacobian!(f,     jac, [prep,] backend, x, [contexts...]) -> jac
+jacobian!(f!, y, jac, [prep,] backend, x, [contexts...]) -> jac

Compute the Jacobian matrix of the function f at point x, overwriting jac.

To improve performance via operator preparation, refer to prepare_jacobian.

source
DifferentiationInterface.value_and_jacobianFunction
value_and_jacobian(f,     [prep,] backend, x, [contexts...]) -> (y, jac)
+value_and_jacobian(f!, y, [prep,] backend, x, [contexts...]) -> (y, jac)

Compute the value and the Jacobian matrix of the function f at point x.

To improve performance via operator preparation, refer to prepare_jacobian.

source
DifferentiationInterface.value_and_jacobian!Function
value_and_jacobian!(f,     jac, [prep,] backend, x, [contexts...]) -> (y, jac)
+value_and_jacobian!(f!, y, jac, [prep,] backend, x, [contexts...]) -> (y, jac)

Compute the value and the Jacobian matrix of the function f at point x, overwriting jac.

To improve performance via operator preparation, refer to prepare_jacobian.

source
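The in-place variants take the mutated output y between the function and the backend. A hedged sketch with an invented in-place function g!, assuming ForwardDiff.jl:

using DifferentiationInterface
import ForwardDiff

function g!(y, x)                # hypothetical in-place function
    y[1] = sum(abs2, x)
    y[2] = prod(x)
    return nothing
end

x = [1.0, 2.0, 3.0]
y = zeros(2)
prep = prepare_jacobian(g!, y, AutoForwardDiff(), x)
jac = zeros(2, 3)                # pre-allocated Jacobian buffer
value_and_jacobian!(g!, y, jac, prep, AutoForwardDiff(), x)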
DifferentiationInterface.MixedModeType
MixedMode

Combination of a forward and a reverse mode backend for mixed-mode Jacobian computation.

Danger

MixedMode backends only support jacobian and its variants.

Constructor

MixedMode(forward_backend, reverse_backend)
source

Second order

DifferentiationInterface.SecondOrderType
SecondOrder

Combination of two backends for second-order differentiation.

Danger

SecondOrder backends do not support first-order operators.

Constructor

SecondOrder(outer_backend, inner_backend)

Fields

  • outer::AbstractADType: backend for the outer differentiation
  • inner::AbstractADType: backend for the inner differentiation
source
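A common combination is forward-over-reverse. The sketch below assumes both ForwardDiff.jl and Zygote.jl are loaded and uses an invented quadratic function:

using DifferentiationInterface
import ForwardDiff, Zygote

f(x) = sum(abs2, x)
backend = SecondOrder(AutoForwardDiff(), AutoZygote())  # outer: ForwardDiff, inner: Zygote
hessian(f, backend, [1.0, 2.0, 3.0])                    # 3×3 matrix, equal to 2I for this f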

Second derivative

Hessian-vector product

DifferentiationInterface.prepare_hvpFunction
prepare_hvp(f, backend, x, tx, [contexts...]) -> prep

Create a prep object that can be given to hvp and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.

source
DifferentiationInterface.prepare_hvp_same_pointFunction
prepare_hvp_same_point(f, backend, x, tx, [contexts...]) -> prep_same

Create a prep_same object that can be given to hvp and its variants if they are applied at the same point x and with the same contexts.

Warning

If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again.

source
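A hedged sketch of same-point HVP preparation, assuming a single ForwardDiff.jl backend; the function and tangents are invented:

using DifferentiationInterface
import ForwardDiff

f(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0]
tx = ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])   # two tangents handled at once
prep = prepare_hvp_same_point(f, AutoForwardDiff(), x, tx)
hvp(f, prep, AutoForwardDiff(), x, tx)    # Hessian-vector products at the same point x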

Hessian

DifferentiationInterface.prepare_hessianFunction
prepare_hessian(f, backend, x, [contexts...]) -> prep

Create a prep object that can be given to hessian and its variants.

Warning

If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.

source

Utilities

Backend queries

DifferentiationInterface.outerFunction
outer(backend::SecondOrder)
+outer(backend::AbstractADType)

Return the outer backend of a SecondOrder object, tasked with differentiation at the second order.

For any other backend type, this function acts like the identity.

source
DifferentiationInterface.innerFunction
inner(backend::SecondOrder)
+inner(backend::AbstractADType)

Return the inner backend of a SecondOrder object, tasked with differentiation at the first order.

For any other backend type, this function acts like the identity.

source
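A quick illustration of these accessors (assuming ForwardDiff.jl and Zygote.jl; the explicit import guards against these names not being exported):

using DifferentiationInterface
using DifferentiationInterface: outer, inner
import ForwardDiff, Zygote

backend = SecondOrder(AutoForwardDiff(), AutoZygote())
outer(backend)            # AutoForwardDiff()
inner(backend)            # AutoZygote()
outer(AutoZygote())       # acts as the identity for non-SecondOrder backends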

Backend switch

DifferentiationInterface.DifferentiateWithType
DifferentiateWith

Function wrapper that enforces differentiation with a "substitute" AD backend, possibly different from the "true" AD backend that is called.

For instance, suppose a function f is not differentiable with Zygote because it involves mutation, but you know that it is differentiable with Enzyme. Then f2 = DifferentiateWith(f, AutoEnzyme()) is a new function that behaves like f, except that f2 is differentiable with Zygote (thanks to a chain rule which calls Enzyme under the hood). Moreover, any larger algorithm alg that calls f2 instead of f will also be differentiable with Zygote (as long as f was the only Zygote blocker).

Tip

This is mainly relevant for package developers who want to produce differentiable code at low cost, without writing the differentiation rules themselves. If you sprinkle a few DifferentiateWith in places where some AD backends may struggle, end users can pick from a wider variety of packages to differentiate your algorithms.

Warning

DifferentiateWith only supports out-of-place functions y = f(x) without additional context arguments. It only makes these functions differentiable if the true backend is either ForwardDiff or compatible with ChainRules. For any other true backend, the differentiation behavior is not altered by DifferentiateWith (it becomes a transparent wrapper).

Fields

  • f: the function in question, with signature f(x)
  • backend::AbstractADType: the substitute backend to use for differentiation
Note

For the substitute AD backend to be called under the hood, its package needs to be loaded in addition to the package of the true AD backend.

Constructor

DifferentiateWith(f, backend)

Example

julia> using DifferentiationInterface
 
 julia> import FiniteDiff, ForwardDiff, Zygote
 
@@ -62,7 +62,7 @@
 julia> Zygote.gradient(alg, [3.0, 5.0])[1]
 2-element Vector{Float64}:
  42.0
- 70.0
source

Sparsity detection

DifferentiationInterface.DenseSparsityDetectorType
DenseSparsityDetector

Sparsity pattern detector satisfying the detection API of ADTypes.jl.

The nonzeros in a Jacobian or Hessian are detected by computing the relevant matrix with dense AD, and thresholding the entries with a given tolerance (which can be numerically inaccurate). This process can be very slow, and should only be used if its output can be exploited multiple times to compute many sparse matrices.

Danger

In general, the sparsity pattern you obtain can depend on the provided input x. If you want to reuse the pattern, make sure that it is input-agnostic.

Warning

DenseSparsityDetector functionality is now located in a package extension; please load the SparseArrays.jl standard library before using it.

Fields

  • backend::AbstractADType is the dense AD backend used under the hood
  • atol::Float64 is the minimum magnitude of a matrix entry to be considered nonzero

Constructor

DenseSparsityDetector(backend; atol, method=:iterative)

The keyword argument method::Symbol can be either:

  • :iterative: compute the matrix in a sequence of matrix-vector products (memory-efficient)
  • :direct: compute the matrix all at once (memory-hungry but sometimes faster).

Note that the constructor is type-unstable because method ends up being a type parameter of the DenseSparsityDetector object (this is not part of the API and might change).

Examples

using ADTypes, DifferentiationInterface, SparseArrays
+ 70.0
source

Sparsity detection

DifferentiationInterface.DenseSparsityDetectorType
DenseSparsityDetector

Sparsity pattern detector satisfying the detection API of ADTypes.jl.

The nonzeros in a Jacobian or Hessian are detected by computing the relevant matrix with dense AD, and thresholding the entries with a given tolerance (which can be numerically inaccurate). This process can be very slow, and should only be used if its output can be exploited multiple times to compute many sparse matrices.

Danger

In general, the sparsity pattern you obtain can depend on the provided input x. If you want to reuse the pattern, make sure that it is input-agnostic.

Warning

DenseSparsityDetector functionality is now located in a package extension; please load the SparseArrays.jl standard library before using it.

Fields

  • backend::AbstractADType is the dense AD backend used under the hood
  • atol::Float64 is the minimum magnitude of a matrix entry to be considered nonzero

Constructor

DenseSparsityDetector(backend; atol, method=:iterative)

The keyword argument method::Symbol can be either:

  • :iterative: compute the matrix in a sequence of matrix-vector products (memory-efficient)
  • :direct: compute the matrix all at once (memory-hungry but sometimes faster).

Note that the constructor is type-unstable because method ends up being a type parameter of the DenseSparsityDetector object (this is not part of the API and might change).

Examples

using ADTypes, DifferentiationInterface, SparseArrays
 import ForwardDiff
 
 detector = DenseSparsityDetector(AutoForwardDiff(); atol=1e-5, method=:direct)
@@ -85,13 +85,13 @@
 # output
 
 1×2 SparseMatrixCSC{Bool, Int64} with 1 stored entry:
- 1  ⋅
source

Internals

The following is not part of the public API.

DifferentiationInterface.AutoSimpleFiniteDiffType
AutoSimpleFiniteDiff <: ADTypes.AbstractADType

Forward mode backend based on the finite difference (f(x + ε) - f(x)) / ε, with an artificial chunk size to mimic ForwardDiff.

Constructor

AutoSimpleFiniteDiff(ε=1e-5; chunksize=nothing)
source
DifferentiationInterface.BatchSizeSettingsType
BatchSizeSettings{B,singlebatch,aligned}

Configuration for the batch size deduced from a backend and a sample array of length N.

Type parameters

  • B::Int: batch size
  • singlebatch::Bool: whether B == N (B > N is not allowed)
  • aligned::Bool: whether N % B == 0

Fields

  • N::Int: array length
  • A::Int: number of batches A = div(N, B, RoundUp)
  • B_last::Int: size of the last batch (if aligned is false)
source
ADTypes.modeMethod
mode(backend::SecondOrder)

Return the outer mode of the second-order backend.

source
DifferentiationInterface.basisMethod
basis(backend, a::AbstractArray, i)

Construct the i-th standard basis array in the vector space of a with element type eltype(a).

Note

If an AD backend benefits from a more specialized basis array implementation, this function can be extended on the backend type.

source
DifferentiationInterface.multibasisMethod
multibasis(backend, a::AbstractArray, inds::AbstractVector)

Construct the sum of the i-th standard basis arrays in the vector space of a with element type eltype(a), for all i ∈ inds.

Note

If an AD backend benefits from a more specialized basis array implementation, this function can be extended on the backend type.

source
DifferentiationInterface.prepare!_derivativeFunction
prepare!_derivative(f,     prep, backend, x, [contexts...]) -> new_prep
-prepare!_derivative(f!, y, prep, backend, x, [contexts...]) -> new_prep

Same behavior as prepare_derivative but can modify an existing prep object to avoid some allocations.

There is no guarantee that prep will be mutated, or that performance will be improved compared to preparation from scratch.

Danger

For efficiency, this function needs to rely on backend package internals; therefore it is not protected by semantic versioning.

source
DifferentiationInterface.prepare!_gradientFunction
prepare!_gradient(f, prep, backend, x, [contexts...]) -> new_prep

Same behavior as prepare_gradient but can modify an existing prep object to avoid some allocations.

There is no guarantee that prep will be mutated, or that performance will be improved compared to preparation from scratch.

Danger

For efficiency, this function needs to rely on backend package internals; therefore it is not protected by semantic versioning.

source
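As a sketch of reusing an existing prep object for gradients (this is internal API, so the qualified call and its exact behavior are assumptions that may change between versions), assuming ForwardDiff.jl:

using DifferentiationInterface
import ForwardDiff

f(x) = sum(abs2, x)
x = rand(5)
prep = prepare_gradient(f, AutoForwardDiff(), x)
# may or may not reuse the storage inside prep
new_prep = DifferentiationInterface.prepare!_gradient(f, prep, AutoForwardDiff(), x)
gradient(f, new_prep, AutoForwardDiff(), x)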
DifferentiationInterface.prepare!_hessianFunction
prepare!_hessian(f, backend, x, [contexts...]) -> new_prep

Same behavior as prepare_hessian but can modify an existing prep object to avoid some allocations.

There is no guarantee that prep will be mutated, or that performance will be improved compared to preparation from scratch.

Danger

For efficiency, this function needs to rely on backend package internals; therefore it is not protected by semantic versioning.

source
DifferentiationInterface.prepare!_hvpFunction
prepare!_hvp(f, backend, x, tx, [contexts...]) -> new_prep

Same behavior as prepare_hvp but can modify an existing prep object to avoid some allocations.

There is no guarantee that prep will be mutated, or that performance will be improved compared to preparation from scratch.

Danger

For efficiency, this function needs to rely on backend package internals; therefore it is not protected by semantic versioning.

source
DifferentiationInterface.prepare!_jacobianFunction
prepare!_jacobian(f,     prep, backend, x, [contexts...]) -> new_prep
-prepare!_jacobian(f!, y, prep, backend, x, [contexts...]) -> new_prep

Same behavior as prepare_jacobian but can modify an existing prep object to avoid some allocations.

There is no guarantee that prep will be mutated, or that performance will be improved compared to preparation from scratch.

Danger

For efficiency, this function needs to rely on backend package internals; therefore it is not protected by semantic versioning.

source
DifferentiationInterface.prepare!_pullbackFunction
prepare!_pullback(f,     prep, backend, x, ty, [contexts...]) -> new_prep
-prepare!_pullback(f!, y, prep, backend, x, ty, [contexts...]) -> new_prep

Same behavior as prepare_pullback but can modify an existing prep object to avoid some allocations.

There is no guarantee that prep will be mutated, or that performance will be improved compared to preparation from scratch.

Danger

For efficiency, this function needs to rely on backend package internals; therefore it is not protected by semantic versioning.

source
DifferentiationInterface.prepare!_pushforwardFunction
prepare!_pushforward(f,     prep, backend, x, tx, [contexts...]) -> new_prep
-prepare!_pushforward(f!, y, prep, backend, x, tx, [contexts...]) -> new_prep

Same behavior as prepare_pushforward but can modify an existing prep object to avoid some allocations.

There is no guarantee that prep will be mutated, or that performance will be improved compared to preparation from scratch.

Danger

For efficiency, this function needs to rely on backend package internals; therefore it is not protected by semantic versioning.

source
+
diff --git a/DifferentiationInterface/dev/dev_guide/index.html b/DifferentiationInterface/dev/dev_guide/index.html index 5d8b036ac..261f4d4aa 100644 --- a/DifferentiationInterface/dev/dev_guide/index.html +++ b/DifferentiationInterface/dev/dev_guide/index.html @@ -4,4 +4,4 @@ startOnLoad: true, theme: "neutral" }); - + diff --git a/DifferentiationInterface/dev/explanation/advanced/index.html b/DifferentiationInterface/dev/explanation/advanced/index.html index 0bef5af04..5de08e1a8 100644 --- a/DifferentiationInterface/dev/explanation/advanced/index.html +++ b/DifferentiationInterface/dev/explanation/advanced/index.html @@ -5,4 +5,4 @@ startOnLoad: true, theme: "neutral" }); - + diff --git a/DifferentiationInterface/dev/explanation/backends/index.html b/DifferentiationInterface/dev/explanation/backends/index.html index 59d8d0252..ff8ff6b97 100644 --- a/DifferentiationInterface/dev/explanation/backends/index.html +++ b/DifferentiationInterface/dev/explanation/backends/index.html @@ -4,4 +4,4 @@ startOnLoad: true, theme: "neutral" }); - + diff --git a/DifferentiationInterface/dev/explanation/faq/index.html b/DifferentiationInterface/dev/explanation/faq/index.html index badff6c0c..d58bd9e3d 100644 --- a/DifferentiationInterface/dev/explanation/faq/index.html +++ b/DifferentiationInterface/dev/explanation/faq/index.html @@ -4,4 +4,4 @@ startOnLoad: true, theme: "neutral" }); - + diff --git a/DifferentiationInterface/dev/explanation/operators/index.html b/DifferentiationInterface/dev/explanation/operators/index.html index 536961794..4d803b078 100644 --- a/DifferentiationInterface/dev/explanation/operators/index.html +++ b/DifferentiationInterface/dev/explanation/operators/index.html @@ -5,4 +5,4 @@ startOnLoad: true, theme: "neutral" }); - + diff --git a/DifferentiationInterface/dev/index.html b/DifferentiationInterface/dev/index.html index 1d009ad4b..8357edf93 100644 --- a/DifferentiationInterface/dev/index.html +++ b/DifferentiationInterface/dev/index.html @@ -20,4 +20,4 @@ startOnLoad: true, theme: "neutral" }); - + diff --git a/DifferentiationInterface/dev/objects.inv b/DifferentiationInterface/dev/objects.inv index c5f77517ec81ef665d8cfcfdbade0e9d02147515..5ce23aed45d9258e701c45c295d36ee2508e4b1d 100644 GIT binary patch delta 12 Tcmcc1f0utk2&37?&<=J0An*i< delta 12 Tcmcc1f0utk2&3u7&<=J0AnOE( diff --git a/DifferentiationInterface/dev/tutorials/advanced/index.html b/DifferentiationInterface/dev/tutorials/advanced/index.html index 3e42475e0..360543e3b 100644 --- a/DifferentiationInterface/dev/tutorials/advanced/index.html +++ b/DifferentiationInterface/dev/tutorials/advanced/index.html @@ -88,34 +88,34 @@ [2, 4] [5, 7] [6, 8]

Sparsity speedup

When preparation is used, the speedup due to sparsity becomes very visible in large dimensions.

xbig = rand(1000)
jac_prep_dense = prepare_jacobian(f_sparse_vector, dense_first_order_backend, zero(xbig))
-@benchmark jacobian($f_sparse_vector, $jac_prep_dense, $dense_first_order_backend, $xbig)
BenchmarkTools.Trial: 429 samples with 1 evaluation.
- Range (minmax):   4.957 ms203.638 ms   GC (min … max):  9.71% … 97.37%
- Time  (median):      6.099 ms                GC (median):    16.29%
- Time  (mean ± σ):   11.627 ms ±  27.448 ms   GC (mean ± σ):  47.64% ± 18.08%
+@benchmark jacobian($f_sparse_vector, $jac_prep_dense, $dense_first_order_backend, $xbig)
BenchmarkTools.Trial: 504 samples with 1 evaluation.
+ Range (minmax):  4.649 ms167.853 ms   GC (min … max):  9.96% … 96.97%
+ Time  (median):     5.279 ms                GC (median):    16.04%
+ Time  (mean ± σ):   9.895 ms ±  24.193 ms   GC (mean ± σ):  50.19% ± 18.67%
 
-                                                             
-  ▇▄▁▅▅▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▄▁▁▁▁▄▅ ▆
-  4.96 ms       Histogram: log(frequency) by time       182 ms <
+                                                              
+  ▁▁▁▁▄▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆▆ ▆
+  4.65 ms      Histogram: log(frequency) by time       148 ms <
 
  Memory estimate: 57.63 MiB, allocs estimate: 1515.
jac_prep_sparse = prepare_jacobian(f_sparse_vector, sparse_first_order_backend, zero(xbig))
 @benchmark jacobian($f_sparse_vector, $jac_prep_sparse, $sparse_first_order_backend, $xbig)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
- Range (minmax):  22.612 μs 2.669 ms   GC (min … max):  0.00% … 96.69%
- Time  (median):     30.387 μs               GC (median):     0.00%
- Time  (mean ± σ):   38.160 μs ± 80.587 μs   GC (mean ± σ):  13.51% ±  7.15%
+ Range (minmax):  22.211 μs 1.746 ms   GC (min … max):  0.00% … 93.70%
+ Time  (median):     28.203 μs               GC (median):     0.00%
+ Time  (mean ± σ):   32.248 μs ± 53.987 μs   GC (mean ± σ):  11.90% ±  7.25%
 
-   ▅▇█▆▅▂▂▂▂▁▂▂▂▂▁▁                                         ▂
-  ▆█████████████████▇▇▅▅▅▁▅▄▄▃▄▁▄▃▃▄▁▁▁▁▁▁▁▁▁▃▁▁▁▃▁▁▁▁▃▁▁▃█ █
-  22.6 μs      Histogram: log(frequency) by time       136 μs <
+         ▂▄▆▆▄▂▁    ▁▅█▇▆▂▃▁                                   
+  ▁▂▃▄▅▆██████████▇████████▅▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
+  22.2 μs         Histogram: frequency by time        43.4 μs <
 
  Memory estimate: 305.31 KiB, allocs estimate: 27.

Better memory use can be achieved by pre-allocating the matrix from the preparation result (so that it has the correct structure).

jac_buffer = similar(sparsity_pattern(jac_prep_sparse), eltype(xbig))
 @benchmark jacobian!($f_sparse_vector, $jac_buffer, $jac_prep_sparse, $sparse_first_order_backend, $xbig)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
- Range (minmax):  18.915 μs 3.923 ms   GC (min … max):  0.00% … 97.78%
- Time  (median):     26.239 μs               GC (median):     0.00%
- Time  (mean ± σ):   31.886 μs ± 67.951 μs   GC (mean ± σ):  10.17% ±  5.94%
+ Range (minmax):  19.086 μs 1.433 ms   GC (min … max): 0.00% … 95.21%
+ Time  (median):     24.255 μs               GC (median):    0.00%
+ Time  (mean ± σ):   26.712 μs ± 43.284 μs   GC (mean ± σ):  9.16% ±  5.79%
 
-  ▁▄▆▇█▄▂▂▃▃▃▂▃▂▂▁                                          ▂
-  █████████████████▇▆▆▃▃▄▃▃▃▄▄▁▁▄▄▃▁▁▁▁▄▃▁▁▁▃▁▁▃▁▃▁▁▁▁▄▃▅██ █
-  18.9 μs      Histogram: log(frequency) by time       110 μs <
+              ▁▁▁▁▁▆▇█▄                                       
+  ▁▁▁▂▂▃▄▅▇█████████████▅▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
+  19.1 μs         Histogram: frequency by time        36.7 μs <
 
  Memory estimate: 234.75 KiB, allocs estimate: 18.

And for optimal speed, one should write non-allocating and type-stable functions.

function f_sparse_vector!(y::AbstractVector, x::AbstractVector)
     n = length(x)
@@ -130,17 +130,17 @@
 ybig ≈ f_sparse_vector(xbig)
true

In this case, the sparse Jacobian should also become non-allocating (for our specific choice of backend).

jac_prep_sparse_nonallocating = prepare_jacobian(f_sparse_vector!, zero(ybig), sparse_first_order_backend, zero(xbig))
 jac_buffer = similar(sparsity_pattern(jac_prep_sparse_nonallocating), eltype(xbig))
 @benchmark jacobian!($f_sparse_vector!, $ybig, $jac_buffer, $jac_prep_sparse_nonallocating, $sparse_first_order_backend, $xbig)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
- Range (minmax):  14.026 μs46.026 μs   GC (min … max): 0.00% … 0.00%
- Time  (median):     14.257 μs               GC (median):    0.00%
- Time  (mean ± σ):   14.439 μs ±  1.162 μs   GC (mean ± σ):  0.00% ± 0.00%
+ Range (minmax):  13.044 μs 30.848 μs   GC (min … max): 0.00% … 0.00%
+ Time  (median):     13.235 μs                GC (median):    0.00%
+ Time  (mean ± σ):   13.298 μs ± 656.367 ns   GC (mean ± σ):  0.00% ± 0.00%
 
-  ▃▇█▂▁▁▁▁                                                 ▂
-  ██████████▇██▇▇▇▆▆▅▅▅▅▅▄▄▅▇▇▆▇▆▄▅▅▅▆▅▅▅▅▄▄▁▄▁▄▁▁▁▄▁▁▁▅▄▃ █
-  14 μs        Histogram: log(frequency) by time        19 μs <
+     ▂▄█                                                      
+  ▂▂▆███▄▃▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▁▂▂▁▂▂▂▂▂▁▂▂▁▂▂▂▂▂▁▂▁▂▂▂▂▂▂ ▃
+  13 μs           Histogram: frequency by time         15.1 μs <
 
  Memory estimate: 0 bytes, allocs estimate: 0.
+ diff --git a/DifferentiationInterface/dev/tutorials/basic/index.html b/DifferentiationInterface/dev/tutorials/basic/index.html index 64b1d2ea7..d646ac310 100644 --- a/DifferentiationInterface/dev/tutorials/basic/index.html +++ b/DifferentiationInterface/dev/tutorials/basic/index.html @@ -13,14 +13,14 @@ 8.0 10.0

Was that fast? BenchmarkTools.jl helps you answer that question.

using BenchmarkTools
 
-@benchmark gradient($f, $backend, $x)
BenchmarkTools.Trial: 10000 samples with 204 evaluations.
- Range (minmax):  385.770 ns130.601 μs   GC (min … max):  0.00% … 99.29%
- Time  (median):     593.365 ns                GC (median):     0.00%
- Time  (mean ± σ):   587.666 ns ±   2.460 μs   GC (mean ± σ):  10.56% ±  3.22%
+@benchmark gradient($f, $backend, $x)
BenchmarkTools.Trial: 10000 samples with 200 evaluations.
+ Range (minmax):  411.520 ns134.991 μs   GC (min … max): 0.00% … 99.50%
+ Time  (median):     617.832 ns                GC (median):    0.00%
+ Time  (mean ± σ):   614.994 ns ±   2.441 μs   GC (mean ± σ):  9.39% ±  2.93%
 
-    █▁                                                           
-  ▂▆██▅▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▁▂▂▁▂▃██▆▅▆▅▄▄▃▃▃▃▂▂▂▂▂▂▂▂▂ ▃
-  386 ns           Histogram: frequency by time          707 ns <
+   █▆                        ▃▁▂                                
+  ▄██▅▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▃▇███▆▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▂▁▁▂▂▂▂ ▃
+  412 ns           Histogram: frequency by time          874 ns <
 
  Memory estimate: 624 bytes, allocs estimate: 5.

Not bad, but you can do better.

Overwriting a gradient

Since you know how much space your gradient will occupy (the same as your input x), you can pre-allocate that memory and offer it to AD. Some backends get a speed boost from this trick.

grad = similar(x)
 gradient!(f, grad, backend, x)
@@ -29,16 +29,16 @@
   4.0
   6.0
   8.0
- 10.0

The bang indicates that one of the arguments of gradient! might be mutated. More precisely, our convention is that every positional argument between the function and the backend is mutated.

@benchmark gradient!($f, $grad, $backend, $x)
BenchmarkTools.Trial: 10000 samples with 205 evaluations.
- Range (minmax):  375.288 ns114.615 μs   GC (min … max): 0.00% … 99.39%
- Time  (median):     573.215 ns                GC (median):    0.00%
- Time  (mean ± σ):   554.746 ns ±   2.060 μs   GC (mean ± σ):  8.82% ±  3.05%
+ 10.0

The bang indicates that one of the arguments of gradient! might be mutated. More precisely, our convention is that every positional argument between the function and the backend is mutated.

@benchmark gradient!($f, $grad, $backend, $x)
BenchmarkTools.Trial: 10000 samples with 201 evaluations.
+ Range (minmax):  394.119 ns117.838 μs   GC (min … max): 0.00% … 99.43%
+ Time  (median):     592.547 ns                GC (median):    0.00%
+ Time  (mean ± σ):   568.581 ns ±   1.973 μs   GC (mean ± σ):  7.99% ±  3.03%
 
-  ▆█▆▃▂▂▁▁                                ▁▄▅▇▇▆▅▄▃▂▂▁▁▁▁      ▂
-  ██████████▆▆▇▇▆▆▆▆▅▅▅▅▅▅▄▆▆▆▅▅▃▁▁▃▅▃▃▃▁██████████████████▇▇ █
-  375 ns        Histogram: log(frequency) by time        655 ns <
+   ▄█                                          ▃▃▁              
+  ▃███▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▁▂▁▂▁▂▁▄▆███▆▄▃▃▂▂▂▂▂▂▂▂ ▃
+  394 ns           Histogram: frequency by time          661 ns <
 
- Memory estimate: 528 bytes, allocs estimate: 3.

For some reason the in-place version is not much better than your first attempt. However, it makes fewer allocations, thanks to the gradient vector you provided. Don't worry, you can get even more performance.

Preparing for multiple gradients

Internally, ForwardDiff.jl creates some data structures to keep track of things. These objects can be reused between gradient computations, even on different input values. We abstract away the preparation step behind a backend-agnostic syntax:

prep = prepare_gradient(f, backend, zero(x))
DifferentiationInterfaceForwardDiffExt.ForwardDiffGradientPrep{ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(Main.f), Float64}, Float64, 5, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(Main.f), Float64}, Float64, 5}}}}(ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(Main.f), Float64}, Float64, 5, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(Main.f), Float64}, Float64, 5}}}((Partials(1.0, 0.0, 0.0, 0.0, 0.0), Partials(0.0, 1.0, 0.0, 0.0, 0.0), Partials(0.0, 0.0, 1.0, 0.0, 0.0), Partials(0.0, 0.0, 0.0, 1.0, 0.0), Partials(0.0, 0.0, 0.0, 0.0, 1.0)), ForwardDiff.Dual{ForwardDiff.Tag{typeof(Main.f), Float64}, Float64, 5}[Dual{ForwardDiff.Tag{typeof(Main.f), Float64}}(0.0,0.0,0.0,0.0,1.0,6.93094821980567e-310), Dual{ForwardDiff.Tag{typeof(Main.f), Float64}}(2.0,6.93099783620823e-310,1.0,0.0,0.0,0.0), Dual{ForwardDiff.Tag{typeof(Main.f), Float64}}(0.0,0.0,1.0,0.0,0.0,0.0), Dual{ForwardDiff.Tag{typeof(Main.f), Float64}}(0.0,0.0,1.0,0.0,0.0,0.0), Dual{ForwardDiff.Tag{typeof(Main.f), Float64}}(0.0,0.0,1.0,0.0,0.0,0.0)]))

You don't need to know what this object is; you just need to pass it to the gradient operator. Note that preparation does not depend on the actual components of the vector x, just on its type and size. You can thus reuse the prep for different values of the input.

grad = similar(x)
+ Memory estimate: 528 bytes, allocs estimate: 3.

For some reason the in-place version is not much better than your first attempt. However, it makes fewer allocations, thanks to the gradient vector you provided. Don't worry, you can get even more performance.

Preparing for multiple gradients

Internally, ForwardDiff.jl creates some data structures to keep track of things. These objects can be reused between gradient computations, even on different input values. We abstract away the preparation step behind a backend-agnostic syntax:

prep = prepare_gradient(f, backend, zero(x))
DifferentiationInterfaceForwardDiffExt.ForwardDiffGradientPrep{ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(Main.f), Float64}, Float64, 5, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(Main.f), Float64}, Float64, 5}}}}(ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(Main.f), Float64}, Float64, 5, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(Main.f), Float64}, Float64, 5}}}((Partials(1.0, 0.0, 0.0, 0.0, 0.0), Partials(0.0, 1.0, 0.0, 0.0, 0.0), Partials(0.0, 0.0, 1.0, 0.0, 0.0), Partials(0.0, 0.0, 0.0, 1.0, 0.0), Partials(0.0, 0.0, 0.0, 0.0, 1.0)), ForwardDiff.Dual{ForwardDiff.Tag{typeof(Main.f), Float64}, Float64, 5}[Dual{ForwardDiff.Tag{typeof(Main.f), Float64}}(1.0,1.0,0.0,0.0,0.0,0.0), Dual{ForwardDiff.Tag{typeof(Main.f), Float64}}(2.0,0.0,1.0,0.0,0.0,0.0), Dual{ForwardDiff.Tag{typeof(Main.f), Float64}}(3.0,0.0,0.0,1.0,0.0,0.0), Dual{ForwardDiff.Tag{typeof(Main.f), Float64}}(4.0,0.0,0.0,0.0,1.0,0.0), Dual{ForwardDiff.Tag{typeof(Main.f), Float64}}(5.0,0.0,0.0,0.0,0.0,1.0)]))

You don't need to know what this object is; you just need to pass it to the gradient operator. Note that preparation does not depend on the actual components of the vector x, just on its type and size. You can thus reuse the prep for different values of the input.

grad = similar(x)
 gradient!(f, grad, prep, backend, x)
 grad  # has been mutated
5-element Vector{Float64}:
   2.0
@@ -46,13 +46,13 @@
   6.0
   8.0
  10.0

Preparation makes the gradient computation much faster, and (in this case) allocation-free.

@benchmark gradient!($f, $grad, $prep, $backend, $x)
BenchmarkTools.Trial: 10000 samples with 995 evaluations.
- Range (minmax):  28.164 ns98.557 ns   GC (min … max): 0.00% … 0.00%
- Time  (median):     28.808 ns               GC (median):    0.00%
- Time  (mean ± σ):   29.150 ns ±  2.141 ns   GC (mean ± σ):  0.00% ± 0.00%
+ Range (minmax):  27.387 ns47.113 ns   GC (min … max): 0.00% … 0.00%
+ Time  (median):     27.649 ns               GC (median):    0.00%
+ Time  (mean ± σ):   27.772 ns ±  0.838 ns   GC (mean ± σ):  0.00% ± 0.00%
 
-    ▆                                                        
-  ▂▆█▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▁▁▂▁▂▁▁▂▁▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▂
-  28.2 ns         Histogram: frequency by time        39.9 ns <
+     ▅██▆                                                   
+  ▂▅████████▆▅▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▁▁▂▂▁▁▁▂▂▂▁▁▂▂▂▁▂▂▂▂▂ ▃
+  27.4 ns         Histogram: frequency by time        29.8 ns <
 
  Memory estimate: 0 bytes, allocs estimate: 0.

Beware that the prep object is nearly always mutated by differentiation operators, even though it is given as the last positional argument.

Switching backends

The whole point of DifferentiationInterface.jl is that you can easily experiment with different AD solutions. Typically, for gradients, reverse mode AD might be a better fit, so let's try Zygote.jl!

import Zygote
 
@@ -64,17 +64,17 @@
  10.0

And you can run the same benchmarks to see what you gained (although such a small input may not be realistic):

prep2 = prepare_gradient(f, backend2, zero(x))
 
 @benchmark gradient!($f, $grad, $prep2, $backend2, $x)
BenchmarkTools.Trial: 10000 samples with 994 evaluations.
- Range (minmax):  31.981 ns 30.271 μs   GC (min … max):  0.00% … 97.31%
- Time  (median):     58.061 ns                GC (median):     0.00%
- Time  (mean ± σ):   64.280 ns ± 524.258 ns   GC (mean ± σ):  15.97% ±  1.98%
+ Range (minmax):  32.021 ns 28.805 μs   GC (min … max):  0.00% … 99.66%
+ Time  (median):     57.966 ns                GC (median):     0.00%
+ Time  (mean ± σ):   63.048 ns ± 500.220 ns   GC (mean ± σ):  15.62% ±  1.99%
 
-   █▆                                          ▂▁▂              
-  ▆██▇▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▆█▅▃▂▂▁▁▁▂▅▆███▇▄▂▂▂▂▁▁▁▁▁▁ ▂
-  32 ns           Histogram: frequency by time         78.9 ns <
+   ▅█                                                           
+  ▄██▆▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂▂▂▂▂▂▂▅▆█▆▅▄▃▂▂▂▃▃▅▅▇█▇▇▅▃▂▂▂▂ ▃
+  32 ns           Histogram: frequency by time         73.8 ns <
 
  Memory estimate: 96 bytes, allocs estimate: 2.

In short, DifferentiationInterface.jl allows for easy testing and comparison of AD backends. If you want to go further, check out the documentation of DifferentiationInterfaceTest.jl. This related package provides benchmarking utilities to compare backends and help you select the one that is best suited for your problem.

+