[docs] clarify section on automatic differentiation in nonlinear.md #3683

Merged 7 commits on Feb 29, 2024
Changes from 4 commits
1 change: 1 addition & 0 deletions docs/Project.toml
@@ -9,6 +9,7 @@ Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterCitations = "daee34ce-89f3-4625-b898-19384cb65244"
Downloads = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
Dualization = "191a621a-6537-11e9-281d-650236a99e60"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
GLPK = "60bf3e95-4087-53dc-ae20-288a0d20c6a6"
HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3"
HiGHS = "87dc4568-4c63-4d18-b0c0-bb2238e4078b"
193 changes: 155 additions & 38 deletions docs/src/manual/nonlinear.md
@@ -417,8 +417,12 @@ using the [`@operator`](@ref) macro.
## [User-defined operators](@id jump_user_defined_operators)

In addition to a standard list of univariate and multivariate operators
recognized by the `MOI.Nonlinear` submodule, JuMP supports user-defined
operators.

Unless [Gradients and Hessians](@ref) are explicitly provided, user-defined
operators must support automatic differentiation. If you encounter an error
adding the operator, see [Common mistakes when writing a user-defined operator](@ref).
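
For example, the following is a minimal sketch of an operator that ForwardDiff can
differentiate; the function `f` and the model are placeholders, not part of the
original text:
```julia
using JuMP

# A generic Julia function: no `Float64` annotations and no external calls,
# so ForwardDiff can propagate its dual numbers through it.
f(x) = x^2 + sin(x)

model = Model()
@variable(model, x)
@operator(model, op_f, 1, f)      # register `f` as a univariate operator
@objective(model, Min, op_f(x))
```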

!!! warning
User-defined operators must return a scalar output. For a work-around, see
@@ -622,53 +626,166 @@ f_splat(x..., y..., z)
@objective(model, Min, op_f(x..., y..., z))
```

### Common mistakes when writing a user-defined operator

When nonlinear expressions are provided algebraically (either via direct input
into the macros or via [Function tracing](@ref)), JuMP computes first- and
second-order derivatives using sparse reverse-mode automatic differentiation.
For details, see [ReverseAD](@ref).

However, because [ReverseAD](@ref) requires the algebraic expression as input,
JuMP cannot use [ReverseAD](@ref) to differentiate user-defined operators.

Instead, unless you provide your own derivatives with [Gradients and Hessians](@ref),
JuMP uses [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl) to
compute derivatives.
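
As a sketch of the difference (the function `f` and the two models below are
placeholders, not from the original text), the first model traces `f` into an
algebraic expression, which JuMP differentiates with [ReverseAD](@ref); the
second registers `f` as an operator, so JuMP differentiates it with ForwardDiff:
```julia
using JuMP

f(x) = exp(x) + x^2

# Function tracing: `f(x)` expands into an algebraic expression,
# so ReverseAD computes its derivatives.
model_traced = Model()
@variable(model_traced, x)
@objective(model_traced, Min, f(x))

# User-defined operator: `op_f` is opaque to JuMP, so its derivatives
# are computed with ForwardDiff instead.
model_op = Model()
@variable(model_op, x)
@operator(model_op, op_f, 1, f)
@objective(model_op, Min, op_f(x))
```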

ForwardDiff has a number of limitations that you should be aware of when
writing user-defined operators. The rest of this section provides debugging
advice and explains some common mistakes.

!!! warning
    Get an error like `No method matching Float64(::ForwardDiff.Dual)`? Read
    this section.

#### Debugging

If you add an operator that does not support ForwardDiff, a long error message
will be printed. You can review the stacktrace for more information, but it can
often be hard to understand why and where your function is failing.

It may be helpful to debug the operator outside of JuMP as follows.

If the operator is univariate, do:
```jldoctest
julia> import ForwardDiff

julia> my_operator(a) = a^2
my_operator (generic function with 1 method)

julia> ForwardDiff.derivative(my_operator, 1.0)
2.0
```

If the operator is multivariate, do:
```jldoctest
julia> import ForwardDiff

julia> my_operator(a, b) = a^2 + b^2
my_operator (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator(x...), [1.0, 2.0])
2-element Vector{Float64}:
2.0
4.0
```
Note that even though the operator takes splatted scalar arguments,
`ForwardDiff.gradient` requires a vector as input.

#### Operator calls something unsupported by ForwardDiff

ForwardDiff works by overloading many Julia functions for a special type
`ForwardDiff.Dual <: Real`. If your operator attempts to call a function for
which an overload has not been defined, a `MethodError` will be thrown.

For example, your operator cannot call an external C function, and it cannot
compute the optimal objective value of an embedded JuMP model, because neither
operation can propagate `ForwardDiff.Dual` numbers.

```jldoctest
julia> import ForwardDiff

julia> my_operator_bad(x) = @ccall sqrt(x::Cdouble)::Cdouble
my_operator_bad (generic function with 1 method)

julia> ForwardDiff.derivative(my_operator_bad, 1.0)
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(my_operator_bad), Float64}, Float64, 1})
[...]
```

Unfortunately, the list of functions that ForwardDiff supports is too long to
enumerate what is and isn't allowed, so the best advice is to try your operator
and see if it works.
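
If your operator really must call something that ForwardDiff cannot handle, one
work-around is to supply the derivatives yourself, as described in
[Gradients and Hessians](@ref). The following sketch does this for the `@ccall`
example above; the hand-written derivatives `∇my_operator` and `∇²my_operator`
are illustrative additions:
```julia
using JuMP

my_operator(x) = @ccall sqrt(x::Cdouble)::Cdouble
∇my_operator(x) = 1 / (2 * sqrt(x))      # hand-written first derivative
∇²my_operator(x) = -1 / (4 * sqrt(x)^3)  # hand-written second derivative

model = Model()
@variable(model, x >= 1)
@operator(model, op_sqrt, 1, my_operator, ∇my_operator, ∇²my_operator)
@objective(model, Min, op_sqrt(x))
```
Because the derivatives are provided explicitly, ForwardDiff is never called on
`my_operator`.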

#### Operator does not accept splatted input

The operator takes `f(x::Vector)` as input, instead of the splatted `f(x...)`.

```jldoctest
julia> import ForwardDiff

julia> my_operator_bad(x::Vector) = sum(x[i]^2 for i in eachindex(x))
my_operator_bad (generic function with 1 method)

julia> my_operator_good(x...) = sum(x[i]^2 for i in eachindex(x))
my_operator_good (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching my_operator_bad(::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2})
[...]

julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
2.0
4.0
```
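
If you have an existing vector-input function that you would rather not rewrite,
a possible work-around is to register a small splatting wrapper. The following
sketch uses placeholder names (`my_operator_vec`, `op_f`):
```julia
using JuMP

# An existing function that expects a vector.
my_operator_vec(x::AbstractVector) = sum(xi^2 for xi in x)

model = Model()
@variable(model, x[1:2])
# The anonymous wrapper collects the splatted scalars back into a vector
# before calling the original function.
@operator(model, op_f, 2, (x...) -> my_operator_vec(collect(x)))
@objective(model, Min, op_f(x...))
```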

#### Operator assumes `Float64` as input

The operator assumes `Float64` will be passed as input, but it must work for any
generic `Real` type.

```jldoctest
julia> import ForwardDiff

julia> my_operator_bad(x::Float64...) = sum(x[i]^2 for i in eachindex(x))
my_operator_bad (generic function with 1 method)

julia> my_operator_good(x::Real...) = sum(x[i]^2 for i in eachindex(x))
my_operator_good (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching my_operator_bad(::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2})
[...]

julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
2.0
4.0
```

#### Operator allocates `Float64` storage

The operator allocates temporary storage using `zeros(3)` or similar. This
defaults to `Float64`, so use `zeros(T, 3)` instead.

```jldoctest
julia> import ForwardDiff

julia> function my_operator_bad(x::Real...)
           # This line is problematic. zeros(n) is short for zeros(Float64, n)
           y = zeros(length(x))
           for i in eachindex(x)
               y[i] = x[i]^2
           end
           return sum(y)
       end
my_operator_bad (generic function with 1 method)

julia> function my_operator_good(x::T...) where {T<:Real}
           y = zeros(T, length(x))
           for i in eachindex(x)
               y[i] = x[i]^2
           end
           return sum(y)
       end
my_operator_good (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#1#2", Float64}, Float64, 2})
[...]

julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
 2.0
 4.0
```
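
An alternative is to avoid allocating temporary storage altogether, so there is
no element type to get wrong; for example (a sketch with a placeholder name):
```jldoctest
julia> import ForwardDiff

julia> my_operator_noalloc(x::Real...) = sum(x[i]^2 for i in eachindex(x))
my_operator_noalloc (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_noalloc(x...), [1.0, 2.0])
2-element Vector{Float64}:
 2.0
 4.0
```
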
2 changes: 1 addition & 1 deletion docs/styles/Vocab/JuMP-Vocab/accept.txt
@@ -39,7 +39,7 @@ preprint
README
recurse
reimplemented
[Ss]tacktrace
subexpression(?s)
subgraph(?s)
subtyping