[docs] clarify section on automatic differentiation in nonlinear.md #3683

Merged 7 commits on Feb 29, 2024
Changes from 4 commits
1 change: 1 addition & 0 deletions docs/Project.toml
@@ -9,6 +9,7 @@ Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterCitations = "daee34ce-89f3-4625-b898-19384cb65244"
Downloads = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
Dualization = "191a621a-6537-11e9-281d-650236a99e60"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
GLPK = "60bf3e95-4087-53dc-ae20-288a0d20c6a6"
HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3"
HiGHS = "87dc4568-4c63-4d18-b0c0-bb2238e4078b"
193 changes: 155 additions & 38 deletions docs/src/manual/nonlinear.md
@@ -417,8 +417,12 @@ using the [`@operator`](@ref) macro.
## [User-defined operators](@id jump_user_defined_operators)

In addition to a standard list of univariate and multivariate operators
recognized by the `MOI.Nonlinear` submodule, JuMP supports user-defined
operators.

Unless [Gradients and Hessians](@ref) are explicitly provided, user-defined
operators must support automatic differentiation. If you encounter an error
adding the operator, see [Common mistakes when writing a user-defined operator](@ref).
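
For example, the following is a minimal sketch of an operator that ForwardDiff can
differentiate; the function `f` and the model are placeholders, not part of the
original text:
```julia
using JuMP

# A generic Julia function: no `Float64` annotations and no external calls,
# so ForwardDiff can propagate its dual numbers through it.
f(x) = x^2 + sin(x)

model = Model()
@variable(model, x)
@operator(model, op_f, 1, f)      # register `f` as a univariate operator
@objective(model, Min, op_f(x))
```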

!!! warning
User-defined operators must return a scalar output. For a work-around, see
@@ -622,53 +626,166 @@ f_splat(x..., y..., z)
@objective(model, Min, op_f(x..., y..., z))
```

### Common mistakes when writing a user-defined operator

When nonlinear expressions are provided algebraically (either via direct input
into the macros or via [Function tracing](@ref)), JuMP computes first- and
second-order derivatives using sparse reverse-mode automatic differentiation.
For details, see [ReverseAD](@ref).

However, because [ReverseAD](@ref) requires the algebraic expression as input,
JuMP cannot use [ReverseAD](@ref) to differentiate user-defined operators.

Instead, unless you provide your own derivatives with [Gradients and Hessians](@ref),
JuMP uses [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl) to
compute derivatives.
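
As a sketch of the difference (the function `f` and the two models below are
placeholders, not from the original text), the first model traces `f` into an
algebraic expression, which JuMP differentiates with [ReverseAD](@ref); the
second registers `f` as an operator, so JuMP differentiates it with ForwardDiff:
```julia
using JuMP

f(x) = exp(x) + x^2

# Function tracing: `f(x)` expands into an algebraic expression,
# so ReverseAD computes its derivatives.
model_traced = Model()
@variable(model_traced, x)
@objective(model_traced, Min, f(x))

# User-defined operator: `op_f` is opaque to JuMP, so its derivatives
# are computed with ForwardDiff instead.
model_op = Model()
@variable(model_op, x)
@operator(model_op, op_f, 1, f)
@objective(model_op, Min, op_f(x))
```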

ForwardDiff has a number of limitations that you should be aware of when
writing user-defined operators. The rest of this section provides debugging
advice and explains some common mistakes.

!!! warning
    Get an error like `No method matching Float64(::ForwardDiff.Dual)`? Read
    this section.

#### Debugging

If you add an operator that does not support ForwardDiff, a long error message
will be printed. You can review the stacktrace for more information, but it can
often be hard to understand why and where your function is failing.

It may be helpful to debug the operator outside of JuMP as follows.

If the operator is univariate, do:
```jldoctest
julia> import ForwardDiff

julia> my_operator(a) = a^2
my_operator (generic function with 1 method)

julia> ForwardDiff.derivative(my_operator, 1.0)
2.0
```

If the operator is multivariate, do:
```jldoctest
julia> import ForwardDiff

julia> my_operator(a, b) = a^2 + b^2
my_operator (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator(x...), [1.0, 2.0])
2-element Vector{Float64}:
2.0
4.0
```
Note that even though the operator takes splatted scalar arguments,
`ForwardDiff.gradient` requires a vector as input.

#### Operator calls something unsupported by ForwardDiff

ForwardDiff works by overloading many Julia functions for a special type
`ForwardDiff.Dual <: Real`. If your operator attempts to call a function for
which an overload has not been defined, a `MethodError` will be thrown.

For example, your operator cannot call an external C function, and it cannot
compute the optimal objective value of an embedded JuMP model, because neither
operation can propagate `ForwardDiff.Dual` numbers.

```jldoctest
julia> import ForwardDiff

julia> my_operator_bad(x) = @ccall sqrt(x::Cdouble)::Cdouble
my_operator_bad (generic function with 1 method)

julia> ForwardDiff.derivative(my_operator_bad, 1.0)
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(my_operator_bad), Float64}, Float64, 1})
[...]
```

Unfortunately, the list of functions that ForwardDiff supports is too long to
enumerate what is and isn't allowed, so the best advice is to try your operator
and see if it works.
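
If your operator really must call something that ForwardDiff cannot handle, one
work-around is to supply the derivatives yourself, as described in
[Gradients and Hessians](@ref). The following sketch does this for the `@ccall`
example above; the hand-written derivatives `∇my_operator` and `∇²my_operator`
are illustrative additions:
```julia
using JuMP

my_operator(x) = @ccall sqrt(x::Cdouble)::Cdouble
∇my_operator(x) = 1 / (2 * sqrt(x))      # hand-written first derivative
∇²my_operator(x) = -1 / (4 * sqrt(x)^3)  # hand-written second derivative

model = Model()
@variable(model, x >= 1)
@operator(model, op_sqrt, 1, my_operator, ∇my_operator, ∇²my_operator)
@objective(model, Min, op_sqrt(x))
```
Because the derivatives are provided explicitly, ForwardDiff is never called on
`my_operator`.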

#### Operator does not accept splatted input

The operator takes `f(x::Vector)` as input, instead of the splatted `f(x...)`.

```jldoctest
julia> import ForwardDiff

julia> my_operator_bad(x::Vector) = sum(x[i]^2 for i in eachindex(x))
my_operator_bad (generic function with 1 method)

julia> my_operator_good(x...) = sum(x[i]^2 for i in eachindex(x))
my_operator_good (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching my_operator_bad(::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2})
[...]

julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
2.0
4.0
```
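
If you have an existing vector-input function that you would rather not rewrite,
a possible work-around is to register a small splatting wrapper. The following
sketch uses placeholder names (`my_operator_vec`, `op_f`):
```julia
using JuMP

# An existing function that expects a vector.
my_operator_vec(x::AbstractVector) = sum(xi^2 for xi in x)

model = Model()
@variable(model, x[1:2])
# The anonymous wrapper collects the splatted scalars back into a vector
# before calling the original function.
@operator(model, op_f, 2, (x...) -> my_operator_vec(collect(x)))
@objective(model, Min, op_f(x...))
```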

#### Operator assumes `Float64` as input

The operator assumes `Float64` will be passed as input, but it must work for any
generic `Real` type.

```jldoctest
julia> import ForwardDiff

julia> my_operator_bad(x::Float64...) = sum(x[i]^2 for i in eachindex(x))
my_operator_bad (generic function with 1 method)

julia> my_operator_good(x::Real...) = sum(x[i]^2 for i in eachindex(x))
my_operator_good (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching my_operator_bad(::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2})
[...]

julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
2.0
4.0
```

#### Operator allocates `Float64` storage

The operator allocates temporary storage using `zeros(3)` or similar. This
defaults to `Float64`, so use `zeros(T, 3)` instead.

```jldoctest
julia> import ForwardDiff

julia> function my_operator_bad(x::Real...)
           # This line is problematic. zeros(n) is short for zeros(Float64, n)
           y = zeros(length(x))
           for i in eachindex(x)
               y[i] = x[i]^2
           end
           return sum(y)
       end
my_operator_bad (generic function with 1 method)

julia> function my_operator_good(x::T...) where {T<:Real}
           y = zeros(T, length(x))
           for i in eachindex(x)
               y[i] = x[i]^2
           end
           return sum(y)
       end
my_operator_good (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#1#2", Float64}, Float64, 2})
[...]

julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
 2.0
 4.0
```
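
An alternative is to avoid allocating temporary storage altogether, so there is
no element type to get wrong; for example (a sketch with a placeholder name):
```jldoctest
julia> import ForwardDiff

julia> my_operator_noalloc(x::Real...) = sum(x[i]^2 for i in eachindex(x))
my_operator_noalloc (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_noalloc(x...), [1.0, 2.0])
2-element Vector{Float64}:
 2.0
 4.0
```
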
2 changes: 1 addition & 1 deletion docs/styles/Vocab/JuMP-Vocab/accept.txt
@@ -39,7 +39,7 @@ preprint
README
recurse
reimplemented
[Ss]tacktrace
subexpression(?s)
subgraph(?s)
subtyping