
Should we simplify NonLinearExpr similar to AffExpr or QuadExpr? #3498

Closed
mitchphillipson opened this issue Sep 11, 2023 · 11 comments
Labels: Category: Nonlinear (Related to nonlinear programming), Type: Feature request

Comments

mitchphillipson commented Sep 11, 2023

Explanation of Issue

I'll start with an example of the issue:

julia> using JuMP;
julia> A = [:a,:b];
julia> B = [:c,:d,:e];
julia> p = JuMP.Containers.DenseAxisArray([1 0 3; 0 5 0], A,B);
julia> σ = JuMP.Containers.DenseAxisArray([.5, .75, .25], B);
julia> M = Model();
julia> @variable(M, x[A,B]);
julia> @expression(M, test[a=A], sum(x[a,b]^2*p[a,b] for b∈B));
julia> @expression(M, test_nl[a=A], sum(x[a,b]^(σ[b])*p[a,b] for b∈B));

The issue is the difference between

julia> test
1-dimensional DenseAxisArray{QuadExpr,1,...} with index sets:
    Dimension 1, [:a, :b]
And data, a 2-element Vector{QuadExpr}:
 x[a,c]² + 3 x[a,e]²
 5 x[b,d]²

and

julia> test_nl
1-dimensional DenseAxisArray{NonlinearExpr,1,...} with index sets:
    Dimension 1, [:a, :b]
And data, a 2-element Vector{NonlinearExpr}:
 ((x[a,c] ^ 0.5) * 1.0) + ((x[a,d] ^ 0.75) * 0.0) + ((x[a,e] ^ 0.25) * 3.0)
 ((x[b,c] ^ 0.5) * 0.0) + ((x[b,d] ^ 0.75) * 5.0) + ((x[b,e] ^ 0.25) * 0.0)

Notice that test and test_nl differ only in the powers used, but test drops every term multiplied by 0 whereas test_nl does not. This goes deeper than just how the expressions are printed, as can be seen here:

julia> test[:b].terms
OrderedCollections.OrderedDict{UnorderedPair{VariableRef}, Float64} with 1 entry:
  UnorderedPair{VariableRef}(x[b,d], x[b,d]) => 5.0
julia> test_nl[:b].args
3-element Vector{Any}:
 (x[b,c] ^ 0.5) * 0.0
 (x[b,d] ^ 0.75) * 5.0
 (x[b,e] ^ 0.25) * 0.0

The zero terms no longer exist in test.

I typically work with large, sparse matrices, meaning there are many zero terms. Dropping these zero terms reduces model complexity and matches what one would expect.

This may also speed up the evaluator. As an extreme example, suppose you have an $n\times n$ diagonal matrix $A$ and a constraint of the form $Ax^m = b$. If $m = 1, 2$ there are $n$ terms to evaluate; otherwise there are currently $n^2$.
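
As a rough illustration of the diagonal case (a sketch following the same pattern as the example above, with a hypothetical 3×3 diagonal matrix D; the term counts reflect the behaviour shown above):

julia> using JuMP, LinearAlgebra

julia> n = 3; D = Diagonal(1:n);  # n×n diagonal matrix: n nonzeros out of n²

julia> model = Model(); @variable(model, x[1:n]);

julia> quad = @expression(model, [i = 1:n], sum(x[j]^2 * D[i, j] for j ∈ 1:n));

julia> length(quad[1].terms)  # zero-coefficient terms are dropped: one term, not n
1

julia> nl = @expression(model, [i = 1:n], sum(x[j]^2.5 * D[i, j] for j ∈ 1:n));

julia> length(nl[1].args)  # every product is kept, including those multiplied by 0
3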

Solution Suggestion

My suggestion to fix this is to implement drop_zeros! for NonlinearExpr; I'm guessing the lack of it is the culprit. This is more complicated for a NonlinearExpr, as it requires recursing through the leaves of the expression tree.
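
A very rough sketch of what that recursion might look like (hypothetical helper functions, not JuMP API; it only drops + 0.0 arguments and products with a zero factor):

using JuMP

# Hypothetical sketch, not part of JuMP: recursively rebuild a NonlinearExpr,
# dropping arguments of + nodes that are zero or are products with a zero factor.
_is_zero_term(x::Real) = iszero(x)
_is_zero_term(ex::NonlinearExpr) = ex.head == :* && any(_is_zero_term, ex.args)
_is_zero_term(::Any) = false

_drop_zeros(x) = x  # variables, numbers, AffExpr, QuadExpr: return unchanged
function _drop_zeros(ex::NonlinearExpr)
    args = Any[_drop_zeros(arg) for arg in ex.args]
    if ex.head == :+
        filter!(arg -> !_is_zero_term(arg), args)
        isempty(args) && return 0.0
        length(args) == 1 && return only(args)
    end
    return NonlinearExpr(ex.head, args)
end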

One compounding issue

In general, NonlinearExpr expressions don't simplify:

julia> M = Model();

julia> @variable(M,x);

julia> sum(x+i for i∈1:4)
4 x + 10

julia> sum(x^5+i for i∈1:4)
((((x ^ 5) + 1.0) + ((x ^ 5) + 2.0)) + ((x ^ 5) + 3.0)) + ((x ^ 5) + 4.0)

This is related to the above but more general. It's something to keep in mind when working on the above, although implementing it may be non-trivial.

*Edit: flatten! gets close to fixing the above:

julia> flatten!(sum(x^5+i for i∈1:4))
(x ^ 5) + (x ^ 5) + (x ^ 5) + (x ^ 5) + 1.0 + 2.0 + 3.0 + 4.0

but not quite perfect.

odow added the Category: Nonlinear and Type: Feature request labels on Sep 11, 2023

odow commented Sep 11, 2023

I was hesitant to make too many changes to the expression graph. For example, which of the following is desired?

julia> flatten!(sum(x^5+i for i∈1:4))
(x ^ 5) + (x ^ 5) + (x ^ 5) + (x ^ 5) + 1.0 + 2.0 + 3.0 + 4.0

julia> flatten!(sum(x^5+i for i∈1:4))
(x ^ 5) + (x ^ 5) + (x ^ 5) + (x ^ 5) + 10.0

julia> flatten!(sum(x^5+i for i∈1:4))
4 * (x ^ 5) + 10.0

But I think it is reasonable to drop any + 0.0 or * 0.0 terms. Other changes...I'm not so sure.

@mitchphillipson (Author)

flatten! does a decent job of simplifying as-is. Obviously, I'd prefer the reverse order of the examples you gave, but I can see how issues would arise.

Dropping zeros would be sufficient, at least for the moment, and if I need more I'll write and propose a simplify function or something similar.


odow commented Sep 12, 2023

We had a discussion here: #3502

One issue that came up is that users might expect us to simplify more than just dropping zeros, but that's a much larger design challenge.

If you are reading this issue and have an example in which simplifying would make a difference, please post it. If we get enough requests, we may revisit this problem.

odow changed the title from "Simplifying new NonLinearExpr similar to AffExpr or QuadExpr" to "Should we simplify NonLinearExpr similar to AffExpr or QuadExpr" on Sep 12, 2023
odow changed the title from "Should we simplify NonLinearExpr similar to AffExpr or QuadExpr" to "Should we simplify NonLinearExpr similar to AffExpr or QuadExpr?" on Sep 13, 2023

odow commented Sep 14, 2023

Discussed with @ccoffrin and @Robbybp. The conclusion is that we shouldn't do this by default, but we should consider it in the context of larger work, such as a nonlinear presolve algorithm.


odow commented Sep 15, 2023

@mcosovic also suggested drop_zeros! in #3510, so I should take another look at this.

@Downsite

Just to weigh in on this:

Currently, I have the problem that (x-0.1)^2 + sin(x) seems to be rewritten as x*x-0.2*x+0.01+sin(x), most likely because a quadratic expression is built first. In the context of global optimization, the way an expression is written can matter greatly because of interval arithmetic. I think it is quite easy to ruin the intended formulation with obvious reformulations, and I would recommend not doing them automatically unless tools to configure or disable such reformulations are also in place.
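
As a rough illustration of that point (this uses IntervalArithmetic.jl, which is independent of JuMP, with interval constants throughout; the bounds in the comments are approximate):

using IntervalArithmetic

X = interval(-1, 1)

# Factored form: the square is known to be nonnegative.
(X - interval(0.1))^2                       # enclosure ≈ [0, 1.21]

# Expanded form: X * X loses the dependency between the two factors,
# so the lower bound is far too pessimistic.
X * X - interval(0.2) * X + interval(0.01)  # enclosure ≈ [-1.19, 1.21]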


odow commented Mar 25, 2024

I think it is quite easy to ruin the intended formulation with obvious reformulations, and I would recommend not doing them automatically unless tools to configure or disable such reformulations are also in place.

Yes, I think this is our conclusion too.

most likely because a quadratic expression is built first

This happens for any quadratic terms, because they get represented as QuadExpr subexpressions instead of NonlinearExpr:

julia> model = Model();

julia> @variable(model, x)
x

julia> @expression(model, sin(x) + (x - 0.1)^2)
sin(x) + (x² - 0.2 x + 0.010000000000000002)

Because of the way that JuMP parses the macros, there is no easy way to override this behavior.

@Downsite

Downsite commented Mar 26, 2024

This happens for any quadratic terms, because they get represented as QuadExpr subexpressions instead of NonlinearExpr: [...]

My current workaround is to add a dummy variable fixed to zero and formulate

 @expression(model, sin(x) + (dummy^3 + x - 0.1)^2)


odow commented May 2, 2024

@Downsite we added @force_nonlinear in #3732, so you can write:

julia> model = Model(); @variable(model, x);

julia> @expression(model, @force_nonlinear sin(x) + (x - 0.1)^2)
sin(x) + ((x - 0.1) ^ 2)

julia> @expression(model, sin(x) + @force_nonlinear((x - 0.1)^2))
sin(x) + ((x - 0.1) ^ 2)

julia> @force_nonlinear @expression(model, sin(x) + (x - 0.1)^2)
sin(x) + ((x - 0.1) ^ 2)

For the main issue, I think our answer to this is "no". Adding simplification rules by default in JuMP just introduces the risk of problems. We want the nonlinear expressions to be passed to the solver pretty much as-is.

In the future, we could consider adding a NonlinearPresolve.Optimizer layer that users could opt in to.
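
If such a layer existed, attaching it would presumably follow the usual MOI layering pattern. A purely hypothetical sketch (neither the NonlinearPresolve package nor its Optimizer exist today; Ipopt is just a placeholder solver):

using JuMP, Ipopt
# using NonlinearPresolve  # hypothetical package

# Hypothetical: wrap the real solver in an opt-in presolve layer that
# simplifies nonlinear expressions before they reach the solver.
model = Model(() -> NonlinearPresolve.Optimizer(Ipopt.Optimizer))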


odow commented May 29, 2024

I'm tempted to close this. I don't know if there's anything actionable left to do here.


odow commented Jun 20, 2024

Closing as won't-fix for now. We can always revisit this in the future, but I think the first step is to develop some sort of MOI nonlinear presolve layer to test ideas. This doesn't need to happen in JuMP.jl, and it doesn't need to be done by one of the core contributors, so if you are reading this comment and are interested, feel free to explore some ideas in a new repository 😄

odow closed this as completed on Jun 20, 2024