Unused variables are added to Direct solvers #1490

Open
ruaridhw opened this issue Jun 8, 2020 · 11 comments

Comments

@ruaridhw
Contributor

ruaridhw commented Jun 8, 2020

In DirectOrPersistentSolver._add_block() all variables are looped over, added to the model, and then _referenced_variables is only computed after the fact as part of adding the constraints etc.

How does one deactivate variables? The only suggestion I can find is to fix them; however, they are then still added to the model unnecessarily. For very large models that are solved iteratively in a decomposition routine, this is very slow. In contrast, the (CPLEX) LP solver is very fast for models where only a small fraction of variables are relevant, since it checks against _referenced_variable_ids.

If we cannot deactivate variables, how would we get around this? The only thing I can think of is a design where all the variable data to be sent to the Direct solver is cached until all the constraints, SOS, and objective(s) have also been computed, the variables are then filtered according to _referenced_variables and added, and finally the constraints, SOS, and objective(s) are added. This would follow on from the deferred logic of #1416.
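The deferred design described above can be sketched in plain Python. This is only an illustration of the ordering (cache, filter, then add), not Pyomo's code: `Var`, `ToySolverModel`, and `add_block_deferred` are hypothetical stand-ins, and no real solver API is called.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Var:
    """Hypothetical stand-in for a modeling-layer variable."""
    name: str


@dataclass
class ToySolverModel:
    """Hypothetical stand-in for a solver-side model (not the CPLEX API)."""
    variables: list = field(default_factory=list)
    constraints: list = field(default_factory=list)


def add_block_deferred(all_vars, constraints, objective_vars):
    """Build a solver model that contains only referenced variables.

    `constraints` maps a constraint name to the variables it uses.
    """
    # Step 1: cache the variables instead of sending them to the solver.
    pending = list(all_vars)

    # Step 2: work out which variables are actually referenced.
    referenced = set(objective_vars)
    for used in constraints.values():
        referenced.update(used)

    # Step 3: add the filtered variables, preserving declaration order.
    model = ToySolverModel()
    model.variables = [v for v in pending if v in referenced]

    # Step 4: add the constraints now that their variables exist.
    for name, used in constraints.items():
        model.constraints.append((name, tuple(v.name for v in used)))
    return model


x = [Var(f"x{i}") for i in range(5)]
model = add_block_deferred(
    all_vars=x,
    constraints={"c1": [x[0], x[2]]},
    objective_vars=[x[2], x[4]],
)
print([v.name for v in model.variables])  # x1 and x3 are never added
```

In this toy run, `x1` and `x3` appear in neither the constraint nor the objective, so they never reach the solver-side model.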

@ghackebeil
Member

This discussion is likely going to open up a big can of worms. In short, I have added the ability to deactivate variables to pyomo.kernel. I hope this can be pushed to the pyomo.environ interface to enable the kinds of performance improvements you mentioned.

@michaelbynum
Contributor

A couple of comments.

  1. The proposed approach could (I have not tested it, so I don't know for sure) slow down the addition of new constraints since each time add_constraint is called, you now have to ensure all of the variables used in the constraint are already in the CPLEX model.

  2. I would find it slightly strange to have a model where a large portion of the variables are not used in constraints or the objective. Do you have an example application or algorithm where this is the case? In most cases, it should be relatively easy to control which variables go to the solver by controlling which blocks are active.

@michaelbynum
Contributor

@ghackebeil makes a good point.

@ruaridhw
Contributor Author

ruaridhw commented Jun 8, 2020

> The proposed approach could (I have not tested it, so I don't know for sure) slow down the addition of new constraints since each time add_constraint is called, you now have to ensure all of the variables used in the constraint are already in the CPLEX model.

I think this would be pretty fast considering you have access to _referenced_variables inside add_constraint() which is the Set of all of the VarData already added. It would be a quick set diff to compare the new variables to that.
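A minimal sketch of that set difference, assuming a toy solver wrapper. `ToyIncrementalSolver` and its attributes are hypothetical and only mimic the idea of keeping a set of already-added variables; this is not the actual DirectOrPersistentSolver code.

```python
class ToyIncrementalSolver:
    """Hypothetical wrapper that adds variables lazily, on first use."""

    def __init__(self):
        self.referenced_variables = set()  # variables already in the solver
        self.solver_vars = []              # stand-in for solver-side columns
        self.solver_cons = []              # stand-in for solver-side rows

    def add_constraint(self, name, variables):
        # Set difference: which of this constraint's variables are new?
        missing = set(variables) - self.referenced_variables

        # Add the new ones in the order they appear in the constraint.
        for var in variables:
            if var in missing:
                self.solver_vars.append(var)
                missing.remove(var)  # guards against duplicates in the input

        self.referenced_variables.update(variables)
        self.solver_cons.append((name, tuple(variables)))


solver = ToyIncrementalSolver()
solver.add_constraint("c1", ["x", "y"])
solver.add_constraint("c2", ["y", "z"])  # only "z" is new here
print(solver.solver_vars)
```

The cost per call is proportional to the number of variables in the constraint being added, since membership tests against a set take constant time on average.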

> I would find it slightly strange to have a model where a large portion of the variables are not used in constraints or the objective. Do you have an example application or algorithm where this is the case? In most cases, it should be relatively easy to control which variables go to the solver by controlling which blocks are active.

As an example, in the Job Shop scheduling problem it is typical to decompose the problem by "machine" even though many "job"s can go to multiple machines. This means the vast majority of the allocation variables of the jobs you are solving for are irrelevant (as they are for other machines).
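To put rough numbers on this, here is a small sketch with made-up sizes (100 jobs, 10 machines; both figures are assumptions chosen for illustration). It only counts allocation variables and does not build a real scheduling model.

```python
from itertools import product

# Hypothetical problem size, chosen only for illustration.
jobs = [f"j{i}" for i in range(100)]
machines = [f"m{k}" for k in range(10)]

# One allocation variable per (job, machine) pair in the full model.
alloc = list(product(jobs, machines))

# A subproblem decomposed by machine only references its own column.
target = "m3"
relevant = [(j, m) for j, m in alloc if m == target]

fraction = len(relevant) / len(alloc)
print(len(alloc), len(relevant), fraction)
```

With these sizes, each machine subproblem references one tenth of the allocation variables, yet all of them would be sent to the solver if unused variables are not filtered out.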

Thanks, @ghackebeil! That's great to know.

@michaelbynum
Contributor

@ruaridhw You are probably right. The set difference is probably negligible, in which case I would be fine with the change you proposed.

@ruaridhw
Contributor Author

ruaridhw commented Jun 8, 2020

You're correct that it will be (marginally) slower for incrementally adding constraints/variables. I guess I'm going for the approach of optimising the build of a full model because, IMO, for large applications this is likely to become the limiting factor sooner than incremental model changes.

The above approach would also avoid needing to deactivate variables as fixing them would have the same effect and ensure that they are never added to the solver_model.

@michaelbynum
Contributor

Fixed variables are trickier. When a variable is unfixed, you have to ensure that any constraints already added to the model that depend on that variable get updated. This is the primary reason fixed variables are added to the solver_model.
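A toy illustration of the problem, assuming fixed variables were left out of the solver model and folded into the right-hand side instead. `compile_row` is a hypothetical helper, not part of Pyomo.

```python
def compile_row(coefs, rhs, fixed):
    """Compile `sum(coef * var) <= rhs`, folding fixed variables into the rhs.

    Returns the coefficients of the unfixed variables and the adjusted rhs.
    """
    row = {v: c for v, c in coefs.items() if v not in fixed}
    folded = sum(c * fixed[v] for v, c in coefs.items() if v in fixed)
    return row, rhs - folded


coefs = {"x": 2.0, "y": 3.0}  # 2x + 3y <= 10
fixed = {"x": 1.0}            # x is fixed to 1 when the row is compiled

row, rhs = compile_row(coefs, 10.0, fixed)
print(row, rhs)  # x has vanished from the row: 3y <= 8

# Later the user unfixes x. The row stored in the solver is now stale,
# because it has no coefficient for x and still carries the folded rhs.
fixed.clear()
fresh_row, fresh_rhs = compile_row(coefs, 10.0, fixed)
print(fresh_row, fresh_rhs)  # what the solver row should now look like
```

Every row that referenced `x` would have to be found and recompiled on unfix, which is the bookkeeping that adding fixed variables to the solver_model avoids.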

@michaelbynum
Contributor

I am currently working on refactoring the persistent solver interfaces. The refactor should take care of some of these issues (and adds some features like an automatic update mode), but it won't be ready for another year or so.

@ruaridhw
Contributor Author

ruaridhw commented Jun 9, 2020

> Fixed variables are trickier. When a variable is unfixed, you have to ensure that any constraints already added to the model that depend on that variable get updated. This is the primary reason fixed variables are added to the solver_model.

Correct me if I'm wrong but I thought this was the whole reason for #1244? In other words, this isn't actually true of the current design either?

Also, it's worth pointing out that my title refers to "Direct" solvers. I realise that Direct and Persistent solvers currently share a lot of the same code; however, because the Persistent interface has bugs (such as #1244) that do not show up in the Direct interface, I think it should be easier to implement this for Direct only for the time being.

@michaelbynum
Contributor

The key to #1244 is that the bug only occurs if the variables are fixed when the constraints are first added. If the variables are not fixed when the constraints are added, then you can fix and unfix variables later with the correct behavior.

@michaelbynum
Contributor

But yes, you make a good point about the distinction between direct and persistent.
