
[GrayBox] add Hessian support #90

Closed · odow opened this issue Aug 28, 2024 · 5 comments · Fixed by #100

odow commented Aug 28, 2024

Suggested by @pulsipher in #82:

> As discussed in #70, it would be very nice to automate the embedding of NNs (and other predictors) as black-box functions that are treated as nonlinear operators. In my research on using smooth NNs in nonlinear optimal control formulations, we have found that treating the NN as an operator that gets its derivatives from the ML environment (e.g., using torch.func from PyTorch) significantly outperforms embedding the NN as algebraic constraints (benchmarking OMLT against PyNumero's greybox API).
>
> Naturally, JuMP's nonlinear operator API is scalarized, so I am not sure how well it will work for predictors with many inputs and outputs. This definitely motivates the milestone to add vectorized operator support in JuMP.

To which I replied:

> For black-box outputs, we could automate wrapping @operator and building the appropriate derivatives. And for vector-valued outputs, we could also automate the memoization.
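As a sketch of what that automation would wrap, here is roughly the memoization pattern from the JuMP documentation; `f` is a placeholder predictor, and the full pattern in the docs additionally keeps separate caches for primal and AD calls:

```julia
using JuMP

# Evaluate the vector-valued function once per input point and serve
# each scalar output from the cached result.
function memoize(f::Function, n_outputs::Int)
    last_x, last_f = nothing, nothing
    function f_i(i, x::T...) where {T<:Real}
        if x !== last_x
            last_x, last_f = x, f(x...)
        end
        return last_f[i]::T
    end
    return [(x...) -> f_i(i, x...) for i in 1:n_outputs]
end

# Usage: wrap each scalar output as its own operator.
f(x, y) = [x^2 + y, sin(x) * y]  # placeholder for the predictor
memoized_f = memoize(f, 2)
model = Model()
@variable(model, x[1:2])
@operator(model, op_f_1, 2, memoized_f[1])
@operator(model, op_f_2, 2, memoized_f[2])
```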

odow mentioned this issue Aug 28, 2024

odow commented Aug 28, 2024

I have a prototype, but it needs jump-dev/MathOptInterface.jl#2534 for input=1 models.

odow changed the title from "Black-box transformations" to "[GrayBox] add Hessian support" on Aug 29, 2024

odow commented Aug 29, 2024

#96 implements most of what we want out of this.

@pulsipher would like the ability to add Hessians to the nonlinear callback.
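For reference, JuMP's @operator already accepts an optional Hessian callback, so this is mostly a question of forwarding one through the GrayBox wrapper. A minimal sketch with a hand-written Hessian (Rosenbrock, adapted from the JuMP manual):

```julia
using JuMP

f(x...) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

function ∇f(g::AbstractVector{T}, x::T...) where {T}
    g[1] = 400 * x[1]^3 - 400 * x[1] * x[2] + 2 * x[1] - 2
    g[2] = 200 * (x[2] - x[1]^2)
    return
end

function ∇²f(H::AbstractMatrix{T}, x::T...) where {T}
    H[1, 1] = 1200 * x[1]^2 - 400 * x[2] + 2
    # JuMP requires only the lower triangle to be filled in.
    H[2, 1] = -400 * x[1]
    H[2, 2] = 200
    return
end

model = Model()
@variable(model, x[1:2])
@operator(model, op_rosenbrock, 2, f, ∇f, ∇²f)
@objective(model, Min, op_rosenbrock(x[1], x[2]))
```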

odow commented Aug 29, 2024

We also discussed the ability to re-use the operator for different inputs.

But I don't know that I like it because it would interfere with the cache. I think we should try the existing behavior before thinking about improvements.

We might also consider accepting a ::Matrix input to batch calls.
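To make that concrete, the batched form might look something like this; purely hypothetical, since neither the column-per-point convention nor any supporting API exists yet:

```julia
# Hypothetical calling convention for batched evaluation: each column
# of X is one input point, and each column of the result is the
# corresponding output vector. Nothing here is an existing API.
function batch_predictor(X::AbstractMatrix{Float64})
    n_outputs, n_points = 2, size(X, 2)
    Y = Matrix{Float64}(undef, n_outputs, n_points)
    for j in 1:n_points
        x = view(X, :, j)
        Y[1, j] = x[1]^2 + x[2]
        Y[2, j] = sin(x[1]) * x[2]
    end
    return Y
end
```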

pulsipher commented Aug 29, 2024

> But I don't know that I like it because it would interfere with the cache. I think we should try the existing behavior before thinking about improvements.

This is a notable limitation of having to use memoization, which relies on the cache. I believe that jump-dev/MathOptInterface.jl#2402 would solve this problem. I think it is intuitive to have a nonlinear operator that isn't tied to particular variable inputs.

The other thing I wonder is how well memoized nonlinear operators that depend on splatted inputs will perform as the number of inputs and outputs becomes larger (say, on the order of 100 or 1,000).

@odow
Copy link
Collaborator Author

odow commented Aug 29, 2024

Yip. jump-dev/MathOptInterface.jl#2402 would fix this. But that's a much more complicated issue 😄

> The other thing I wonder is how well memoized nonlinear operators that depend on splatted inputs will perform as the number of inputs and outputs becomes larger

Your guess is as good as mine. Probably poorly. But we can look to improve performance once we have some examples.
