
[GrayBox] add Hessian support #90

Closed · odow opened this issue Aug 28, 2024 · 5 comments · Fixed by #100

odow commented Aug 28, 2024

Suggested by @pulsipher in #82:

> As discussed in #70, it would be very nice to automate the embedding of NNs (and other predictors) as black-box functions that are treated as nonlinear operators. In my research on using smooth NNs in nonlinear optimal control formulations, we have found that treating the NN as an operator that gets its derivatives from the ML environment (e.g., using torch.func from PyTorch) significantly outperforms embedding the NN as algebraic constraints (benchmarking OMLT against PyNumero's greybox API).
>
> Naturally, JuMP's nonlinear operator API is scalarized, so I am not sure how well it will work for predictors with many inputs and outputs. This definitely motivates the milestone to add vectorized operator support in JuMP.

To which I replied:

> For black-box outputs, we could automate wrapping @operator and building the appropriate derivatives. And for vector-valued outputs, we could also automate the memoization.
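As a sketch of what that automation would wrap, here is roughly the memoization pattern from the JuMP documentation; `f` is a placeholder predictor, and the full pattern in the docs additionally keeps separate caches for primal and AD calls:

```julia
using JuMP

# Evaluate the vector-valued function once per input point and serve
# each scalar output from the cached result.
function memoize(f::Function, n_outputs::Int)
    last_x, last_f = nothing, nothing
    function f_i(i, x::T...) where {T<:Real}
        if x !== last_x
            last_x, last_f = x, f(x...)
        end
        return last_f[i]::T
    end
    return [(x...) -> f_i(i, x...) for i in 1:n_outputs]
end

# Usage: wrap each scalar output as its own operator.
f(x, y) = [x^2 + y, sin(x) * y]  # placeholder for the predictor
memoized_f = memoize(f, 2)
model = Model()
@variable(model, x[1:2])
@operator(model, op_f_1, 2, memoized_f[1])
@operator(model, op_f_2, 2, memoized_f[2])
```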

odow mentioned this issue Aug 28, 2024

odow commented Aug 28, 2024

I have a prototype, but it needs jump-dev/MathOptInterface.jl#2534 for input=1 models.

odow changed the title from "Black-box transformations" to "[GrayBox] add Hessian support" on Aug 29, 2024

odow commented Aug 29, 2024

#96 implements most of what we want out of this.

@pulsipher would like the ability to add Hessians to the nonlinear callback.
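For reference, JuMP's @operator already accepts an optional Hessian callback, so this is mostly a question of forwarding one through the GrayBox wrapper. A minimal sketch with a hand-written Hessian (Rosenbrock, adapted from the JuMP manual):

```julia
using JuMP

f(x...) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

function ∇f(g::AbstractVector{T}, x::T...) where {T}
    g[1] = 400 * x[1]^3 - 400 * x[1] * x[2] + 2 * x[1] - 2
    g[2] = 200 * (x[2] - x[1]^2)
    return
end

function ∇²f(H::AbstractMatrix{T}, x::T...) where {T}
    H[1, 1] = 1200 * x[1]^2 - 400 * x[2] + 2
    # JuMP requires only the lower triangle to be filled in.
    H[2, 1] = -400 * x[1]
    H[2, 2] = 200
    return
end

model = Model()
@variable(model, x[1:2])
@operator(model, op_rosenbrock, 2, f, ∇f, ∇²f)
@objective(model, Min, op_rosenbrock(x[1], x[2]))
```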

odow commented Aug 29, 2024

We also discussed the ability to re-use the operator for different inputs.

But I don't know that I like it because it would interfere with the cache. I think we should try the existing behavior before thinking about improvements.

We might also consider accepting a ::Matrix input to batch calls.
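To make that concrete, the batched form might look something like this; purely hypothetical, since neither the column-per-point convention nor any supporting API exists yet:

```julia
# Hypothetical calling convention for batched evaluation: each column
# of X is one input point, and each column of the result is the
# corresponding output vector. Nothing here is an existing API.
function batch_predictor(X::AbstractMatrix{Float64})
    n_outputs, n_points = 2, size(X, 2)
    Y = Matrix{Float64}(undef, n_outputs, n_points)
    for j in 1:n_points
        x = view(X, :, j)
        Y[1, j] = x[1]^2 + x[2]
        Y[2, j] = sin(x[1]) * x[2]
    end
    return Y
end
```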

pulsipher commented Aug 29, 2024

> But I don't know that I like it because it would interfere with the cache. I think we should try the existing behavior before thinking about improvements.

This is a notable limitation of having to use memoization, which relies on the cache. I believe that jump-dev/MathOptInterface.jl#2402 would solve this problem. I think it is intuitive to have a nonlinear operator that isn't tied to particular variable inputs.

The other thing I wonder is how well memoized nonlinear operators that depend on splatted inputs will perform as the number of inputs and outputs becomes larger (say, on the order of 100 or 1,000).

@odow
Copy link
Collaborator Author

odow commented Aug 29, 2024

Yip. jump-dev/MathOptInterface.jl#2402 would fix this. But that's a much more complicated issue 😄

> The other thing I wonder is how well memoized nonlinear operators that depend on splatted inputs will perform as the number of inputs and outputs becomes larger

Your guess is as good as mine. Probably poorly. But we can look to improve performance once we have some examples.
