
Feature request: logdensity_and_hessian #65

Closed
DilumAluthge opened this issue Apr 2, 2020 · 9 comments · Fixed by #101

Comments

@DilumAluthge
Contributor

The logdensity_and_gradient function is very useful for using e.g. ForwardDiff to compute the gradient of the log density function.

Could we also add a new function logdensity_and_hessian that uses ForwardDiff to compute the Hessian of the log density function?

In the process of computing the Hessian, ForwardDiff will also compute and store the gradient, so it would probably make the most sense to add a single function logdensity_and_gradient_and_hessian that returns the value, the gradient, and the Hessian.
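As a rough illustration of why a combined function is natural (a sketch only, not the package's actual API; the function name here is the one proposed above), ForwardDiff's DiffResults integration computes the value, gradient, and Hessian in a single pass:

```julia
using ForwardDiff, DiffResults

# Hypothetical sketch: value, gradient, and Hessian of a log density `f`
# (a plain function of a vector) from one ForwardDiff pass.
function logdensity_and_gradient_and_hessian(f, x::AbstractVector)
    result = DiffResults.HessianResult(x)
    result = ForwardDiff.hessian!(result, f, x)  # fills value, gradient, Hessian
    DiffResults.value(result), DiffResults.gradient(result), DiffResults.hessian(result)
end

# Example: standard normal log density (up to a constant)
f(x) = -sum(abs2, x) / 2
v, g, H = logdensity_and_gradient_and_hessian(f, [1.0, 2.0])
# v == -2.5, g == [-1.0, -2.0], H == -I
```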

@tpapp
Owner

tpapp commented Apr 2, 2020

I am happy to support this (incidentally, what's your use case? I am just curious).

But I want to think a bit more about the API. I think I would prefer

struct Derivatives{N} end
Derivatives(n::Int) = (n ≥ 0 || throw(ArgumentError("n must be ≥ 0")); Derivatives{n}())
logdensity(ℓ, x, derivatives = Derivatives(0))

And once we have a breaking API change anyway, I would also prefer to solve #56 by returning an object that supports the properties .value, .gradient, and .hessian (each where applicable, given the Derivatives(...) that was requested). I think returning more than two values is error-prone.
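One possible return shape (purely illustrative, not a committed design): a NamedTuple already supports the proposed properties, so no new result struct is strictly required:

```julia
# Illustrative only: any object with these properties would satisfy the
# proposed API; a NamedTuple is the simplest such object.
result = (value = -2.5, gradient = [-1.0, -2.0], hessian = [-1.0 0.0; 0.0 -1.0])
result.value     # the log density
result.gradient  # its gradient
result.hessian   # its Hessian
```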

I appreciate suggestions about the API, bikeshedding is welcome before we finalize it.

I agree that there probably isn't a use case for having the value and the Hessian but not the gradient.

@DilumAluthge
Contributor Author

LogDensityProblems is a convenient way to formulate maximum likelihood problems. If I don't feel like writing down the derivatives by hand, I can use AD to find the MLE using gradient ascent.

So in order to find the MLE, I only need the gradient of my log-likelihood function. But if I also have the Hessian of my log-likelihood function, I can negate it to get the observed information matrix, and I can use that to estimate standard errors (based on the asymptotic normality of the MLE).
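The standard-error computation described above can be sketched in a few lines (the Hessian value here is a made-up example; in practice it would come from AD at the MLE):

```julia
using LinearAlgebra

# H is the Hessian of the log likelihood at the MLE. The observed
# information matrix is -H, and asymptotic standard errors are the
# square roots of the diagonal of its inverse.
H = [-4.0 0.0; 0.0 -25.0]   # example Hessian at the maximum
observed_information = -H
se = sqrt.(diag(inv(observed_information)))
# se == [0.5, 0.2]
```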

@DilumAluthge
Contributor Author

I like the API that you have outlined here. People who only need the value would do this:

foo = logdensity(ℓ, x, derivatives = Derivatives(0))
foo.value

People who need the gradient would do:

foo = logdensity(ℓ, x, derivatives = Derivatives(1))
foo.value
foo.gradient

People who need the Hessian would do:

foo = logdensity(ℓ, x, derivatives = Derivatives(2))
foo.value
foo.gradient
foo.hessian

@tpapp
Owner

tpapp commented Apr 6, 2020

Thanks for the comments. I will prepare a PR soon.

@tpapp
Owner

tpapp commented Aug 25, 2022

Just a quick heads-up: this is still on my radar, I just want to experiment with the practical viability of calculating Hessians via AD for medium-sized models (100-1000 parameters) first.

@sethaxen

sethaxen commented Jan 6, 2023

I'd also like this. My use case is described in mlcolab/Pathfinder.jl#115. In Pathfinder we look for a MAP estimate using some optimizer, and we'd like to support users passing second-order optimizers.

@tpapp
Owner

tpapp commented Jan 6, 2023

Happy to add an API for this; the only thing holding it back was AD support for efficient Hessians.

Will do a PR next week.

Also, I am going ahead with logdensity_and_gradient_and_hessian unless people object. The API outlined above would be breaking; that's for the future. But of course, bikeshedding the name is welcome.

@tpapp
Owner

tpapp commented Jan 6, 2023

Or maybe logdensity_gradient_and_hessian.

@sethaxen

sethaxen commented Jan 6, 2023

> Will do a PR next week.

Sounds great!

> Or maybe logdensity_gradient_and_hessian.

I prefer this one.
