Feature request: `logdensity_and_hessian` (#65)
I am happy to support this (incidentally, what's your use case? I am just curious). But I want to think a bit more about the API. I think I would prefer

```julia
struct Derivatives{N} end # FIXME check N ≥ 0 etc
Derivatives(n::Int) = Derivatives{n}()

logdensity(ℓ, x, derivatives = Derivatives(0))
```

And once we have a breaking API change anyway, I would prefer to also solve #56 by returning any object that supports the properties. I appreciate suggestions about the API; bikeshedding is welcome before we finalize it. I agree that I don't think there is a use case for having the value and the Hessian but not the gradient.
In order to find the MLE, I only need the gradient of my log-likelihood function. But if I also have the Hessian of my log-likelihood function, I can negate it to get the observed information matrix, and I can use that to estimate standard errors (based on the asymptotic normality of the MLE).
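To illustrate this use case, here is a minimal sketch of going from a Hessian to standard errors, using `ForwardDiff` directly. The model (normal mean with unit variance) and the simulated data are illustrative assumptions, not from this thread:

```julia
# Sketch: standard errors from the observed information matrix.
# The log-likelihood and data below are illustrative assumptions.
using ForwardDiff, LinearAlgebra

x = randn(100) .+ 2.0                  # simulated data
loglik(θ) = -sum(abs2, x .- θ[1]) / 2  # log-likelihood of the mean (unit variance)

θ̂ = [sum(x) / length(x)]               # MLE of the mean
H = ForwardDiff.hessian(loglik, θ̂)     # Hessian at the MLE
info = -H                               # observed information matrix
se = sqrt.(diag(inv(info)))             # asymptotic standard errors
```

With `logdensity_and_gradient_and_hessian`, the `ForwardDiff.hessian` call above would be replaced by a single call that also returns the value and gradient.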
I like the API that you have outlined here. People that only need the value would do this:

```julia
foo = logdensity(ℓ, x, derivatives = Derivatives(0))
foo.value
```

People that need the gradient would do:

```julia
foo = logdensity(ℓ, x, derivatives = Derivatives(1))
foo.value
foo.gradient
```

People that need the Hessian would do:

```julia
foo = logdensity(ℓ, x, derivatives = Derivatives(2))
foo.value
foo.gradient
foo.hessian
```
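This usage could be backed by dispatch on the type parameter of `Derivatives`. A minimal, hypothetical sketch (assuming `ForwardDiff` as the AD backend and a plain function `ℓ`; note that keyword arguments cannot participate in dispatch, so this sketch uses a positional argument with a default):

```julia
# Hypothetical sketch of the proposed API, not the package's implementation.
using ForwardDiff

struct Derivatives{N} end
function Derivatives(n::Int)
    n ≥ 0 || throw(ArgumentError("derivative order must be ≥ 0"))
    Derivatives{n}()
end

# Return a NamedTuple so callers can use foo.value, foo.gradient, foo.hessian.
logdensity(ℓ, x, ::Derivatives{0} = Derivatives(0)) = (value = ℓ(x),)
logdensity(ℓ, x, ::Derivatives{1}) =
    (value = ℓ(x), gradient = ForwardDiff.gradient(ℓ, x))
logdensity(ℓ, x, ::Derivatives{2}) =
    (value = ℓ(x), gradient = ForwardDiff.gradient(ℓ, x),
     hessian = ForwardDiff.hessian(ℓ, x))
```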
Thanks for the comments. I will prepare a PR soon.
Just a quick heads-up: this is still on my radar, I just want to experiment with the practical viability of calculating Hessians via AD for medium-sized models (100-1000 parameters) first. |
I'd also like this. My use case is described in mlcolab/Pathfinder.jl#115. In Pathfinder we look for a MAP estimate using some optimizer, and we'd like to support users passing second-order optimizers.
Happy to add an API for this; the only thing that was holding this back is AD support for efficient Hessians. Will do a PR next week. Also, I am going ahead with
Or maybe
Sounds great!
I prefer this one.
The `logdensity_and_gradient` function is very useful for using e.g. `ForwardDiff` to compute the gradient of the log density function. Could we also add a new function `logdensity_and_hessian` that uses `ForwardDiff` to compute the Hessian of the log density function?

Although, I suppose that, in the process of computing the Hessian, `ForwardDiff` will also compute and store the gradient. So it would probably make the most sense to add a new function `logdensity_and_gradient_and_hessian` that returns the value, the gradient, and the Hessian.