Use LogDensityProblems.jl #301
Conversation
@torfjelde: `similar` should work fine IMO. PRs to LogDensityProblems are most welcome and I will review them quickly.
I fully support this idea. Also happy to consider making `AbstractMCMC.jl` depend on `LogDensityProblems.jl` directly.
@tpapp would you keep `LogDensityProblems.jl` lightweight in terms of dependencies going forward?
The main argument against this is that LogDensityProblems.jl uses Requires.jl, and the idea behind AbstractMCMC is that it should be so lightweight that you should be able to just depend on it. This is why I'm in favour of a bridge package (I'm not a big fan of the growing number of packages either, but it seems difficult to find an alternative here).
It might be possible to move some core types and APIs from …
LogDensityProblems is much more lightweight than AbstractMCMC: the former depends on 4 packages (https://juliahub.com/ui/Packages/LogDensityProblems/wTQV3/1.0.2?page=1) whereas the latter has 7 direct and 35 indirect dependencies (https://juliahub.com/ui/Packages/AbstractMCMC/5x8OI/4.1.3?page=1). So I don't think it makes sense to make LogDensityProblems depend on AbstractMCMC; the other way around seems more reasonable. AFAICT most of the dependencies are pulled in by BangBang and Transducers.
Already in TuringLang/Turing.jl#1877 my plan was to move …
@yebai: Yes, LogDensityProblems will be kept very light. I am not planning to add any more dependencies. It currently uses Requires.jl for loading backend-specific AD code; I am hoping to replace it with AbstractDifferentiation.jl when that matures.
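For context, this is roughly what the Requires.jl pattern being discussed looks like; a minimal sketch with made-up module and file names, not the actual LogDensityProblems.jl source:

```julia
# Sketch of Requires.jl-based conditional loading (hypothetical package).
module MyPackage

using Requires

function __init__()
    # This block only runs if the user also loads ForwardDiff, so
    # ForwardDiff never has to be a hard dependency of MyPackage.
    @require ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210" include("ad_forwarddiff.jl")
end

end # module
```

The downside is that Requires.jl itself becomes a dependency, and the conditionally loaded code cannot be precompiled.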
@tpapp many thanks for confirming. Yes, I am slightly worried about depending on `Requires.jl`.
I also think that AbstractDifferentiation could resolve this, but to me it still seems not mature enough. From the outside (maybe someone knows it better) it seems that unfortunately nobody is pushing it right now and nobody is working on fixing the different issues it has. Possibly one reason for that state is that it might require some design changes/decisions, and at least to me it's a bit unclear who decides what to do in the end.
I honestly think at this point I'd be happy to add the implementation of …
(3) was actually my original intention for this PR. Using AdvancedHMC.jl directly gives me much better control of the sampler than the implementation of …

EDIT: The drawback is of course that we're now introducing a dependency on Requires.jl to AbstractMCMC.jl 😕
@torfjelde that sounds like a sensible plan. I am happy to make … For the slightly longer term, I hope that …
@yebai: Yes, that is the plan. I created an issue for further discussion/suggestions about moving AD backends out of the package: tpapp/LogDensityProblems.jl#97
@tpapp my impression is that … Would you be happy if @torfjelde helped to separate the AD code in `LogDensityProblems.jl` into a standalone package?
AdvancedHMC, Turing, etc. all depend on Requires anyway, don't they? But maybe it would still be a good idea to move the AD backends into separate (sub)packages. One other point: it would be nice to be able to define AD backend types as well (maybe this could even be done without hiding them behind Requires), similar to the backend types in AbstractDifferentiation and Turing. That would allow passing the AD type and its options around in a somewhat clean and standardized way (cf. #301 (comment)); see the sketch below.
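A rough sketch of what such backend types might look like; the names here are hypothetical, loosely modelled on the types in AbstractDifferentiation and Turing, not an existing API:

```julia
# Hypothetical AD backend types for dispatch; not an existing API.
abstract type ADBackend end

struct ForwardDiffAD{chunk} <: ADBackend end   # option encoded in the type
struct ReverseDiffAD <: ADBackend end
struct ZygoteAD <: ADBackend end

# A sampler (or an ADgradient-style constructor) could then dispatch on the
# backend type instead of on a Symbol like :forwarddiff:
make_gradient(ℓ, ::ForwardDiffAD) = error("load ForwardDiff to enable this backend")
```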
I separated the AD backend code out to https://github.com/tpapp/LogDensityProblemsAD.jl. Once it registers, I will deprecate the AD code in `LogDensityProblems.jl`. Further suggestions for AD backends are most welcome, but please open an issue there so they don't get lost.
I have now factored out all the AD code from LogDensityProblems. A new release (2.0) will happen as soon as the registry automerges (thanks for your patience, this has been a busy time for me so I only got around to it now). I plan to use weak dependencies instead of Requires.jl in LogDensityProblemsAD. Follow tpapp/LogDensityProblemsAD.jl#2 if you want to keep an eye on this.
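For reference, a minimal sketch of the weak-dependency mechanism (package extensions, Julia ≥ 1.9) mentioned above; the package and extension names here are made up:

```julia
# Sketch of a package extension replacing a Requires.jl block.
#
# In the parent package's Project.toml one declares:
#
#   [weakdeps]
#   ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
#
#   [extensions]
#   MyPackageForwardDiffExt = "ForwardDiff"
#
# and the backend-specific code lives in ext/MyPackageForwardDiffExt.jl,
# which Julia loads automatically once both packages are in the session:
module MyPackageForwardDiffExt

using MyPackage      # hypothetical parent package
using ForwardDiff

# ForwardDiff-specific method definitions for MyPackage's API go here.

end # module
```

Unlike Requires.jl, extensions are precompiled, and the parent package carries no extra hard dependency.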
This is causing issues: https://github.com/tpapp/LogDensityProblemsAD.jl/blob/79e245e1bb8e904087af9de284e2796e51eb55c4/src/AD_ForwardDiff.jl#L21-L23
EDIT: Just for the record, this also breaks CUDA compat, etc.; it's not limited to `ComponentArrays`.
Where/why is that problematic? There are no issues with that design in Turing (in fact, being able to specify the GradientConfig is very useful for working with custom tags: https://github.com/TuringLang/Turing.jl/blob/61b06f642872522ca14133bb5c86fc603841dbba/src/essential/ad.jl#L89-L100).
It's not necessarily problematic, but it does mean this change will be breaking. Currently AHMC doesn't require specifying the gradient config to work with something like `ComponentArray`.
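To make the trade-off concrete, here is a small self-contained example of building a `ForwardDiff.GradientConfig` (standard ForwardDiff API; the target function `f` is made up). The point in this exchange is that constructing the config from the actual input `x` preserves `x`'s array type in the internal dual buffers, whereas hard-coding a buffer type does not:

```julia
using ForwardDiff

f(x) = sum(abs2, x)      # made-up target function
x = rand(3)

# Default config: chunk size and tag are derived from f and x, and the
# internal dual buffers are created with similar(x), so structured array
# types (e.g. ComponentArrays) keep their structure.
cfg = ForwardDiff.GradientConfig(f, x)
ForwardDiff.gradient(f, x, cfg)

# Custom tag, as in the linked Turing code, still built from x itself:
tag = ForwardDiff.Tag(f, eltype(x))
cfg_tagged = ForwardDiff.GradientConfig(f, x, ForwardDiff.Chunk(x), tag)
ForwardDiff.gradient(f, x, cfg_tagged)
```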
This PR should now be good to go as soon as tests pass :)
Turing.jl has somewhat "recently" started using `LogDensityProblems.jl` under the hood to simplify all the handling of AD. This PR does the same for AdvancedHMC.jl, which in turn means that we can just plug a `Turing.LogDensityFunction` into AHMC and sample, in theory making the `HMC` implementation, etc. in Turing.jl redundant + we get additional AD backends for free, e.g. Enzyme.

IMO we might even want to consider making a bridge package between `AbstractMCMC.jl` and `LogDensityProblems.jl` simply containing this `LogDensityModel`, given how this implementation would be very useful in other packages implementing samplers, e.g. `AdvancedMH.jl`. Thoughts on this @devmotion @yebai @xukai92?

A couple of (fixable) caveats:

- `ForwardDiff` implementation for `AbstractMatrix`.
- The `DiffResults.jl` helpers in `LogDensityProblems.jl` seem to me a bit overly restrictive: e.g. https://github.com/tpapp/LogDensityProblems.jl/blob/a6a570751d0ee79345e92efd88062a0e6d59ef1b/src/DiffResults_helpers.jl#L14-L18 will, I believe, convert a `ComponentVector` into a `Vector`, thus dropping the named dimensions. (@tpapp is there a reason why we can't just use `similar(x)`?)

EDIT: This now depends on TuringLang/AbstractMCMC.jl#110.
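For readers following along, here is a minimal sketch of the LogDensityProblems.jl interface everything above builds on. The target struct is made up, but `logdensity`, `dimension`, and `capabilities` are the documented interface functions:

```julia
using LogDensityProblems

# Made-up example target: an isotropic standard-normal log density.
struct StdNormal
    dim::Int
end

# Unnormalized log density evaluated at x.
LogDensityProblems.logdensity(p::StdNormal, x) = -sum(abs2, x) / 2

# Dimension of the problem.
LogDensityProblems.dimension(p::StdNormal) = p.dim

# Declare that we only provide the log density itself (order 0);
# gradients can then be supplied by an AD wrapper such as ADgradient.
LogDensityProblems.capabilities(::Type{StdNormal}) =
    LogDensityProblems.LogDensityOrder{0}()
```

Anything implementing these three methods can, under this PR, be handed to AdvancedHMC (wrapped in an AD backend for gradients), which is the mechanism that lets `Turing.LogDensityFunction` plug straight into AHMC as described above.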