
Use LogDensityProblems.jl #301

Merged (24 commits) on Dec 25, 2022

Conversation

torfjelde (Member) commented Nov 17, 2022

Turing.jl has somewhat "recently" started using LogDensityProblems.jl under the hood to simplify all the handling of AD. This PR does the same for AdvancedHMC.jl, which in turn means that we can just plug a Turing.LogDensityFunction into AHMC and sample, in theory making the HMC implementation in Turing.jl redundant. We also get additional AD backends for free, e.g. Enzyme.

IMO we might even want to consider making a bridge-package between AbstractMCMC.jl and LogDensityProblems.jl simply containing this LogDensityModel, given that this implementation would be very useful in other packages implementing samplers, e.g. AdvancedMH.jl. Thoughts on this @devmotion @yebai @xukai92?

A couple of (fixable) caveats:

EDIT: This now depends on TuringLang/AbstractMCMC.jl#110.
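To make the idea concrete, here is a hedged sketch of what post-PR usage could look like: any type implementing the LogDensityProblems.jl interface is handed straight to AdvancedHMC. The toy `Gaussian` target is made up for illustration; the sampling calls follow AdvancedHMC's README-style API and should be treated as an approximation of the post-PR surface.

```julia
using AdvancedHMC, LogDensityProblems, ForwardDiff

struct Gaussian end  # toy standard-normal target (illustrative)

# The three-method LogDensityProblems.jl interface:
LogDensityProblems.logdensity(::Gaussian, θ) = -sum(abs2, θ) / 2
LogDensityProblems.dimension(::Gaussian) = 2
LogDensityProblems.capabilities(::Type{Gaussian}) = LogDensityProblems.LogDensityOrder{0}()

initial_θ = randn(2)
metric = DiagEuclideanMetric(2)
# After this PR, the Hamiltonian is built directly from the log-density
# problem, with the AD backend selected by passing the module:
hamiltonian = Hamiltonian(metric, Gaussian(), ForwardDiff)
integrator = Leapfrog(find_good_stepsize(hamiltonian, initial_θ))
kernel = HMCKernel(Trajectory{MultinomialTS}(integrator, GeneralisedNoUTurn()))
adaptor = StanHMCAdaptor(MassMatrixAdaptor(metric), StepSizeAdaptor(0.8, integrator))
samples, stats = sample(hamiltonian, kernel, initial_θ, 2_000, adaptor, 1_000)
```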

tpapp commented Nov 17, 2022

@torfjelde: similar should work fine IMO. PRs to LogDensityProblems are most welcome and I will review them quickly.

yebai (Member) commented Nov 17, 2022

I fully support this idea. Also happy to consider making AbstractMCMC / AbstractPPL dependent on LogDensityProblems.jl.

yebai (Member) commented Nov 17, 2022

@tpapp would you be able to keep LogDensityProblems's dependencies lightweight in the long term?

torfjelde (Member, Author)

> I fully support this idea. Also happy to consider making AbstractMCMC / AbstractPPL dependent on LogDensityProblems.jl.

The main argument against this is that LogDensityProblems.jl uses Requires.jl, and the idea behind AbstractMCMC is that it should be so lightweight that you should be able to just depend on it. This is why I'm in favour of a bridge-package (though I'm also not a big fan of the growing number of packages, but seems like it'll be difficult to find an alternative here).

yebai (Member) commented Nov 17, 2022

> The main argument against this is that LogDensityProblems.jl uses Requires.jl, and the idea behind AbstractMCMC is that it should be so lightweight that you should be able to just depend on it. This is why I'm in favour of a bridge-package (though I'm also not a big fan of the growing number of packages, but seems like it'll be difficult to find an alternative here).

It might be possible to move some core types and APIs from LogDensityProblems into AbstractMCMC, then make LogDensityProblems dependent on AbstractMCMC. But that requires @tpapp's support of course.

devmotion (Member) commented Nov 17, 2022

LogDensityProblems is much more lightweight than AbstractMCMC: The former depends on 4 packages (https://juliahub.com/ui/Packages/LogDensityProblems/wTQV3/1.0.2?page=1) whereas the latter has 7 direct and 35 indirect dependencies (https://juliahub.com/ui/Packages/AbstractMCMC/5x8OI/4.1.3?page=1). So I don't think it makes sense to make LogDensityProblems dependent on AbstractMCMC, the other way around seems more reasonable.

AFAICT most of the dependencies are pulled in by BangBang and Transducers.

devmotion (Member)

Already in TuringLang/Turing.jl#1877 my plan was to move Turing.LogDensityFunction and Turing-specific AD stuff to separate packages. I just haven't had time to push it but I think it should not be part of Turing in the long run. For instance, to avoid redefining the AD parts and density functions etc. in other packages such as Bijectors, AdvancedVI etc. (and AdvancedHMC I assume).

tpapp commented Nov 17, 2022

@yebai: Yes, LogDensityProblems will be kept very light. I am not planning to add any more dependencies. It currently uses Requires.jl for loading backend-specific code for AD, I am hoping to replace it with AbstractDifferentiation.jl when that matures.
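For context, the Requires.jl mechanism being discussed works roughly as in the sketch below. The module name and its contents are hypothetical; the UUID is ForwardDiff.jl's. Code inside the `@require` block only runs if and when the user loads ForwardDiff, so the backend never becomes a hard dependency.

```julia
module MyADGlue  # hypothetical package using Requires.jl for optional AD backends

using Requires

function __init__()
    # Runs lazily when ForwardDiff is loaded into the same session;
    # backend-specific ADgradient code would be include()d here.
    @require ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210" begin
        # include("AD_ForwardDiff.jl")
    end
end

end
```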

yebai (Member) commented Nov 17, 2022

@tpapp many thanks for confirming. Yes, I am slightly worried about depending on Requires.

devmotion (Member)

I also think that AbstractDifferentiation could resolve this, but to me it still seems not mature enough. From the outside (maybe someone knows it better) it seems that unfortunately nobody is pushing it forward right now, and nobody is working on fixing its various issues. One reason for that state could be that it might require some design changes/decisions, and at least to me it's a bit unclear who decides what to do in the end.

torfjelde (Member, Author) commented Nov 22, 2022

I honestly think at this point I'd be happy to add the implementation of LogDensityModel from this PR to AbstractMCMC.jl and depend on LogDensityProblems.jl. We've had plans for ages to add some "simple"/default models, etc. to AbstractMCMC.jl which could be used by many of the sampler packages without them having to define their own models, but these plans/discussions usually grind to a halt because we're never quite satisfied with the design. I think LogDensityProblems.jl does a good enough job for it to be worth just going with that approach for now. Then packages such as AdvancedHMC.jl and AdvancedMH.jl can both just use the LogDensityProblems.jl interface to interact with the model, which in turn has some nice implications:

  1. Anything that implements LogDensityProblems.logdensity, LogDensityProblems.dimension, and LogDensityProblems.capabilities can be used, i.e. the user can use the same model across the different packages.
  2. We can move the non-AD stuff for Turing.LogDensityFunction to DynamicPPL.jl.
  3. We can easily apply samplers not from Turing.jl to models from DynamicPPL.jl without any further changes (if the samplers start making use of the LogDensityProblems-interface).

(3) was actually my original intention for this PR. Using AdvancedHMC.jl directly gives me much better control of the sampler than the implementation of HMC present in Turing.jl, and after this PR I can just make a Turing.LogDensityFunction from my model and then AdvancedHMC.jl just works.

EDIT: The drawback is of course that we're now introducing a dependency on Requires.jl in AbstractMCMC.jl 😕
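A minimal sketch of the bridge being proposed, assuming the `LogDensityModel` wrapper lands in AbstractMCMC.jl roughly as in TuringLang/AbstractMCMC.jl#110. The `Funnel` target and its (unnormalized) log density are made up for illustration:

```julia
using AbstractMCMC, LogDensityProblems

struct Funnel end  # illustrative target; any type with the three methods below works

# Unnormalized log density (constant terms dropped):
LogDensityProblems.logdensity(::Funnel, θ) = -abs2(θ[1]) / 2 - abs2(θ[2]) / (2 * exp(θ[1]))
LogDensityProblems.dimension(::Funnel) = 2
LogDensityProblems.capabilities(::Type{Funnel}) = LogDensityProblems.LogDensityOrder{0}()

# Wrapping it makes it an AbstractMCMC model, so sampler packages that speak
# the LogDensityProblems interface (AdvancedHMC, AdvancedMH, ...) can all
# consume the same object:
model = AbstractMCMC.LogDensityModel(Funnel())
```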

yebai (Member) commented Nov 22, 2022

@torfjelde that sounds like a sensible plan. I am happy to make AbstractMCMC depend on LogDensityProblems.

For the slightly longer term, I hope that LogDensityProblems can separate the AD stuff into a separate package so LogDensityProblems is truly lightweight.

tpapp commented Nov 22, 2022

@yebai: Yes, that is the plan. I created an issue for further discussion/suggestions about moving AD backends out of the package: tpapp/LogDensityProblems.jl#97

yebai (Member) commented Nov 22, 2022

@tpapp my impression is that AbstractDifferentiation will likely take a very long time to mature if that happens at all.

Would you be happy if @torfjelde helped to separate the AD code in LogDensityProblems into a new package now-ish? This would allow AbstractMCMC to depend on LogDensityProblems without introducing a dependency on Requires which we tried hard to avoid.

devmotion (Member)

AdvancedHMC, Turing, etc. all depend on Requires anyway, don't they? But it might still be a good idea to move them into separate (sub-)packages.

I think one other point is that it would be nice to be able to define AD backend types as well (maybe this could even be done without hiding it behind Requires), similar to the backend types in AbstractDifferentiation and Turing: that would allow passing the AD type and its options around in a somewhat clean and standardized way (cf. #301 (comment)).

tpapp commented Nov 22, 2022

I separated the AD backend code into https://github.com/tpapp/LogDensityProblemsAD.jl. Once it is registered, I will deprecate LogDensityProblems.ADgradient and remove the code from there, after which LogDensityProblems will no longer depend on Requires.jl.

Further suggestions for AD backends are most welcome, but please open an issue there so they don't get lost.

tpapp commented Dec 7, 2022

I have now factored out all the AD code from LogDensityProblems. A new release (2.0) will happen as soon as the registry automerges (thanks for your patience, this has been a busy time for me so I only got around to it now).

I plan to use weak dependencies instead of Requires.jl in LogDensityProblemsAD. Follow tpapp/LogDensityProblemsAD.jl#2 if you want to keep an eye on this.
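For reference, the weak-dependency mechanism (Julia ≥ 1.9 package extensions) replaces the `@require` block with declarations in Project.toml. A hedged sketch, with a hypothetical extension name and ForwardDiff.jl's UUID:

```toml
[weakdeps]
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"

[extensions]
# The extension module is loaded automatically once ForwardDiff is present
# in the environment; the extension name here is illustrative.
LogDensityProblemsADForwardDiffExt = "ForwardDiff"
```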

torfjelde (Member, Author) commented Dec 23, 2022

This is causing issues: https://github.com/tpapp/LogDensityProblemsAD.jl/blob/79e245e1bb8e904087af9de284e2796e51eb55c4/src/AD_ForwardDiff.jl#L21-L23

LogDensityProblemsAD.ADgradient requires a ForwardDiff.GradientConfig, and so the type of the dual vector will be tied to the density problem.

EDIT: Just for the record, this also breaks CUDA compat, etc.; it's not limited to ComponentArray.
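A sketch of the issue under discussion, using a toy target; the behavior is inferred from the linked code, where the GradientConfig is built eagerly for a plain `Vector{Float64}`:

```julia
using LogDensityProblems, LogDensityProblemsAD, ForwardDiff

struct Quadratic end
LogDensityProblems.logdensity(::Quadratic, x) = -sum(abs2, x) / 2
LogDensityProblems.dimension(::Quadratic) = 2
LogDensityProblems.capabilities(::Type{Quadratic}) = LogDensityProblems.LogDensityOrder{0}()

# ADgradient constructs a ForwardDiff.GradientConfig up front, so the
# dual-number buffer is typed for Vector{Float64}:
∇ℓ = ADgradient(:ForwardDiff, Quadratic())
LogDensityProblems.logdensity_and_gradient(∇ℓ, [1.0, 2.0])  # fine

# Passing a differently-typed vector (a ComponentArray, a CuArray, ...)
# hits a mismatch with the preallocated config instead of just working.
```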

devmotion (Member)

Where/why is that problematic? There are no issues with that design in Turing (in fact, being able to specify the GradientConfig is very useful for working with custom tags: https://github.com/TuringLang/Turing.jl/blob/61b06f642872522ca14133bb5c86fc603841dbba/src/essential/ad.jl#L89-L100).
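What this describes can be sketched as follows. The `gradientconfig` keyword name is assumed from LogDensityProblemsAD at the linked revision, and the toy `Quad` target is made up; treat both as assumptions rather than the definitive API:

```julia
using LogDensityProblems, LogDensityProblemsAD, ForwardDiff

struct Quad end
LogDensityProblems.logdensity(::Quad, x) = -sum(abs2, x) / 2
LogDensityProblems.dimension(::Quad) = 2
LogDensityProblems.capabilities(::Type{Quad}) = LogDensityProblems.LogDensityOrder{0}()

# Supplying an explicit GradientConfig lets the caller control the chunk size
# and the ForwardDiff tag, e.g. to interoperate with nested AD:
x = zeros(LogDensityProblems.dimension(Quad()))
cfg = ForwardDiff.GradientConfig(nothing, x, ForwardDiff.Chunk{2}())
∇ℓ = ADgradient(:ForwardDiff, Quad(); gradientconfig = cfg)  # keyword name assumed
```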

torfjelde (Member, Author)

It's not necessarily problematic, but it does mean this change will be breaking. Currently AHMC won't require specification of the gradient config to work with something like ComponentArray.

torfjelde (Member, Author)

This PR should now be good to go as soon as tests pass :)
