Fixes and improvements to experimental `Gibbs` #2231

torfjelde · 2024-05-18T23:56:20Z

It turns out there were some subtle bugs in the implementation of `Turing.Experimental.Gibbs`, which is likely part of the reason we were seeing strange results here and there.

This PR does the following:

- Fixes these issues, i.e. we're no longer accidentally hitting `condition` instead of `gibbs_condition`.
- Properly handles `initial_params`.
- Adds more rigorous correctness testing of inference results.

Remaining TODOs:

- Make it work for `externalsampler` (part of the reason I waited with this PR is that it needed some functionality from "Fixes to AD backend usage in externalsampler" #2223).

Fixes #2230. Fixes #2234.

coveralls · 2024-05-19T01:38:55Z

Pull Request Test Coverage Report for Build 9143463911

Details

0 of 30 (0.0%) changed or added relevant lines in 3 files are covered.
3 unchanged lines in 2 files lost coverage.
Overall coverage remained the same at 0.0%

Changes Missing Coverage	Changed/Added Lines	%
src/mcmc/Inference.jl	1	0.0%
src/mcmc/abstractmcmc.jl	13	0.0%
src/experimental/gibbs.jl	16	0.0%

Files with Coverage Reduction	New Missed Lines	%
src/experimental/gibbs.jl	1	0.0%
src/mcmc/abstractmcmc.jl	2	0.0%

Totals
Change from base Build 9141965383:	0.0%
Covered Lines:	0
Relevant Lines:	1549

💛 - Coveralls

torfjelde · 2024-05-20T09:51:55Z

A few things:

- Sampling from `gdemo` using CSMC in the Gibbs sampler is very slow, so I'm removing those tests (they require a large number of iterations to produce something reasonable).
- We're hitting OOM issues when using the SMC samplers. Seems somewhat crazy that this would happen.

torfjelde · 2024-05-20T11:22:25Z

As there have been a few scenarios where we've hit some interesting snags with test failures when only looking at "simple" statistics, e.g. the mean, I'm trying to use something a bit more principled. Specifically, I've added tests using Anderson-Darling tests (similar to Kolmogorov-Smirnov, but integrating over the entire ECDF instead of just considering the supremum), along with tests to make sure that the significance level is set in such a way that a) minor (both additive and multiplicative) perturbations to the "true" samples are caught, but also b) tests between "true" samples and the samples of interest pass.
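To make the difference between the two statistics concrete, here is a minimal, dependency-free Julia sketch. It is not the test code from this PR: `ks_statistic` and `ad_statistic` are hypothetical names, the Anderson-Darling statistic uses Pettitt's rank form and assumes no ties, and only the statistics are computed (no p-values or significance levels).

```julia
using Random

# Two-sample Kolmogorov-Smirnov statistic: the largest gap between the two ECDFs.
function ks_statistic(x, y)
    ecdf(s, t) = count(<=(t), s) / length(s)
    return maximum(abs(ecdf(x, t) - ecdf(y, t)) for t in vcat(x, y))
end

# Two-sample Anderson-Darling statistic (Pettitt's rank form, assuming no ties).
# Each squared gap is divided by i * (N - i), which is small near both ends of
# the pooled sample, so discrepancies in the tails are weighted more heavily.
function ad_statistic(x, y)
    m, n = length(x), length(y)
    N = m + n
    from_x = vcat(trues(m), falses(n))[sortperm(vcat(x, y))]
    M = cumsum(Int.(from_x))  # number of x-values among the i smallest pooled values
    return sum((M[i] * N - m * i)^2 / (i * (N - i)) for i in 1:(N - 1)) / (m * n)
end

rng = MersenneTwister(42)
truth = randn(rng, 2_000)
fresh = randn(rng, 2_000)   # independent draw from the same distribution
shifted = fresh .+ 0.2      # additive perturbation
scaled = fresh .* 1.3       # multiplicative perturbation

for (name, sample) in [("fresh", fresh), ("shifted", shifted), ("scaled", scaled)]
    println(name, ": KS = ", ks_statistic(truth, sample), ", AD = ", ad_statistic(truth, sample))
end
```

A significance level calibrated as described above should reject the `shifted` and `scaled` samples while accepting `fresh`.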

torfjelde · 2024-06-06T10:57:19Z

Pfft I think the way of testing the marginals here is a really good way to go, and it's almost there, but I'm starting to think that maybe Anderson-Darling is just a bit too strong of a test; it puts particular emphasis on the tails of the distributions, which of course can be a bit of a problem for any kind of MCMC output.

I'm thinking something like a Cramér-von Mises test would be perfect, as it doesn't inflate the differences in the tails of the distribution, but is still more suitable than a Kolmogorov-Smirnov test (which only considers the maximum difference between the two distributions). Buuut no implementation of Cramér-von Mises exists in HypothesisTests.jl, so that's a bit annoying (see related issue: JuliaStats/HypothesisTests.jl#201).

Don't have time to implement it myself right now (might be worth just piggy-backing off of scipy's implementation or something to give it a try), but something to keep in mind for future reference.
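For reference, the two-sample Cramér-von Mises statistic itself is short to write down. The following is only a sketch under stated assumptions: `cvm_statistic` is a hypothetical name, not part of HypothesisTests.jl, it uses Anderson's definition and assumes no ties, and it returns the statistic only; turning it into a p-value would still require the null distribution.

```julia
# Two-sample Cramér-von Mises statistic (Anderson's definition, assuming no ties):
#   T = m * n / (m + n)^2 * sum over pooled points t of (F_x(t) - F_y(t))^2
# Every pooled point gets the same weight, so the tails are not emphasised.
function cvm_statistic(x, y)
    m, n = length(x), length(y)
    N = m + n
    from_x = vcat(trues(m), falses(n))[sortperm(vcat(x, y))]
    Fx = cumsum(Int.(from_x)) ./ m     # ECDF of x evaluated at each pooled point
    Fy = cumsum(Int.(.!from_x)) ./ n   # ECDF of y evaluated at each pooled point
    return m * n / N^2 * sum(abs2, Fx .- Fy)
end

# Usage: larger values indicate a bigger discrepancy between the two samples.
cvm_statistic(randn(1_000), randn(1_000) .+ 0.2)
```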

Note that this "investigation" all started because we've had experiences in the past where just testing means and std isn't really good enough, and so we start comparing quantiles. But if we're comparing quantiles, then we might as well just do a proper ECDF hypothesis test, where we make sure the acceptance threshold is sufficient to capture certain differences.

One possible (though slightly hacky) alternative might be to just compare the underlying test statistics of samples from the inference method being tested against test statistics from "perturbations" of the "true" samples. E.g. you scale and shift the "true" samples, compute the test statistic, and make sure that the test statistic of the samples from the inference algorithm is better than these perturbed ones. It seems a bit hacky, but there is probably some way of motivating it theoretically.
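A rough sketch of that perturbation-based check follows. Everything in it is an assumption made for illustration: the helper name `closer_than_perturbations`, the default shifts and scales, and the use of the KS statistic as a stand-in for whichever ECDF-based statistic is preferred.

```julia
using Random

# Two-sample KS statistic, used only as a stand-in for any ECDF-based statistic.
function ks_statistic(x, y)
    ecdf(s, t) = count(<=(t), s) / length(s)
    return maximum(abs(ecdf(x, t) - ecdf(y, t)) for t in vcat(x, y))
end

# Hypothetical helper: accept `samples` only if they are closer to `truth` than
# every shifted or scaled copy of an independent reference draw.
function closer_than_perturbations(truth, reference, samples;
                                   shifts=(-0.3, 0.3), scales=(0.7, 1.4))
    perturbed = vcat([reference .+ s for s in shifts], [reference .* c for c in scales])
    smallest_perturbed = minimum(ks_statistic(truth, p) for p in perturbed)
    return ks_statistic(truth, samples) < smallest_perturbed
end

# Usage: `samples` would normally come from the inference algorithm under test.
rng = MersenneTwister(1)
truth, reference, samples = randn(rng, 2_000), randn(rng, 2_000), randn(rng, 2_000)
closer_than_perturbations(truth, reference, samples)
```

The size of the perturbations plays the role that the significance level plays in a proper hypothesis test: smaller shifts and scales make the check stricter.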

yebai · 2024-06-12T20:08:49Z

> Make it work for externalsampler

@torfjelde, to clarify, the new Gibbs sampler is now working with an external sampler, right? If so, let us resolve the merge conflict and get this PR merged.

torfjelde · 2024-07-09T08:51:18Z

Will get to this later today 👍

sunxd3 · 2024-07-09T13:06:54Z

Turing.jl/src/experimental/gibbs.jl

Line 311 in 3d3c944

varinfos_new = DynamicPPL.setindex!!(varinfos, vi_base, 1)

(I can't seem to review lines that are not changed, so I copied the permalink.)
Maybe I'm reading this wrong, but this feels like we are just throwing away varinfos[1]. Should there be an extra line that merges varinfos[1] into vi_base and then replaces it with vi_base?

If this needs to change, then so does

Turing.jl/src/experimental/gibbs.jl

Line 357 in 3d3c944

varinfos_new = DynamicPPL.setindex!!(

torfjelde · 2024-07-09T19:14:04Z

> An issue is that the type of f::LogDensityProblemsAD.ADGradientWrapper will be thrown away, and a new wrapper will be created according to the sampler's AD config. As far as I can tell, this is what we are doing right now anyway, but it may be worth some consideration.

So I did indeed consider dispatching on the adtype, but it's not quite enough for what we want to achieve here. Yes, it works with ExternalSampler, but we wanted to use this as a playground to enable Gibbs even outside of Turing.jl, in which case you're no longer working with an ExternalSampler.

But I think the reasoning is fair, though it would be nice if, say, the adtype used in a gradient wrapper were also stored with the gradient wrapper itself or something, so you could retrieve it from the wrapper 😕

torfjelde · 2024-07-09T19:15:38Z

> Should there be an extra line that merge varinfos[1] into vi_base and then replace it with vi_base?

Very nice catch! From a first glance, it indeed looks like we're missing a merge 👍

yebai · 2024-07-10T17:44:38Z

@torfjelde, please address @sunxd3’s comment. Then, let’s merge this PR as-is since it contains bug fixes. The remaining issues can be addressed via separate PRs.

torfjelde · 2024-07-10T17:55:23Z

Done 👍

The bug was just in the initial step, so didn't really make much of a difference, but added the merge now.
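The difference between overwriting and merging can be illustrated with a toy sketch. It uses plain `NamedTuple`s as stand-ins for varinfos, which is an assumption made purely for illustration, and the variable contents are invented; it is not the code from `src/experimental/gibbs.jl`.

```julia
# Stand-ins for varinfos: plain NamedTuples instead of DynamicPPL varinfos
# (an illustrative assumption, not the real data structure).
vi_base = (x = 0.0, y = 0.0)          # base state covering all variables
varinfos = [(x = 1.5,), (y = -2.0,)]  # per-sampler states after their initial step

# Overwriting slot 1 with the base discards the first sampler's value for `x`.
overwritten = [vi_base, varinfos[2:end]...]

# Merging first keeps the first sampler's value and takes the rest from the base.
merged = [merge(vi_base, varinfos[1]), varinfos[2:end]...]

println(overwritten[1])
println(merged[1])
```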

codecov · 2024-07-10T18:01:13Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.82%. Comparing base (142dab3) to head (d40d82b).

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2231      +/-   ##
==========================================
+ Coverage   83.09%   85.82%   +2.73%     
==========================================
  Files          24       24              
  Lines        1591     1623      +32     
==========================================
+ Hits         1322     1393      +71     
+ Misses        269      230      -39

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

torfjelde · 2024-07-15T07:29:24Z

CI is failing because of check_model issues on Julia 1.7 that somehow weren't caught before. I have opened a PR to address this in DPPL.

yebai · 2024-07-15T20:11:38Z

@torfjelde to clarify, do we still need tpapp/LogDensityProblemsAD.jl#33 after 7c4368e?

coveralls · 2024-07-15T20:41:15Z

Pull Request Test Coverage Report for Build 9946003565

Details

0 of 42 (0.0%) changed or added relevant lines in 3 files are covered.
1259 unchanged lines in 21 files lost coverage.
Overall coverage increased (+2.7%) to 86.041%

Changes Missing Coverage	Changed/Added Lines	%
src/mcmc/Inference.jl	1	0.0%
src/experimental/gibbs.jl	17	0.0%
src/mcmc/abstractmcmc.jl	24	0.0%

Files with Coverage Reduction	New Missed Lines	%
src/variational/VariationalInference.jl	4	0.0%
src/Turing.jl	9	0.0%
src/mcmc/gibbs_conditional.jl	12	0.0%
src/mcmc/is.jl	16	0.0%
src/stdlib/RandomMeasures.jl	22	0.0%
src/essential/container.jl	27	0.0%
ext/TuringDynamicHMCExt.jl	29	0.0%
src/mcmc/abstractmcmc.jl	34	0.0%
src/mcmc/emcee.jl	47	0.0%
ext/TuringOptimExt.jl	50	0.0%

Totals
Change from base Build 9872415851:	2.7%
Covered Lines:	1393
Relevant Lines:	1619

💛 - Coveralls

sunxd3 · 2024-07-16T07:23:04Z

Chiming in before @torfjelde's reply.

My judgement here is that we don't need the API (i.e., replace_l), as all it really does is

Turing.jl/src/mcmc/abstractmcmc.jl

Line 51 in d40d82b

return LogDensityProblemsAD.ADgradient(adtype, setmodel(parent(f), model))

That being said, being able to dispatch on a particular ADGradientWrapper subtype would be better, because there might be different keyword arguments to ADgradient that we want to be able to control, e.g. https://github.com/tpapp/LogDensityProblemsAD.jl/blob/449e5661bc2667f7bef061e148a6ea5526cbb427/ext/LogDensityProblemsADForwardDiffExt.jl#L98-L101.

As a matter of personal taste, though, I don't think we need to get ahead of ourselves right now.
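The trade-off can be seen in a toy sketch of the "unwrap, swap the model, re-wrap" pattern. All types and functions below (`ToyLogDensity`, `ToyGradientWrapper`, `setmodel`, `rewrap`) are hypothetical stand-ins defined here for illustration; none of them are the LogDensityProblemsAD.jl or Turing.jl implementations.

```julia
# Toy stand-ins (assumptions): not types from LogDensityProblemsAD.jl or Turing.jl.
struct ToyLogDensity
    model::Symbol
end

struct ToyGradientWrapper{A}
    adtype::A
    ℓ::ToyLogDensity
end

# Expose the wrapped log density, mirroring the `parent(f)` call quoted above.
Base.parent(w::ToyGradientWrapper) = w.ℓ

# Hypothetical counterpart of `setmodel`: swap in a new model.
setmodel(ℓ::ToyLogDensity, model::Symbol) = ToyLogDensity(model)

# Rebuild the wrapper from the sampler's adtype. Any configuration carried by
# the old wrapper (here, its own adtype) is discarded.
rewrap(adtype, w::ToyGradientWrapper, model::Symbol) =
    ToyGradientWrapper(adtype, setmodel(parent(w), model))

old = ToyGradientWrapper(:forwarddiff, ToyLogDensity(:full_model))
new = rewrap(:reversediff, old, :conditioned_model)
```

Dispatching on the wrapper subtype instead would allow wrapper-specific settings to be carried over, at the cost of a larger interface.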

torfjelde · 2024-07-16T07:51:55Z

> @torfjelde to clarify, do we still need tpapp/LogDensityProblemsAD.jl#33 after 7c4368e?

We no longer need it, no. But it would make things cleaner

As @sunxd3 said, we've basically just implemented that thing ourselves here

sunxd3 · 2024-07-16T08:00:57Z

This PR looks ready to merge

sunxd3 · 2024-07-16T09:13:23Z

@torfjelde @yebai are we ready to release?

torfjelde · 2024-07-16T09:48:10Z

Yeah this should be good to go now:)

Red-Portal · 2024-07-16T19:28:23Z

Great I'll look into incorporating the slice samplers using the new interface!

torfjelde added 7 commits April 23, 2024 10:03

moved new Gibbs tests all into a single block

0b2279f

initial work on making Gibbs work with externalsampler

dcad548

Merge branch 'master' into torfjelde/gibbs-new-improv

8e2d7be

removed references to Setfield.jl

4a609cb

fixed crucial bug in experimental Gibbs sampler

fc21894

added ground-truth comparison for Gibbs sampler on demo models

9910962

added convenience method for performing two sample KS test

b3a4692

use thinning to avoid OOM issues

3e17efc

torfjelde added 3 commits May 20, 2024 11:33

removed incredibly slow testset that didn't really add much

429fc8f

removed now-redundant testset

f6af20e

use Anderson-Darling test instead of Kolomogorov-Smirnov to better

065eef6

capture tail differences + remove subsampling of chains since it doesn't really matter that when we're using aggressive thinning and test statistics based on comparing order stats

torfjelde mentioned this pull request May 23, 2024

Undeterministic test failure #2234

Closed

torfjelde added 8 commits June 4, 2024 21:32

Merge branch 'master' into torfjelde/gibbs-new-improv

c2d23e5

more work on testing

99f28f9

Merge branch 'master' into torfjelde/gibbs-new-improv

6df1ccc

fixed tests

b6a907e

Merge remote-tracking branch 'origin/torfjelde/gibbs-new-improv' into…

a4e223e

… torfjelde/gibbs-new-improv

make failures of two_sample_ad_tests a bit more informative

e1e7386

make failrues of two_sample_ad_test produce more informative logs

be1ec7f

additional information upon two_sample_ad_test failure

5f36446

rename two_sample_ad_test to two_sample_test and use KS test instead

3be8f8b

torfjelde added 4 commits June 16, 2024 22:22

added minor test for externalsampler usage

dbaf447

also test AdvancedHMC samplers with Gibbs

f44c407

forgot to add updates to src/mcmc/abstractmcmc.jl in previous commits

dd86cfa

Merge remote-tracking branch 'origin/torfjelde/gibbs-new-improv' into…

4160577

… torfjelde/gibbs-new-improv

fixed missing merge in initial step for experimental Gibbs

e1f1a0e

yebai approved these changes Jul 10, 2024

sunxd3 mentioned this pull request Jul 12, 2024

Add some interface functions to support the new Gibbs sampler in Turing TuringLang/AbstractMCMC.jl#144

Closed

devmotion mentioned this pull request Jul 14, 2024

Add interface functions to allow replacing the log density function and replacing AD wrapper type tpapp/LogDensityProblemsAD.jl#33

Closed

torfjelde mentioned this pull request Jul 15, 2024

Fix for check_model on Julia <1.9 TuringLang/DynamicPPL.jl#631

Merged

torfjelde added 2 commits July 15, 2024 08:32

Always reconstruct ADGradientWrapper using the adype available in…

7c4368e

… `ExternalSampler`

Test Gibbs with different adtype in externalsampler to ensure that works

06357c6

yebai added 3 commits July 15, 2024 21:12

Update Project.toml

02f9fad

Update Project.toml

30ab9e0

Merge branch 'master' into torfjelde/gibbs-new-improv

d40d82b

yebai merged commit 29a1342 into master Jul 16, 2024
60 checks passed

yebai deleted the torfjelde/gibbs-new-improv branch July 16, 2024 08:52

torfjelde mentioned this pull request Jul 18, 2024

Missing / incorrect impl of Base.parent for LogDensityProblemsAD.jl interface compintell/Mooncake.jl#197

Closed

mhauru mentioned this pull request Sep 26, 2024

Replace old Gibbs sampler with the experimental one. #2328

Merged
