Add more extensive usage docs (#63)
* Add Quickstart tutorial

* Add dedicated library pages

* Update index page

* Fix misformatted equation

* Expand readme title

* Document initializing HMC

* Document Turing integration

* Fix Turing URL

* Documentation improvements

* Add dependencies and compats

* Increment version number

* Fix compat entry

* Document building Chains

* Fix number of draws
sethaxen authored May 4, 2022
1 parent 9ec7e6b commit 8c2d00e
Showing 13 changed files with 637 additions and 166 deletions.
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
 name = "Pathfinder"
 uuid = "b1d3bc72-d0e7-4279-b92f-7fa5d6d2d454"
 authors = ["Seth Axen <[email protected]> and contributors"]
-version = "0.4.1"
+version = "0.4.2"

 [deps]
 AbstractDifferentiation = "c29ec348-61ec-40c8-8164-b8c60e9d9f3d"
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
-# Pathfinder
+# Pathfinder.jl: Parallel quasi-Newton variational inference

 [![Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://sethaxen.github.io/Pathfinder.jl/stable)
 [![Dev](https://img.shields.io/badge/docs-dev-blue.svg)](https://sethaxen.github.io/Pathfinder.jl/dev)
18 changes: 16 additions & 2 deletions docs/Project.toml
@@ -1,11 +1,25 @@
[deps]
AdvancedHMC = "0bf59076-c3b1-5ca4-86bd-e02cd72cde3d"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DynamicHMC = "bbc10e6e-7c05-544b-b16e-64fede858acb"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
LogDensityProblems = "6fdf6af0-433a-55f7-b3ed-c6c6e0b8df7c"
Pathfinder = "b1d3bc72-d0e7-4279-b92f-7fa5d6d2d454"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
StatsFuns = "4c63d2b9-4356-54db-8cca-17b64c39e42c"
StatsPlots = "f3b207a7-027a-5e70-b257-86293d7955fd"
TransformVariables = "84d833dd-6860-57f9-a1a7-6da5db126cff"
Turing = "fce5fe82-541a-59a6-adf8-730c64b5f9a0"

[compat]
AdvancedHMC = "0.3"
Documenter = "0.27"
DynamicHMC = "3"
ForwardDiff = "0.10"
LogDensityProblems = "0.11"
Pathfinder = "0.4"
Plots = "1"
StatsFuns = "0.9"
StatsPlots = "0.14"
TransformVariables = "0.6"
Turing = "0.21"
11 changes: 9 additions & 2 deletions docs/make.jl
@@ -15,8 +15,15 @@ makedocs(;
     ),
     pages=[
         "Home" => "index.md",
-        "Single-path Pathfinder" => "pathfinder.md",
-        "Multi-path Pathfinder" => "multipathfinder.md",
+        "Library" => [
+            "Public" => "lib/public.md",
+            "Internals" => "lib/internals.md",
+        ],
+        "Examples" => [
+            "Quickstart" => "examples/quickstart.md",
+            "Initializing HMC" => "examples/initializing-hmc.md",
+            "Turing usage" => "examples/turing.md",
+        ]
     ],
 )

239 changes: 239 additions & 0 deletions docs/src/examples/initializing-hmc.md
@@ -0,0 +1,239 @@
# Initializing HMC with Pathfinder

## The MCMC warm-up phase

When using MCMC to draw samples from some target distribution, there is often a lengthy warm-up period with 2 phases:
1. converge to the _typical set_ (the region of the target distribution where the bulk of the probability mass is located)
2. adapt any tunable parameters of the MCMC sampler (optional)

While (1) often happens fairly quickly, (2) usually requires a lengthy exploration of the typical set to iteratively adapt parameters suitable for further exploration.
An example is the widely used windowed adaptation scheme of Hamiltonian Monte Carlo (HMC) in Stan, where a step size and positive definite metric (aka mass matrix) are adapted.[^1]
For posteriors with complex geometry, the adaptation phase can require many evaluations of the gradient of the log density function of the target distribution.

Pathfinder can be used to initialize MCMC, and in particular HMC, in 3 ways:
1. Initialize MCMC from one of Pathfinder's draws (replace phase 1 of the warm-up).
2. Initialize an HMC metric adaptation from the inverse of the covariance of the multivariate normal approximation (replace the first window of phase 2 of the warm-up).
3. Use the inverse of the covariance as the metric without further adaptation (replace phase 2 of the warm-up).

This tutorial demonstrates all three approaches with [DynamicHMC.jl](https://tamaspapp.eu/DynamicHMC.jl/stable/) and [AdvancedHMC.jl](https://github.com/TuringLang/AdvancedHMC.jl).
Both of these packages have standalone implementations of adaptive HMC (aka NUTS) and can be used independently of any probabilistic programming language (PPL).
Both the [Turing](https://turing.ml/stable/) and [Soss](https://github.com/cscherrer/Soss.jl) PPLs have some DynamicHMC integration, while Turing also integrates with AdvancedHMC.

For demonstration purposes, we'll use the following dummy data:

```@example 1
using LinearAlgebra, Pathfinder, Random, StatsFuns, StatsPlots
Random.seed!(91)
x = 0:0.01:1
y = @. sin(10x) + randn() * 0.2 + x
scatter(x, y; xlabel="x", ylabel="y", legend=false, msw=0, ms=2)
```

We'll fit this using a simple polynomial regression model:

```math
\begin{aligned}
\sigma &\sim \text{Half-Normal}(\mu=0, \sigma=1)\\
\alpha, \beta_j &\sim \mathrm{Normal}(\mu=0, \sigma=1)\\
\hat{y}_i &= \alpha + \sum_{j=1}^J x_i^j \beta_j\\
y_i &\sim \mathrm{Normal}(\mu=\hat{y}_i, \sigma=\sigma)
\end{aligned}
```

We just need to implement the log-density function of the resulting posterior.

```@example 1
struct RegressionProblem{X,Z,Y}
    x::X
    J::Int
    z::Z
    y::Y
end
function RegressionProblem(x, J, y)
    z = x .^ (1:J)'  # design matrix whose columns are x.^1, ..., x.^J
    return RegressionProblem(x, J, z, y)
end
function (prob::RegressionProblem)(θ)
    σ = θ.σ
    α = θ.α
    β = θ.β
    z = prob.z
    y = prob.y
    lp = normlogpdf(σ) + logtwo  # half-normal prior on σ
    lp += normlogpdf(α)
    lp += sum(normlogpdf, β)
    y_hat = muladd(z, β, α)  # ŷ = α .+ z * β
    lp += sum(eachindex(y_hat, y)) do i
        return normlogpdf(y_hat[i], σ, y[i])
    end
    return lp
end
J = 3
dim = J + 2
model = RegressionProblem(x, J, y)
ndraws = 1_000;
nothing # hide
```
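As a quick sanity check (illustrative only; `θ_check` is a name we introduce here), we can evaluate the log density at an arbitrary point in the constrained space; a finite result tells us the pieces compose correctly:

```julia
# Sanity check: evaluate the posterior log density at an arbitrary point.
# The model expects a NamedTuple with fields σ, α, and β.
θ_check = (σ=1.0, α=0.0, β=zeros(J))
model(θ_check)  # should be a finite Float64
```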

## DynamicHMC.jl

To use DynamicHMC, we first need to transform our model to an unconstrained space using [TransformVariables.jl](https://tamaspapp.eu/TransformVariables.jl/stable/) and wrap it in a type that implements the [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) interface:

```@example 1
using DynamicHMC, LogDensityProblems, TransformVariables
transform = as((σ=asℝ₊, α=asℝ, β=as(Array, J)))
P = TransformedLogDensity(transform, model)
∇P = ADgradient(:ForwardDiff, P)
```
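To see what this transformation does, here is a minimal sketch (`θ_demo` is just an illustrative name): it maps a length-`dim` unconstrained vector to the named parameters, exponentiating the first coordinate so that `σ > 0`.

```julia
# Illustration: map an unconstrained vector to the constrained parameters.
# The call is qualified because our variable `transform` shadows the function.
θ_demo = TransformVariables.transform(transform, zeros(dim))
# (σ = 1.0, α = 0.0, β = [0.0, 0.0, 0.0])
```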

Pathfinder, on the other hand, expects a log-density function and its gradient:

```@example 1
logp(x) = LogDensityProblems.logdensity(P, x)
∇logp(x) = LogDensityProblems.logdensity_and_gradient(∇P, x)[2]
result_pf = pathfinder(logp, ∇logp; dim)
```

We'll use one of Pathfinder's draws as the initial point for MCMC:

```@example 1
init_params = result_pf.draws[:, 1]
```

The covariance of the fitted multivariate normal approximation serves as our estimate of the inverse metric:

```@example 1
inv_metric = result_pf.fit_distribution.Σ
```

### Initializing from Pathfinder's draws

Here we just need to pass one of the draws as the initial point `q`:

```@example 1
result_dhmc1 = mcmc_with_warmup(
    Random.GLOBAL_RNG,
    ∇P,
    ndraws;
    initialization=(; q=init_params),
    reporter=NoProgressReport(),
)
```
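The draws in the result live in the unconstrained space. Here is a sketch of mapping them back to `(σ, α, β)`, assuming the DynamicHMC version in use stores the positions in a `chain` field (a vector of position vectors; newer releases expose a `posterior_matrix` instead):

```julia
# Sketch: back-transform unconstrained draws to NamedTuples of parameters.
posterior1 = map(q -> TransformVariables.transform(transform, q), result_dhmc1.chain)
posterior1[1].σ  # the first draw of σ
```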

### Initializing metric adaptation from Pathfinder's estimate

To start with Pathfinder's inverse metric estimate, we just need to initialize a `GaussianKineticEnergy` object with it as input:

```@example 1
result_dhmc2 = mcmc_with_warmup(
    Random.GLOBAL_RNG,
    ∇P,
    ndraws;
    initialization=(; q=init_params, κ=GaussianKineticEnergy(inv_metric)),
    warmup_stages=default_warmup_stages(; M=Symmetric),
    reporter=NoProgressReport(),
)
```

We also specified that DynamicHMC should tune a dense `Symmetric` matrix.
However, we could also have requested a `Diagonal` metric, as sketched below.
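A minimal sketch of that alternative (`Diagonal` comes from LinearAlgebra; only the `M` keyword changes from the call above):

```julia
# Sketch: adapt only a diagonal metric, still warm-starting both the initial
# point and the kinetic energy from Pathfinder's estimates.
result_dhmc2_diag = mcmc_with_warmup(
    Random.GLOBAL_RNG,
    ∇P,
    ndraws;
    initialization=(; q=init_params, κ=GaussianKineticEnergy(inv_metric)),
    warmup_stages=default_warmup_stages(; M=Diagonal),
    reporter=NoProgressReport(),
)
```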

### Use Pathfinder's metric estimate for sampling

To turn off metric adaptation entirely and use Pathfinder's estimate, we just set the number and size of the metric adaptation windows to 0.

```@example 1
result_dhmc3 = mcmc_with_warmup(
    Random.GLOBAL_RNG,
    ∇P,
    ndraws;
    initialization=(; q=init_params, κ=GaussianKineticEnergy(inv_metric)),
    warmup_stages=default_warmup_stages(; middle_steps=0, doubling_stages=0),
    reporter=NoProgressReport(),
)
```
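Since only the step size is tuned here, the kinetic energy should come through unchanged; a sketch of checking that, assuming the result exposes the final kinetic energy and step size as `κ` and `ϵ` fields (as DynamicHMC 3 does):

```julia
# Sketch: κ should still wrap Pathfinder's estimate; ϵ is the adapted step size.
result_dhmc3.κ, result_dhmc3.ϵ
```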

## AdvancedHMC.jl

Similar to Pathfinder, AdvancedHMC works with an unconstrained log density function and its gradient.
We'll just use the `logp` we already created above.

```@example 1
using AdvancedHMC, ForwardDiff
nadapts = 500;
nothing # hide
```

### Initializing from Pathfinder's draws

```@example 1
metric = DiagEuclideanMetric(dim)
hamiltonian = Hamiltonian(metric, logp, ForwardDiff)
ϵ = find_good_stepsize(hamiltonian, init_params)
integrator = Leapfrog(ϵ)
proposal = NUTS{MultinomialTS,GeneralisedNoUTurn}(integrator)
adaptor = StepSizeAdaptor(0.8, integrator)
samples_ahmc1, stats_ahmc1 = sample(
    hamiltonian,
    proposal,
    init_params,
    ndraws + nadapts,
    adaptor,
    nadapts;
    drop_warmup=true,
    progress=false,
)
```
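As with DynamicHMC, these draws are unconstrained; a sketch of back-transforming them, assuming `samples_ahmc1` is a vector of position vectors (which is what `sample` returns here):

```julia
# Sketch: convert unconstrained draws back to (σ, α, β) NamedTuples.
posterior_ahmc1 = map(q -> TransformVariables.transform(transform, q), samples_ahmc1)
```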

### Initializing metric adaptation from Pathfinder's estimate

We can't pass Pathfinder's inverse metric estimate to AdvancedHMC directly.
Instead, we first need to either extract its diagonal for a `DiagEuclideanMetric` or make it dense for a `DenseEuclideanMetric` (the diagonal variant is sketched after the dense example below):

```@example 1
metric = DenseEuclideanMetric(Matrix(inv_metric))
hamiltonian = Hamiltonian(metric, logp, ForwardDiff)
ϵ = find_good_stepsize(hamiltonian, init_params)
integrator = Leapfrog(ϵ)
proposal = NUTS{MultinomialTS,GeneralisedNoUTurn}(integrator)
adaptor = StepSizeAdaptor(0.8, integrator)
samples_ahmc2, stats_ahmc2 = sample(
    hamiltonian,
    proposal,
    init_params,
    ndraws + nadapts,
    adaptor,
    nadapts;
    drop_warmup=true,
    progress=false,
)
```
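The diagonal variant mentioned above would look like this sketch (`diag` comes from LinearAlgebra; the rest of the sampling code is unchanged):

```julia
# Sketch: keep only the diagonal of Pathfinder's inverse metric estimate,
# then build the Hamiltonian and sample exactly as above.
metric_diag = DiagEuclideanMetric(diag(Matrix(inv_metric)))
```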

### Use Pathfinder's metric estimate for sampling

To use Pathfinder's metric with no metric adaptation, we need to use Pathfinder's own `RankUpdateEuclideanMetric` type, which just wraps our inverse metric estimate for use with AdvancedHMC:

```@example 1
nadapts = 75  # only the step size is adapted, so fewer warm-up draws suffice
metric = Pathfinder.RankUpdateEuclideanMetric(inv_metric)
hamiltonian = Hamiltonian(metric, logp, ForwardDiff)
ϵ = find_good_stepsize(hamiltonian, init_params)
integrator = Leapfrog(ϵ)
proposal = NUTS{MultinomialTS,GeneralisedNoUTurn}(integrator)
adaptor = StepSizeAdaptor(0.8, integrator)
samples_ahmc3, stats_ahmc3 = sample(
    hamiltonian,
    proposal,
    init_params,
    ndraws + nadapts,
    adaptor,
    nadapts;
    drop_warmup=true,
    progress=false,
)
```
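Finally, a sketch of summarizing the draws on the original parameter space (`mean` comes from the Statistics standard library; `posterior_ahmc3` is just an illustrative name):

```julia
using Statistics: mean

# Sketch: back-transform, then compute e.g. the posterior mean of σ.
posterior_ahmc3 = map(q -> TransformVariables.transform(transform, q), samples_ahmc3)
mean(θ.σ for θ in posterior_ahmc3)
```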

[^1]: https://mc-stan.org/docs/reference-manual/hmc-algorithm-parameters.html

2 comments on commit 8c2d00e

@sethaxen (Member Author)

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/59621

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the GitHub interface, or via:

git tag -a v0.4.2 -m "<description of version>" 8c2d00e2ea3612db814390fbddcb2f915d360156
git push origin v0.4.2
