-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
General improvements and fixes #133
Merged
Merged
Changes from all commits
Commits
Show all changes
52 commits
Select commit
Hold shift + click to select a range
4d48a35
additional deps and test deps
torfjelde 2df4a2b
updated stepping code to use AbstractSwapStrategy and made TemperedSa…
torfjelde 83c5c49
made TemperedSampler concretely typed and fixed soem docs
torfjelde 9e9153c
introduced AbstractSwapStrategy and removed get_params and make_tempe…
torfjelde 26f12f1
updated stepping.jl
torfjelde aa88ae4
added docstring for make_tempered_model
torfjelde 2523a9d
updated adaptation.jl and made structs concrete
torfjelde 7462ebf
updated ladders.jl
torfjelde 3fb13e9
addressed some comments
torfjelde 90e60f2
added tests
torfjelde e30718d
fixed a bug
torfjelde 9e8607a
updated the StateHistoryCallback a bit
torfjelde 3bb79df
made the distinction between chains and processes clearer
torfjelde 90722c9
added tests
torfjelde 50ad1ec
fixed incorrect statement
torfjelde 0c204ac
renamed some fields to be more descriptive and fixed left-over bug
torfjelde e2bbc90
updated tests
torfjelde e7466cc
removed some show from tests
torfjelde 0a59517
began updating docstrings
torfjelde f7a7f31
fixed docstring for TemeperedState
torfjelde 696d8d1
fix exports
torfjelde bbb2fc2
a bunch of renaming
torfjelde 8ac7374
deleted plotting functionality
torfjelde f7c46e7
fixed bug and added should_swap method
torfjelde dcb2a0d
improved tests
torfjelde 287b501
Typo
ParadaCarleton 6239710
Typo
ParadaCarleton 8c1b8ff
implemented adaptation scheme for inverse temperatures using a geomet…
torfjelde f70690f
made some changes to some code that I cannot understand the original …
torfjelde afa2900
added parameter for controlling which type of schedule to use when ad…
torfjelde c0d9e61
make number of steps taken for each adaptor part of their state
torfjelde 4563d33
improvements to parameterization of the adaptation techniques
torfjelde 86fbf0d
updated test env
torfjelde 229059b
keep track of swapping ratios
torfjelde c4583f9
tests are now runnable
torfjelde a1efd11
commented out unused code
torfjelde c221572
Corrected typo
HarrisonWilde 632cca9
Added 1D GMM, sort of works for it
HarrisonWilde f30acfa
Make `StandardSwap` the default strategy when one
HarrisonWilde 0a0b131
Fixing test case for GMM
HarrisonWilde 910dc6d
Implementing burn-in, introduces depedency on StatsBase
HarrisonWilde a4437f9
Fixed error with burnin
HarrisonWilde b9c6a90
cleaning up working_code
HarrisonWilde 79cbbf0
QoL improvements on the code
HarrisonWilde 4336516
Removing `StatsBase` dependency and `discard_initial` override
HarrisonWilde 915b610
Adding back accidentally deleted RandomPermutationSwap stuff
HarrisonWilde 4898193
cleaned up testing a bit
torfjelde 13cb3b2
made the compute_tempered_logdensities a bit more general
torfjelde cb6937e
Implementing `tempered_sample` to allow for no-swap burn-in and easie…
HarrisonWilde db4b842
Tweaking sample call
HarrisonWilde fce26aa
Working burnin
HarrisonWilde 8a6b47e
Merge pull request #137 from TuringLang/harry/improvements_additions
yebai File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,3 @@ | ||
/Manifest.toml | ||
Manifest.toml | ||
*.DS_Store | ||
*.png | ||
deprecated | ||
working_code |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,54 +1,240 @@ | ||
using Distributions: StatsFuns | ||
|
||
struct PolynomialStep | ||
η :: Real | ||
c :: Real | ||
@concrete struct PolynomialStep | ||
η | ||
c | ||
end | ||
function get(step::PolynomialStep, k::Real) | ||
step.c * (k + 1.) ^ (-step.η) | ||
return step.c * (k + 1) ^ (-step.η) | ||
end | ||
|
||
""" | ||
Geometric | ||
|
||
struct AdaptiveState | ||
swap_target_ar :: Real | ||
scale :: Base.RefValue{<:Real} | ||
step :: PolynomialStep | ||
Specifies a geometric schedule for the inverse temperatures. | ||
|
||
See also: [`AdaptiveState`](@ref), [`update_inverse_temperatures`](@ref), and | ||
[`weight`](@ref). | ||
""" | ||
struct Geometric end | ||
|
||
defaultscale(::Geometric, Δ) = eltype(Δ)(0.9) | ||
|
||
""" | ||
InverselyAdditive | ||
|
||
Specifies an additive schedule for the temperatures (not _inverse_ temperatures). | ||
|
||
See also: [`AdaptiveState`](@ref), [`update_inverse_temperatures`](@ref), and | ||
[`weight`](@ref). | ||
""" | ||
struct InverselyAdditive end | ||
|
||
defaultscale(::InverselyAdditive, Δ) = eltype(Δ)(0.9) | ||
|
||
struct AdaptiveState{S,T1<:Real,T2<:Real,P<:PolynomialStep} | ||
schedule_type::S | ||
swap_target_ar::T1 | ||
scale_unconstrained::T2 | ||
step::P | ||
n::Int | ||
end | ||
function AdaptiveState(swap_target::Real, scale::Real, step::PolynomialStep) | ||
AdaptiveState(swap_target, Ref(log(scale)), step) | ||
|
||
function AdaptiveState(swap_target_ar, scale_unconstrained, step) | ||
return AdaptiveState(InverselyAdditive(), swap_target_ar, scale_unconstrained, step) | ||
end | ||
|
||
function AdaptiveState(schedule_type, swap_target_ar, scale_unconstrained, step) | ||
return AdaptiveState(schedule_type, swap_target_ar, scale_unconstrained, step, 1) | ||
end | ||
|
||
""" | ||
weight(ρ::AdaptiveState{<:Geometric}) | ||
|
||
Return the weight/scale to be used in the mapping `β[ℓ] ↦ β[ℓ + 1]`. | ||
|
||
# Notes | ||
In Eq. (13) in [^MIAS12] they use the relation | ||
|
||
β[ℓ + 1] = β[ℓ] * w(ρ) | ||
|
||
with | ||
|
||
w(ρ) = exp(-exp(ρ)) | ||
|
||
because we want `w(ρ) ∈ (0, 1)` while `ρ ∈ ℝ`. As an alternative, we use | ||
`StatsFuns.logistic(ρ)` which is numerically more stable than `exp(-exp(ρ))` and | ||
leads to less extreme values, i.e. 0 or 1. | ||
|
||
This the same approach as mentioned in [^ATCH11]. | ||
|
||
# References | ||
[^MIAS12]: Miasojedow, B., Moulines, E., & Vihola, M., Adaptive Parallel Tempering Algorithm, (2012). | ||
[^ATCH11]: Atchade, Yves F, Roberts, G. O., & Rosenthal, J. S., Towards optimal scaling of metropolis-coupled markov chain monte carlo, Statistics and Computing, 21(4), 555–568 (2011). | ||
""" | ||
weight(ρ::AdaptiveState{<:Geometric}) = geometric_weight_constrain(ρ.scale_unconstrained) | ||
geometric_weight_constrain(x) = StatsFuns.logistic(x) | ||
geometric_weight_unconstrain(y) = inverse(StatsFuns.logistic)(y) | ||
|
||
""" | ||
weight(ρ::AdaptiveState{<:InverselyAdditive}) | ||
|
||
Return the weight/scale to be used in the mapping `β[ℓ] ↦ β[ℓ + 1]`. | ||
""" | ||
weight(ρ::AdaptiveState{<:InverselyAdditive}) = inversely_additive_weight_constrain(ρ.scale_unconstrained) | ||
inversely_additive_weight_constrain(x) = exp(-x) | ||
inversely_additive_weight_unconstrain(y) = -log(y) | ||
|
||
function init_adaptation( | ||
schedule::InverselyAdditive, | ||
Δ::Vector{<:Real}, | ||
swap_target::Real, | ||
scale::Real, | ||
η::Real, | ||
stepsize::Real | ||
) | ||
N_it = length(Δ) | ||
step = PolynomialStep(η, stepsize) | ||
# TODO: One common state or one per temperature? | ||
# ρs = [ | ||
# AdaptiveState(schedule, swap_target, inversely_additive_weight_unconstrain(scale), step) | ||
# for _ in 1:(N_it - 1) | ||
# ] | ||
ρs = AdaptiveState(schedule, swap_target, log(scale), step) | ||
return ρs | ||
end | ||
|
||
function init_adaptation( | ||
schedule::Geometric, | ||
Δ::Vector{<:Real}, | ||
swap_target::Real, | ||
scale::Real, | ||
γ::Real | ||
η::Real, | ||
stepsize::Real | ||
) | ||
Nt = length(Δ) | ||
step = PolynomialStep(γ, Nt - 1) | ||
Ρ = [AdaptiveState(swap_target, scale, step) for _ in 1:(Nt - 1)] | ||
return Ρ | ||
N_it = length(Δ) | ||
step = PolynomialStep(η, stepsize) | ||
# TODO: One common state or one per temperature? | ||
# ρs = [ | ||
# AdaptiveState(schedule, swap_target, geometric_weight_unconstrain(scale), step) | ||
# for _ in 1:(N_it - 1) | ||
# ] | ||
ρs = AdaptiveState(schedule, swap_target, geometric_weight_unconstrain(scale), step) | ||
return ρs | ||
end | ||
|
||
|
||
function rhos_to_ladder(Ρ, Δ) | ||
β′ = Δ[1] | ||
for i in 1:length(Ρ) | ||
β′ += exp(Ρ[i].scale[]) | ||
Δ[i + 1] = Δ[1] / β′ | ||
""" | ||
adapt!!(ρ::AdaptiveState, swap_ar) | ||
|
||
Return updated `ρ` based on swap acceptance ratio `swap_ar`. | ||
|
||
See [`update_inverse_temperatures`](@ref) to see how we compute the resulting | ||
inverse temperatures from the adapted state `ρ`. | ||
""" | ||
function adapt!!(ρ::AdaptiveState, swap_ar) | ||
swap_diff = ρ.swap_target_ar - swap_ar | ||
γ = get(ρ.step, ρ.n) | ||
ρ_new = @set ρ.scale_unconstrained = ρ.scale_unconstrained + γ * swap_diff | ||
return @set ρ_new.n += 1 | ||
end | ||
|
||
""" | ||
adapt!!(ρ::AdaptiveState, Δ, k, swap_ar) | ||
adapt!!(ρ::AbstractVector{<:AdaptiveState}, Δ, k, swap_ar) | ||
|
||
Return adapted state(s) given that we just proposed a swap of the `k`-th | ||
and `(k + 1)`-th temperatures with acceptance ratio `swap_ar`. | ||
""" | ||
adapt!!(ρ::AdaptiveState, Δ, k, swap_ar) = adapt!!(ρ, swap_ar) | ||
function adapt!!(ρs::AbstractVector{<:AdaptiveState}, Δ, k, swap_ar) | ||
ρs[k] = adapt!!(ρs[k], swap_ar) | ||
return ρs | ||
end | ||
|
||
""" | ||
update_inverse_temperatures(ρ::AdaptiveState{<:Geometric}, Δ_current) | ||
update_inverse_temperatures(ρ::AbstractVector{<:AdaptiveState{<:Geometric}}, Δ_current) | ||
|
||
Return updated inverse temperatures computed from adaptation state(s) and `Δ_current`. | ||
|
||
This update is similar to Eq. (13) in [^MIAS12], with the only possible deviation | ||
being how we compute the scaling factor from `ρ`: see [`weight`](@ref) for information. | ||
|
||
If `ρ` is a `AbstractVector`, then it should be of length `length(Δ_current) - 1`, | ||
with `ρ[k]` corresponding to the adaptation state for the `k`-th inverse temperature. | ||
|
||
# References | ||
[^MIAS12]: Miasojedow, B., Moulines, E., & Vihola, M., Adaptive Parallel Tempering Algorithm, (2012). | ||
""" | ||
function update_inverse_temperatures(ρ::AdaptiveState{<:Geometric}, Δ_current) | ||
Δ = similar(Δ_current) | ||
β₀ = Δ_current[1] | ||
Δ[1] = β₀ | ||
|
||
β = inv(β₀) | ||
for ℓ in 1:length(Δ) - 1 | ||
# TODO: Is it worth it to do this on log-scale instead? | ||
β *= weight(ρ) | ||
@inbounds Δ[ℓ + 1] = β | ||
end | ||
return Δ | ||
end | ||
|
||
function update_inverse_temperatures(ρs::AbstractVector{<:AdaptiveState{<:Geometric}}, Δ_current) | ||
Δ = similar(Δ_current) | ||
N = length(Δ) | ||
@assert length(ρs) ≥ N - 1 "number of adaptive states < number of temperatures" | ||
|
||
β₀ = Δ_current[1] | ||
Δ[1] = β₀ | ||
|
||
function adapt_rho(ρ::AdaptiveState, swap_ar, n) | ||
swap_diff = swap_ar - ρ.swap_target_ar | ||
γ = get(ρ.step, n) | ||
return γ * swap_diff | ||
β = β₀ | ||
for ℓ in 1:N - 1 | ||
# TODO: Is it worth it to do this on log-scale instead? | ||
β *= weight(ρs[ℓ]) | ||
@inbounds Δ[ℓ + 1] = β | ||
end | ||
return Δ | ||
end | ||
|
||
""" | ||
update_inverse_temperatures(ρ::AdaptiveState{<:InverselyAdditive}, Δ_current) | ||
update_inverse_temperatures(ρ::AbstractVector{<:AdaptiveState{<:InverselyAdditive}}, Δ_current) | ||
|
||
Return updated inverse temperatures computed from adaptation state(s) and `Δ_current`. | ||
|
||
This update increments the temperature (not _inverse_ temperature) by a positive constant, | ||
which is adapted by `ρ`. | ||
|
||
If `ρ` is a `AbstractVector`, then it should be of length `length(Δ_current) - 1`, | ||
with `ρ[k]` corresponding to the adaptation state for the `k`-th inverse temperature. | ||
""" | ||
function update_inverse_temperatures(ρ::AdaptiveState{<:InverselyAdditive}, Δ_current) | ||
Δ = similar(Δ_current) | ||
β₀ = Δ_current[1] | ||
Δ[1] = β₀ | ||
|
||
function adapt_ladder(Ρ, Δ, k, swap_ar, n) | ||
Ρ[k].scale[] += adapt_rho(Ρ[k], swap_ar, n) | ||
return Ρ, rhos_to_ladder(Ρ, Δ) | ||
end | ||
T = inv(β₀) | ||
for ℓ in 1:length(Δ) - 1 | ||
T += weight(ρ) | ||
@inbounds Δ[ℓ + 1] = inv(T) | ||
end | ||
return Δ | ||
end | ||
|
||
function update_inverse_temperatures(ρs::AbstractVector{<:AdaptiveState{<:InverselyAdditive}}, Δ_current) | ||
Δ = similar(Δ_current) | ||
N = length(Δ) | ||
@assert length(ρs) ≥ N - 1 "number of adaptive states < number of temperatures" | ||
|
||
β₀ = Δ_current[1] | ||
Δ[1] = β₀ | ||
|
||
T = inv(β₀) | ||
for ℓ in 1:N - 1 | ||
T += weight(ρs[ℓ]) | ||
@inbounds Δ[ℓ + 1] = inv(T) | ||
end | ||
return Δ | ||
end |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually seems like the
logistic
approach is also the one taken in 1 too: see Section 2.2.Footnotes
Atchade, Yves F, Roberts, G. O., & Rosenthal, J. S., Towards optimal scaling of metropolis-coupled markov chain monte carlo, Statistics and Computing, 21(4), 555–568 (2011). ↩