Question: making LBA compatible with new interface #95
Comments
If this is from the Statistical Rethinking book, can you please tell me where to find it?
This is actually a different model that Rob and I are using for MCMCBenchmarks.
Thanks. I think the issue is with coding the log-likelihood in a numerically robust way. I will skim through the paper to understand the model and get back to you about this.
Thanks. Something like that might help. Since the model works in Stan fairly well, I was wondering whether adopting Stan's NUTS configuration might work too. In fact, having that as a pre-set configuration (e.g. setting something to Stan_Config) might be helpful (assuming that your settings are still different from Stan).
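A minimal sketch of what such a preset might look like, assuming DynamicHMC 2.x's default_warmup_stages and mcmc_with_warmup keywords (the keyword names here are assumptions and may differ across versions):
```
# Hypothetical "Stan-like" preset: the stage lengths mirror Stan's windowed
# adaptation (75 initial / 25 middle with doubling / 50 terminating steps).
stan_like = default_warmup_stages(init_steps = 75, middle_steps = 25,
                                  doubling_stages = 5, terminating_steps = 50)
results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P, 2000;
                           warmup_stages = stan_like)
```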
Not sure if this helps, but on my system (OSX), with Random.seed!(123) (a lucky setting) I do get a consistent result. I took out the nptochain() call (undefined in this MWE). I did notice 37% of transitions terminating at max_depth 10.
The results are shown below. The first line is the result of DynamicHMC.Diagnostics.EBFMI(results.tree_statistics).
```
1.2595743249482145
Hamiltonian Monte Carlo sample of length 2000
acceptance rate mean: 0.84, 5/25/50/75/95%: 0.39 0.76 0.94 0.99 1.0
termination: divergence => 17%, max_depth => 37%, turning => 46%
depth: 0 => 0%, 1 => 0%, 2 => 1%, 3 => 2%, 4 => 2%, 5 => 7%, 6 => 15%, 7 => 15%, 8 => 12%, 9 => 9%, 10 => 37%
2-element Array{ChainDataFrame,1}
Summary Statistics
Omitted printing of 1 columns
│ Row │ parameters │ mean │ std │ naive_se │ mcse │ ess │
│ │ Symbol │ Float64 │ Float64 │ Float64 │ Float64 │ Any │
├─────┼────────────┼──────────┼───────────┼────────────┼────────────┼─────────┤
│ 1 │ A │ 0.290358 │ 0.161938 │ 0.00362104 │ 0.00519409 │ 1339.27 │
│ 2 │ k │ 0.178341 │ 0.0859475 │ 0.00192184 │ 0.00210283 │ 1122.18 │
│ 3 │ tau │ 0.44426 │ 0.0471096 │ 0.0010534 │ 0.00151385 │ 1005.54 │
│ 4 │ v[1] │ 0.290358 │ 0.161938 │ 0.00362104 │ 0.00519409 │ 1339.27 │
│ 5 │ v[2] │ 0.178341 │ 0.0859475 │ 0.00192184 │ 0.00210283 │ 1122.18 │
│ 6 │ v[3] │ 0.44426 │ 0.0471096 │ 0.0010534 │ 0.00151385 │ 1005.54 │
Quantiles
│ Row │ parameters │ 2.5% │ 25.0% │ 50.0% │ 75.0% │ 97.5% │
│ │ Symbol │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │
├─────┼────────────┼───────────┼──────────┼──────────┼──────────┼──────────┤
│ 1 │ A │ 0.0599009 │ 0.172293 │ 0.261843 │ 0.375416 │ 0.684729 │
│ 2 │ k │ 0.0484054 │ 0.112492 │ 0.167503 │ 0.230146 │ 0.374664 │
│ 3 │ tau │ 0.340562 │ 0.414034 │ 0.450612 │ 0.479162 │ 0.516927 │
│ 4 │ v[1] │ 0.0599009 │ 0.172293 │ 0.261843 │ 0.375416 │ 0.684729 │
│ 5 │ v[2] │ 0.0484054 │ 0.112492 │ 0.167503 │ 0.230146 │ 0.374664 │
│ 6 │ v[3] │ 0.340562 │ 0.414034 │ 0.450612 │ 0.479162 │ 0.516927 │
```
The slightly adapted script is below. Just added Random.seed!(123) at the top, removed the nptochain() call, and added the conversion to an MCMCChains.Chains object at the bottom:
```
using Distributions, Parameters, DynamicHMC
using LogDensityProblems, TransformVariables
using Random, MCMCChains
import Distributions: pdf,logpdf,rand
export LBA,pdf,logpdf,rand
Random.seed!(123)
mutable struct LBA{T1,T2,T3,T4} <: ContinuousUnivariateDistribution
    ν::T1       # drift rates
    A::T2       # maximum starting point
    k::T3       # distance from A to the threshold b = A + k
    τ::T4       # non-decision time
    σ::Float64  # drift rate standard deviation
end
Base.broadcastable(x::LBA) = Ref(x)
LBA(;τ,A,k,ν,σ=1.0) = LBA(ν,A,k,τ,σ)
# Return the index and decision time of the first accumulator to cross the
# threshold; (1, -1.0) signals that no accumulator had a positive decision time.
function selectWinner(dt)
    if any(x->x > 0,dt)
        mi,mv = 0,Inf
        for (i,t) in enumerate(dt)
            if (t > 0) && (t < mv)
                mi = i
                mv = t
            end
        end
    else
        return 1,-1.0
    end
    return mi,mv
end
# Rejection-sample drift rates until at least one is positive.
function sampleDriftRates(ν,σ)
    noPositive = true
    v = similar(ν)
    while noPositive
        v = [rand(Normal(d,σ)) for d in ν]
        noPositive = !any(x->x>0,v)
    end
    return v
end
function rand(d::LBA)
    @unpack τ,A,k,ν,σ = d
    b = A+k
    N = length(ν)
    v = sampleDriftRates(ν,σ)
    a = rand(Uniform(0,A),N)  # starting points
    dt = @. (b-a)/v           # time for each accumulator to reach the threshold
    choice,mn = selectWinner(dt)
    rt = τ .+ mn
    return choice,rt
end
function rand(d::LBA,N::Int)
    choice = fill(0,N)
    rt = fill(0.0,N)
    for i in 1:N
        choice[i],rt[i] = rand(d)
    end
    return (choice=choice,rt=rt)
end
logpdf(d::LBA,choice,rt) = log(pdf(d,choice,rt))
function logpdf(d::LBA,data::T) where {T<:NamedTuple}
    return sum(logpdf.(d,data...))
end
function logpdf(dist::LBA,data::Array{<:Tuple,1})
    LL = 0.0
    for d in data
        LL += logpdf(dist,d...)
    end
    return LL
end
function pdf(d::LBA,c,rt)
    @unpack τ,A,k,ν,σ = d
    b = A+k; den = 1.0
    rt < τ && return 1e-10  # response faster than non-decision time
    for (i,v) in enumerate(ν)
        if c == i
            den *= dens(d,v,rt)
        else
            den *= (1-cummulative(d,v,rt))
        end
    end
    pneg = pnegative(d)
    den = den/(1-pneg)    # condition on at least one positive drift rate
    den = max(den,1e-10)  # floor to keep log(pdf) finite
    return isnan(den) ? 0.0 : den
end
logpdf(d::LBA,data::Tuple) = logpdf(d,data...)
# Defective density of the accumulator with drift rate v finishing at time rt.
function dens(d::LBA,v,rt)
    @unpack τ,A,k,ν,σ = d
    dt = rt-τ; b = A+k
    n1 = (b-A-dt*v)/(dt*σ)
    n2 = (b-dt*v)/(dt*σ)
    dens = (1/A)*(-v*cdf(Normal(0,1),n1) + σ*pdf(Normal(0,1),n1) +
        v*cdf(Normal(0,1),n2) - σ*pdf(Normal(0,1),n2))
    return dens
end
# Defective CDF of the accumulator with drift rate v at time rt.
function cummulative(d::LBA,v,rt)
    @unpack τ,A,k,ν,σ = d
    dt = rt-τ; b = A+k
    n1 = (b-A-dt*v)/(dt*σ)
    n2 = (b-dt*v)/(dt*σ)
    cm = 1 + ((b-A-dt*v)/A)*cdf(Normal(0,1),n1) -
        ((b-dt*v)/A)*cdf(Normal(0,1),n2) + ((dt*σ)/A)*pdf(Normal(0,1),n1) -
        ((dt*σ)/A)*pdf(Normal(0,1),n2)
    return cm
end
# Probability that all drift rates are negative (no accumulator finishes).
function pnegative(d::LBA)
    @unpack ν,σ = d
    p = 1.0
    for v in ν
        p *= cdf(Normal(0,1),-v/σ)
    end
    return p
end
struct LBAProb{T}
    data::T
    N::Int
    Nc::Int
end
# Log posterior: LBA log likelihood plus truncated normal priors.
function (problem::LBAProb)(θ)
    @unpack data = problem
    @unpack v,A,k,tau = θ
    d = LBA(ν=v,A=A,k=k,τ=tau)
    minRT = minimum(x->x[2],data)
    logpdf(d,data) + sum(logpdf.(TruncatedNormal(0,3,0,Inf),v)) +
        logpdf(TruncatedNormal(.8,.4,0,Inf),A) + logpdf(TruncatedNormal(.2,.3,0,Inf),k) +
        logpdf(TruncatedNormal(.4,.1,0,minRT),tau)
end
function sampleDHMC(choice,rt,N,Nc,nsamples)
    data = [(c,r) for (c,r) in zip(choice,rt)]
    return sampleDHMC(data,N,Nc,nsamples)
end
# Define problem with data and inits.
function sampleDHMC(data,N,Nc,nsamples)
    p = LBAProb(data,N,Nc)
    p((v=fill(.5,Nc),A=.8,k=.2,tau=.4))  # sanity check at the inits
    # Properly dimensioned transformation; all parameters are positive.
    trans = as((v=as(Array,asℝ₊,Nc),A=asℝ₊,k=asℝ₊,tau=asℝ₊))
    P = TransformedLogDensity(trans, p)
    # Use ForwardDiff for the gradient.
    ∇P = ADgradient(:ForwardDiff, P)
    # Sample from the posterior.
    n = dimension(trans)
    results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P, nsamples;
        #q = zeros(n),
        #p = ones(n),
        #reporter = NoProgressReport()
        )
    # Undo the transformation to obtain the posterior from the chain.
    posterior = transform.(trans, results.chain)
    #chns = nptochain(results,posterior)
    return (results, posterior)
end
function simulateLBA(;Nd,v=[1.0,1.5,2.0],A=.8,k=.2,tau=.4,kwargs...)
    return (rand(LBA(ν=v,A=A,k=k,τ=tau),Nd)...,N=Nd,Nc=length(v))
end
data = simulateLBA(Nd=10)
(results, posterior) = sampleDHMC(data...,2000)
DynamicHMC.Diagnostics.EBFMI(results.tree_statistics) |> display
DynamicHMC.Diagnostics.summarize_tree_statistics(results.tree_statistics) |> display
parameter_names = ["v[1]", "v[2]", "v[3]", "A", "k", "tau"]
# Collect the posterior draws into a 3d array (draws × parameters × chains).
a3d = Array{Float64, 3}(undef, 2000, 6, 1);
for j in 1:1
    for i in 1:2000
        a3d[i, 1:3, j] = values(posterior[i].v)
        a3d[i, 4, j] = values(posterior[i].A)
        a3d[i, 5, j] = values(posterior[i].k)
        a3d[i, 6, j] = values(posterior[i].tau)
    end
end
chns = MCMCChains.Chains(a3d,
    parameter_names,
    Dict(:parameters => parameter_names))
describe(chns)
```
Rob J Goedman
Thanks, Rob. That is further than I got. Roughly every other run ends in an error. When it does run all the way through, the v parameters are off quite a bit, even when I increase the number of data points to 100.
Yes, I think I managed to run Stan using the same data and get very different results.
I am working on this, but want to understand the model first.
I worked a bit on the code and put it in a repo https://github.com/tpapp/LBA_problem where you can track the changes I made. There seems to be a numerical problem: since you are multiplying densities, they over/underflow rather quickly. I made some changes but could not complete everything since I don't fully understand the model. The first thing I would recommend is that you finish this rewrite and verify that all the formulas are correct. Then you should explore the robustness.
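One way to do that in log space, as a rough sketch (logpdf_stable and the clamping floors are mine, not from the repo; it reuses dens, cummulative, and pnegative from the script above):
```
# Hypothetical sketch: accumulate the likelihood in log space instead of
# multiplying densities, so large data sets do not underflow.
function logpdf_stable(d::LBA, c, rt)
    @unpack τ, A, k, ν, σ = d
    rt < τ && return log(1e-10)
    LL = 0.0
    for (i, v) in enumerate(ν)
        if c == i
            LL += log(max(dens(d, v, rt), 1e-10))  # winner's density
        else
            # losing accumulators have not finished yet
            LL += log1p(-clamp(cummulative(d, v, rt), 0.0, 1 - 1e-10))
        end
    end
    return LL - log1p(-pnegative(d))  # condition on a positive drift rate
end
```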
Thanks. I appreciate your help. I'll get back with you as soon as I know something.
I am happy to keep this issue open, but please let me know if you need further help, or if the problem is solved now.
Sorry about that. Rob and I were looking into the issue and he started making progress, but had to put it aside for a while. I'll close the issue for the time being and will reopen if we reach an impasse.
Hi Tamas- Rob and I hacked away at this problem but still have not come to a full resolution. I looked through your changes to the logpdf and found a minor error, but aside from that, it was good. Although it did not solve the numerical errors, it should reduce under/overflow, particularly in large data sets. Just as a reminder, this is the error message:
After turning off the initial optimization stage, I encountered a domain error which produced the following message:
Here is the up-to-date code to replicate this result. If I understand correctly, given the transformation bounds, negative values should not occur. Thanks for your help!
Thanks for the heads up, I will look at this.
Thanks! By the way, I just realized that the transformation on tau might need a finite upper bound; a possible version is sketched below. In either case, I want to check whether the negative values in the domain error are to be expected.
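A reconstruction of what that bounded transformation could look like, assuming TransformVariables' as(Real, a, b) interval transform (the original snippet was not preserved, so this is illustrative, not the author's code):
```
# Hypothetical: bound tau above by the minimum observed response time,
# matching the truncation of tau's prior in the posterior definition.
minRT = minimum(x -> x[2], data)
trans = as((v = as(Array, asℝ₊, Nc), A = asℝ₊, k = asℝ₊,
            tau = as(Real, 0.0, minRT)))
```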
I looked at your example. First, those negative values are definitely to be expected: sampling works on the unconstrained space, which the transformation maps back to the valid parameter region. Second, here is how I would recommend debugging this, either in the parameter or the unconstrained space:
```
bad_xs = LogDensityProblems.stresstest(LogDensityProblems.logdensity, P; scale = 0.001)
bad_θs = trans.(bad_xs)
LogDensityProblems.logdensity(P, bad_xs[1])
p(bad_θs[1])
```
Again, please do not hesitate to ask for help if you get stuck.
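To see concretely why negative unconstrained values are fine, note that asℝ₊ maps through exp (a quick check, assuming TransformVariables' scalar transform method):
```
using TransformVariables
# A negative unconstrained value maps to a valid positive parameter.
transform(asℝ₊, -2.0)  # exp(-2.0) ≈ 0.135
```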
Hi Tamas-
I was wondering if you might help me update this code for the new interface. While updating my other models was simple, this one seems to be a little trickier because I had to make some changes to the adaptation parameters to make it work. Here is how it used to work.
I tried several potential solutions, including
```
initialization = (q = zeros(n), κ = GaussianKineticEnergy(5, 0.1))
```
but to no avail. Any guidance would be much appreciated. Thanks!
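A sketch of how this might look in the current interface, assuming DynamicHMC 2.x's mcmc_with_warmup accepts an initialization tuple with q and κ (worth verifying against the docs for your version):
```
# Hypothetical: pass the starting point and a scaled diagonal kinetic
# energy through the `initialization` keyword of the new interface.
n = dimension(trans)
results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P, 2000;
    initialization = (q = zeros(n), κ = GaussianKineticEnergy(n, 0.1)))
```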