predict() from a Prior-sampled chain returns no predictions #2012

DominiqueMakowski · 2023-06-17T09:06:52Z

Apparently Turing now has a built-in predict function, which is great. Unfortunately, when I run it on a prior-sampled chains object (to do a prior predictive check), it seems to return no predictions:

using Turing
using DataFrames

# Regression model example
@model function lm(y, x)
    # Priors
    σ² ~ InverseGamma(2, 3)  # Variance
    intercept ~ Normal(0, sqrt(3))  # Intercept
    β ~ TDist(3)

    # Calculate all the mu terms.
    mu = intercept .+ x * β

    # Likelihood
    # Normal takes a standard deviation, so convert the variance
    return y ~ Normal(mu, sqrt(σ²))
end

chain = sample(lm(0, 0), Prior(), 50000)
pred = predict(lm(0, 0), chain)

Summary Statistics
  parameters   mean   std      mcse   ess_bulk   ess_tail      rhat   ess_per_sec 
      Symbol    Any   Any   Float64    Float64    Float64   Float64       Missing


Quantiles
  parameters   2.5%   25.0%   50.0%   75.0%   97.5% 
      Symbol    Any     Any     Any     Any     Any

DataFrame(pred)

50000×2 DataFrame
   Row │ iteration  chain 
       │ Int64      Int64
───────┼──────────────────
     1 │         1      1
     2 │         2      1
     3 │         3      1
     4 │         4      1
     5 │         5      1
     6 │         6      1
   ⋮   │     ⋮        ⋮
 49995 │     49995      1
 49996 │     49996      1
 49997 │     49997      1
 49998 │     49998      1
 49999 │     49999      1
 50000 │     50000      1
        49988 rows omitted

Is there something I am missing?


torfjelde · 2023-06-17T11:51:17Z

In your case you can do

pred = predict(lm(missing, 0), chain)

The current implementation of predict assumes you've effectively un-conditioned the data-variables. The process effectively goes:

1. Run inference conditioned on y to get your posterior (represented by chain).
2. You now de-condition y so that it is treated as a random variable.
3. Using the posterior, i.e. chain, you sample y.

predict does Step 3 for you, but you have to do the de-conditioning yourself (Step 2).
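
The three steps above can be sketched as follows. This is a minimal illustration rather than a definitive recipe: the data values 1.5 and 2.0, the NUTS() sampler, and the draw count are assumptions chosen only to keep the example small, and the model is the one from the original post with the scale passed as a standard deviation.

```julia
using Turing

@model function lm(y, x)
    σ² ~ InverseGamma(2, 3)
    intercept ~ Normal(0, sqrt(3))
    β ~ TDist(3)
    mu = intercept .+ x * β
    y ~ Normal(mu, sqrt(σ²))
end

# Step 1: condition on y (here via the model argument) and sample the posterior.
# The values 1.5 and 2.0 are made-up data for illustration.
chain = sample(lm(1.5, 2.0), NUTS(), 200; progress=false)

# Step 2: de-condition y by passing `missing`, so it becomes a random variable.
model_pred = lm(missing, 2.0)

# Step 3: draw y once per posterior sample in `chain`.
pred = predict(model_pred, chain)
```

With y passed as missing, pred should contain a y column, which is what was absent in the original report. For a prior predictive check, the same pattern applies with Prior() in place of NUTS().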

The reason why we've done it this way is because it's completely valid for you to for example do

model = lm(0,0)
model_with_intercept_fixed = model | (intercept = 10.0, )

In this scenario, the conditioned variables are both y and intercept (variables can be conditioned using either arguments or |). If predict automatically sampled all conditioned variables, it would sample both y and intercept, even though intercept was meant to stay fixed.
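
To make the distinction concrete, here is a small sketch of combining the two forms of conditioning. It assumes the lm model from the original post (scale passed as a standard deviation); the draw count and the variable names model_fixed and model_pred are illustrative only. Because intercept is conditioned with |, it is treated as observed and should not appear among the sampled parameters, while passing missing for y is what lets predict sample it.

```julia
using Turing

@model function lm(y, x)
    σ² ~ InverseGamma(2, 3)
    intercept ~ Normal(0, sqrt(3))
    β ~ TDist(3)
    mu = intercept .+ x * β
    y ~ Normal(mu, sqrt(σ²))
end

# y is conditioned through the argument, intercept through `|`.
model_fixed = lm(0, 0) | (intercept = 10.0,)
chain = sample(model_fixed, Prior(), 100; progress=false)

# Only the unconditioned variables are sampled.
sampled = names(chain, :parameters)

# De-condition y only; intercept stays conditioned at 10.0.
model_pred = lm(missing, 0) | (intercept = 10.0,)
pred = predict(model_pred, chain)
```

Here predict is expected to return only y, since intercept remains conditioned and every other variable is taken from chain.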

It might make sense to add some functionality to allow the user to fix a variable, e.g. the intercept, while also specifying that it's not to be considered an observation. Once we have that, then we could add a predict with the behavior you expect:)

DominiqueMakowski · 2023-06-17T13:40:41Z

Interesting, thanks for the clarification!

DominiqueMakowski closed this as completed Jun 17, 2023

DominiqueMakowski mentioned this issue Jun 17, 2023

How to predict from prior-sampled model (+ how to have choice and rt as a separate input) itsdfish/SequentialSamplingModels.jl#19

Closed

torfjelde mentioned this issue Jun 17, 2023

Add fix and unfix TuringLang/DynamicPPL.jl#488

Merged

kiante-fernandez mentioned this issue Jun 21, 2023

Define MixedMultivariateDistribution type itsdfish/SequentialSamplingModels.jl#27

Closed
