predict() from a Prior-sampled chain returns no predictions #2012

Closed
DominiqueMakowski opened this issue Jun 17, 2023 · 2 comments

@DominiqueMakowski
Contributor

Apparently Turing now has a built-in predict function, which is great. Unfortunately, when I run it on a prior-sampled chains object (to do a prior predictive check), it seems to return no predictions:

using Turing
using DataFrames

# Regression model example
@model function lm(y, x)
    # Priors
    σ² ~ InverseGamma(2, 3)  # Variance
    intercept ~ Normal(0, sqrt(3))  # Intercept
    β ~ TDist(3)

    # Calculate all the mu terms.
    mu = intercept .+ x * β

    # Likelihood (Normal takes a standard deviation, so convert the variance)
    return y ~ Normal(mu, sqrt(σ²))
end

chain = sample(lm(0, 0), Prior(), 50000)
pred = predict(lm(0, 0), chain)
Summary Statistics
  parameters   mean   std      mcse   ess_bulk   ess_tail      rhat   ess_per_sec 
      Symbol    Any   Any   Float64    Float64    Float64   Float64       Missing


Quantiles
  parameters   2.5%   25.0%   50.0%   75.0%   97.5% 
      Symbol    Any     Any     Any     Any     Any
DataFrame(pred)
50000×2 DataFrame
   Row │ iteration  chain 
       │ Int64      Int64
───────┼──────────────────
     1 │         1      1
     2 │         2      1
     3 │         3      1
     4 │         4      1
     5 │         5      1
     6 │         6      1
   ⋮   │     ⋮        ⋮
 49995 │     49995      1
 49996 │     49996      1
 49997 │     49997      1
 49998 │     49998      1
 49999 │     49999      1
 50000 │     50000      1
        49988 rows omitted

Is there something I am missing?

@torfjelde
Member

In your case you can do

pred = predict(lm(missing, 0), chain)

The current implementation of predict assumes you've already un-conditioned the data variables. The process goes:

  1. Run inference conditioned on y to get your posterior (represented by chain).
  2. You now de-condition y so that it is treated as a random variable.
  3. Using the posterior, i.e. chain, you sample y.

predict does Step 3 for you, but you have to do the de-conditioning yourself (Step 2).
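As a rough sketch of those three steps applied to the prior predictive check from the original post: the model is restated so the block is self-contained, with the standard deviation written as sqrt(σ²) because Normal takes a standard deviation. The seed, the smaller sample count, and the name model_pred are illustrative assumptions, not part of the original report.

```julia
using Turing
using Random

Random.seed!(1)  # illustrative seed, only for reproducibility

@model function lm(y, x)
    σ² ~ InverseGamma(2, 3)          # variance
    intercept ~ Normal(0, sqrt(3))
    β ~ TDist(3)
    mu = intercept .+ x * β
    y ~ Normal(mu, sqrt(σ²))         # Normal expects a standard deviation
end

# Step 1: draw parameter samples. Here they come from the prior;
# for a posterior predictive check this would be a posterior sampler instead.
chain = sample(lm(0.0, 0.0), Prior(), 1_000; progress=false)

# Step 2: de-condition y by passing `missing`, so y is treated as a random variable.
model_pred = lm(missing, 0.0)

# Step 3: predict samples y for every draw stored in the chain.
pred = predict(model_pred, chain)

# The returned chain should now carry y as a parameter.
println(names(pred, :parameters))
```

If y is still passed as a concrete value in Step 2, it remains an observation and predict has nothing left to sample, which matches the empty result shown in the original post.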

We've done it this way because it's completely valid to do, for example:

model = lm(0,0)
model_with_intercept_fixed = model | (intercept = 10.0, )

In this scenario, the conditioned variables are both y and intercept (variables can be conditioned using either arguments or |), and so if predict automatically sampled all conditioned variables, it would sample both y and intercept.
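A small sketch of what that conditioning looks like in practice, assuming the same lm model as in the original post (restated here so the block runs on its own). The name chain_fixed and the sample count are illustrative. The point is only that a variable conditioned with | is treated as an observation, so it no longer shows up among the sampled parameters.

```julia
using Turing
using Random

Random.seed!(1)  # illustrative seed, only for reproducibility

@model function lm(y, x)
    σ² ~ InverseGamma(2, 3)          # variance
    intercept ~ Normal(0, sqrt(3))
    β ~ TDist(3)
    mu = intercept .+ x * β
    y ~ Normal(mu, sqrt(σ²))         # Normal expects a standard deviation
end

model = lm(0.0, 0.0)                                    # y conditioned via argument
model_with_intercept_fixed = model | (intercept = 10.0,) # intercept conditioned via |

chain_fixed = sample(model_with_intercept_fixed, Prior(), 100; progress=false)

# intercept is now an observation, so it should be absent from the sampled parameters.
println(names(chain_fixed, :parameters))
```

From the model's point of view, y and intercept are now in the same position, which is why predict cannot tell on its own which of them the user wants sampled.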

It might make sense to add some functionality to allow the user to fix a variable, e.g. the intercept, while also specifying that it's not to be considered an observation. Once we have that, then we could add a predict with the behavior you expect:)

@DominiqueMakowski
Contributor Author

Interesting, thanks for the clarification!
