logged events suggest only one value of a variable may have linear dependence, how to solve this? #610
Dear all, I tried to impute missing values with MICE and got a warning message: the logged events suggest that the value subject_ID59 of the variable subject_ID may have a linear dependence problem. Does this mean that only this particular value has a collinearity problem? If I don't want to exclude the whole variable subject_ID via the predictor matrix, what should I do? Many thanks.
Replies: 1 comment 3 replies
library(mice, warn.conflicts = FALSE)
# show problem
df <- mice::nhanes
df$subjid <- rownames(df)
imp1 <- mice(df, print = FALSE, seed = 1)
#> Warning: Number of logged events: 99
head(imp1$loggedEvents)
#>   it im dep     meth
#> 1  0  0     constant
#> 2  1  1 bmi      pmm
#> 3  1  1 bmi      pmm
#> 4  1  1 hyp      pmm
#> 5  1  1 hyp      pmm
#> 6  1  1 chl      pmm
#> out
#> 1 subjid
#> 2 df set to 1. # observed cases: 16 # predictors: 28
#> 3 age, hyp, subjid10, subjid11, subjid12, subjid13, subjid16, subjid21, subjid3, subjid4, subjid5, subjid6
#> 4 df set to 1. # observed cases: 17 # predictors: 28
#> 5 age, bmi, chl, subjid10, subjid11, subjid12, subjid16, subjid2, subjid21, subjid4, subjid6
#> 6 df set to 1. # observed cases: 15 # predictors: 28
# repair problem
pred <- make.predictorMatrix(df)
pred[, "subjid"] <- 0
imp2 <- mice(df, pred = pred, print = FALSE, seed = 1)
head(imp2$loggedEvents)
#> NULL

Created on 2023-12-27 with reprex v2.0.2
You didn't say how you ran your model, but I suspect your variable subject_ID has class character or factor. In that case mice expands it into dummy variables. My advice is to keep subject_ID out of the imputation model. See the reprex above for the problem and a fix.
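
To see where a name like subject_ID59 comes from, here is a minimal base-R sketch. It is not mice's actual internals: it uses a hypothetical five-row data frame (the names df_toy and x and the IDs s1 to s5 are made up for illustration) and model.matrix() to expand a character ID into dummy columns, the way a regression model would. With one row per subject, the intercept plus the ID dummies already use up every degree of freedom, so any further predictor makes the design matrix rank deficient.

```r
# Hypothetical toy data: one row per subject, ID stored as character
df_toy <- data.frame(
  subject_ID = paste0("s", 1:5),
  x = c(1.2, 3.4, 2.2, 5.1, 4.0),
  stringsAsFactors = FALSE
)

# A character column is treated like a factor and expanded into
# one dummy column per level (minus the reference level)
X <- model.matrix(~ subject_ID + x, data = df_toy)

# Dummy columns are named <variable><level>, e.g. "subject_IDs2"
colnames(X)

# 6 columns (intercept + 4 dummies + x) but only 5 rows,
# so the columns cannot all be linearly independent
n_cols <- ncol(X)
x_rank <- qr(X)$rank
c(columns = n_cols, rank = x_rank)
```

Each dummy column stands for a single level of the ID, which is why the logged events can name one value such as subject_ID59 rather than the whole variable. The dependence still comes from using the ID as a predictor at all, so dropping only that one level would not be expected to fix it.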