-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cluster-robust standard errors for ordinal data (and WLSMV) #254
Comments
This is possible to specify this on my end. It does not raise a warning or error. |
Yes, I received an error in version 0.6-11, but haven't received an error since version 0.6-12 and higher (reprex below). Has this been fixed, or has the error merely been suppressed? In other words, is lavaan successfully accounting for clustering with ordinal data and WLSMV? devtools::install_version(
"lavaan",
version = "0.6-11",
repos = "http://cran.us.r-project.org",
lib = "C:/R/Old")
#> Downloading package from url: http://cran.us.r-project.org/src/contrib/Archive/lavaan/lavaan_0.6-11.tar.gz
library("lavaan", lib.loc = "C:/R/Old")
#> This is lavaan 0.6-11
#> lavaan is FREE software! Please report any bugs.
nSubjects <- 100
datapointsPerSubject <- 10
rho <- .70
mydata <- data.frame(
ID = rep(1:nSubjects, each = datapointsPerSubject),
predictor = NA,
outcome = NA)
simulateOrdinalData <- function(n, rho){
v1 <- sample(c(0, 1, 2), n, replace = TRUE, prob = c(1/3, 1/3, 1/3))
index <- sample(c(0, 1), n, replace = TRUE, prob = c(1 - rho, rho))
index <- index == 1
v2 <- sample(c(0, 1, 2), n, replace = TRUE, prob = c(1/3, 1/3, 1/3))
v2[index] <- v1[index]
simdata <- data.frame(v1, v2)
return(simdata)
}
for(i in 1:nSubjects){
simdata <- simulateOrdinalData(n = datapointsPerSubject, rho = rho)
mydata[which(mydata$ID == i), c("predictor","outcome")] <- simdata[,c("v1","v2")]
}
mydata$predictor[sample(1:nrow(mydata), size = 15)] <- NA
mydata$outcome[sample(1:nrow(mydata), size = 15)] <- NA
clusteredModel <- '
outcome ~ predictor
'
clusteredModel_fit <- sem(
model = clusteredModel,
data = mydata,
ordered = TRUE,
std.lv = TRUE,
cluster = "ID",
estimator = "WLSMV",
missing = "pairwise"
)
#> Warning in lav_options_set(opt): lavaan WARNING: information will be set to
#> "expected" for estimator = "DWLS"
#> Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING: 15 cases were deleted due to missing values in
#> exogenous variable(s), while fixed.x = TRUE.
#> Error in th.start.idx[i]:th.end.idx[i]: NA/NaN argument
summary(clusteredModel_fit, standardized = TRUE)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'summary': object 'clusteredModel_fit' not found
sessionInfo()
#> R version 4.2.0 (2022-04-22 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.utf8
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] lavaan_0.6-11
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.10 urlchecker_1.0.1 compiler_4.2.0 later_1.3.1
#> [5] remotes_2.4.2 profvis_0.3.8 R.methodsS3_1.8.2 prettyunits_1.1.1
#> [9] R.utils_2.12.2 tools_4.2.0 pkgload_1.3.2 digest_0.6.31
#> [13] pkgbuild_1.4.1 evaluate_0.21 memoise_2.0.1 lifecycle_1.0.3
#> [17] R.cache_0.16.0 rlang_1.1.1 reprex_2.0.2 shiny_1.7.4
#> [21] cli_3.6.1 yaml_2.3.7 pbivnorm_0.6.0 xfun_0.39
#> [25] fastmap_1.1.1 stringr_1.5.0 withr_2.5.0 styler_1.10.1
#> [29] knitr_1.43 htmlwidgets_1.6.2 fs_1.6.2 vctrs_0.6.3
#> [33] devtools_2.4.5 stats4_4.2.0 glue_1.6.2 R6_2.5.1
#> [37] processx_3.8.1 rmarkdown_2.22 sessioninfo_1.2.2 purrr_1.0.1
#> [41] callr_3.7.3 magrittr_2.0.3 usethis_2.2.0 promises_1.2.0.1
#> [45] ps_1.7.5 ellipsis_0.3.2 htmltools_0.5.5 mnormt_2.1.1
#> [49] mime_0.12 xtable_1.8-4 httpuv_1.6.11 stringi_1.7.12
#> [53] miniUI_0.1.1.1 cachem_1.0.8 crayon_1.5.2 R.oo_1.25.0
detach("package:lavaan", unload = TRUE)
install.packages("lavaan")
#> Installing package into 'C:/R/Packages'
#> (as 'lib' is unspecified)
#> package 'lavaan' successfully unpacked and MD5 sums checked
#>
#> The downloaded binary packages are in
#> C:\Users\Isaac\AppData\Local\Temp\Rtmp6BqtjZ\downloaded_packages
library("lavaan")
#> Warning: package 'lavaan' was built under R version 4.2.3
#> This is lavaan 0.6-15
#> lavaan is FREE software! Please report any bugs.
mydata <- data.frame(
ID = rep(1:nSubjects, each = datapointsPerSubject),
predictor = NA,
outcome = NA)
for(i in 1:nSubjects){
simdata <- simulateOrdinalData(n = datapointsPerSubject, rho = rho)
mydata[which(mydata$ID == i), c("predictor","outcome")] <- simdata[,c("v1","v2")]
}
mydata$predictor[sample(1:nrow(mydata), size = 15)] <- NA
mydata$outcome[sample(1:nrow(mydata), size = 15)] <- NA
clusteredModel_fit <- sem(
model = clusteredModel,
data = mydata,
ordered = TRUE,
cluster = "ID",
estimator = "WLSMV",
missing = "pairwise"
)
#> Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING: 15 cases were deleted due to missing values in
#> exogenous variable(s), while fixed.x = TRUE.
#> Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING: some cases are empty and will be ignored:
#> 20 79 200 263 333 384 419 508 551 620 734 792 920 925 960
summary(clusteredModel_fit, standardized = TRUE)
#> lavaan 0.6.15 ended normally after 2 iterations
#>
#> Estimator DWLS
#> Optimization method NLMINB
#> Number of model parameters 3
#>
#> Used Total
#> Number of observations 970 1000
#> Number of clusters [ID] 100
#> Number of missing patterns 1
#>
#> Model Test User Model:
#> Standard Scaled
#> Test Statistic 0.000 0.000
#> Degrees of freedom 0 0
#>
#> Test Statistic 0.000 0.000
#> Degrees of freedom 0 0
#>
#> Parameter Estimates:
#>
#> Standard errors Robust.cluster.sem
#> Information Expected
#> Information saturated (h1) model Unstructured
#>
#> Regressions:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> outcome ~
#> predictor 1.411 0.045 31.230 0.000 1.411 0.761
#>
#> Intercepts:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> .outcome 0.000 0.000 0.000
#>
#> Thresholds:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> outcome|t1 0.735 0.070 10.567 0.000 0.735 0.477
#> outcome|t2 2.054 0.065 31.522 0.000 2.054 1.332
#>
#> Variances:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> .outcome 1.000 1.000 0.421
#>
#> Scales y*:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> outcome 1.000 1.000 1.000
sessionInfo()
#> R version 4.2.0 (2022-04-22 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.utf8
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] lavaan_0.6-15
#>
#> loaded via a namespace (and not attached):
#> [1] styler_1.10.1 xfun_0.39 remotes_2.4.2 purrr_1.0.1
#> [5] vctrs_0.6.3 miniUI_0.1.1.1 htmltools_0.5.5 usethis_2.2.0
#> [9] stats4_4.2.0 yaml_2.3.7 rlang_1.1.1 pkgbuild_1.4.1
#> [13] R.oo_1.25.0 later_1.3.1 urlchecker_1.0.1 glue_1.6.2
#> [17] withr_2.5.0 R.utils_2.12.2 sessioninfo_1.2.2 R.cache_0.16.0
#> [21] lifecycle_1.0.3 stringr_1.5.0 R.methodsS3_1.8.2 devtools_2.4.5
#> [25] htmlwidgets_1.6.2 memoise_2.0.1 evaluate_0.21 knitr_1.43
#> [29] callr_3.7.3 fastmap_1.1.1 httpuv_1.6.11 ps_1.7.5
#> [33] parallel_4.2.0 Rcpp_1.0.10 xtable_1.8-4 promises_1.2.0.1
#> [37] cachem_1.0.8 pkgload_1.3.2 mime_0.12 fs_1.6.2
#> [41] mnormt_2.1.1 digest_0.6.31 stringi_1.7.12 processx_3.8.1
#> [45] shiny_1.7.4 quadprog_1.5-8 cli_3.6.1 tools_4.2.0
#> [49] magrittr_2.0.3 profvis_0.3.8 crayon_1.5.2 pbivnorm_0.6.0
#> [53] ellipsis_0.3.2 prettyunits_1.1.1 reprex_2.0.2 rmarkdown_2.22
#> [57] R6_2.5.1 compiler_4.2.0 Created on 2023-06-17 with reprex v2.0.2 |
Yet that's also the question I feel. Terrence replied on the google group that it is not possible, referring to this issue, suggesting that although a result is returned, it is difficult to interpret. Would be nice to hear from the developers. But they surely don't have all the time in the world to answer questions. |
This is NOT supported by lavaan yet. Unfortunately, no warning/error message was printed (in lavaan <= 0.6-15), but the output is simply ignoring the clustering structure. The github version of lavaan (pre 0.6-16) will produce a hard stop. |
It'd be nice to have the option for cluster-robust standard errors for ordinal data (when using WLSMV). Much of my data are ordinal and nested, so this is the primary issue that prevents me from making a wholesale shift from Mplus to lavaan (and I much prefer R/lavaan).
This topic was discussed in the context of a broader issue on fitting multilevel models with ordinal data (#225). However, I thought it'd be worth separating the two issues because (a) cluster-robust SEs for ordinal data is a much more focused issue, and (b) based on the discussion in that thread, it sounds like multilevel models with ordinal data won't be implemented in lavaan for a long time (if ever), so I didn't want cluster-robust SEs for ordinal data to get lost in the shuffle. Let me know if you think this issue is too redundant and that I should close it.
In case it's helpful for implementation, I provide an Mplus code example for cluster-robust standard errors for ordinal data here:
#225 (comment)
FWIW, in that thread, Yves noted (#225 (comment)): "I have not found papers/literature about this yet, but I do have some ideas how this can be done: when we compute the 'cluster robust' standard errors, we must recompute the 'WLS.W' matrix (see lav_muthen1984) taking the clustering into account, in order to compute a cluster-robust 'Gamma' matrix, which is then used in lav_model_nvcov_robust_sem(). At least, that is my current hypothesis."
The text was updated successfully, but these errors were encountered: