fix missing data issue #119

mattansb · 2023-03-26T11:26:06Z

This is a WIP

mattansb · 2023-03-27T06:21:53Z

Hi @rvlenth,
Thanks for your help on the issue of dealing with missing data. I have taken your advice and solved this by allowing the user to pass a data= argument with non-missing data, which I deal with in recover_data.lavaan().

For ref_grid() and emmeans() this seems to work fine.

However, for emtrends() I am getting estimation problems. After some debugging, I found that when emtrends() is called, it calls recover_data() twice, but only passes the user's data= argument the first time. I'm assuming this is not intentional?

Thanks!

# remotes::install_github("mattansb/semTools") # install this PR
library(semTools)
library(emmeans)

data("mtcars")
raw_mtcars <- mtcars
mtcars$hp[1] <- NA

model <- " mpg ~ hp + drat + hp:drat "

fit <- sem(model, mtcars, missing = "fiml.x")



(rg <- ref_grid(fit, 
               lavaan.DV = "mpg",
               data = raw_mtcars))
#> 'emmGrid' object with variables:
#>     hp = 146.69
#>     drat = 3.5966

rg@linfct
#>   (Intercept)       hp     drat  hp:drat
#> 1           1 146.6875 3.596563 527.5708






(emM <- emmeans(fit, ~ drat, var = "hp",
                lavaan.DV = "mpg",
                data = raw_mtcars))
#>  drat emmean    SE  df asymp.LCL asymp.UCL
#>   3.6     20 0.614 Inf      18.8      21.2
#> 
#> Confidence level used: 0.95

emM@linfct
#>      (Intercept)       hp     drat  hp:drat
#> [1,]           1 146.6875 3.596563 527.5708






(emT <- emtrends(fit, ~ drat, var = "hp",
                 lavaan.DV = "mpg",
                 data = raw_mtcars))
#>  drat hp.trend SE df asymp.LCL asymp.UCL
#>   3.6   nonEst NA NA        NA        NA
#> 
#> Confidence level used: 0.95

emT@linfct
#>      (Intercept) hp drat hp:drat
#> [1,]           0 NA    0      NA

rvlenth · 2023-03-27T19:02:01Z

I'm not at all sure that it isn't intentional. The first call to ref_grid() includes a hook to return the data, so that we can set up the difference quotients. The second time we call it, we put another hook that bypasses some stuff already done in the first call. I'll have to look at it to see if we need the data the second time.

rvlenth · 2023-03-27T19:17:52Z

I think it is right the way it is. The setup for the first call to ref_grid() includes this code:

    rgargs = list(object = object, ...)
   . . .
    data = do.call("ref_grid", c(rgargs))

So if data is included in the ... in the emtrends() call, it gets passed to ref_grid(). As you can see, the purpose of that first call is to retrieve the data (via a special hook included in rgargs).

The second call to ref_grid() is

    bigRG = do.call("ref_grid", c(rgargs, data = data))

where data is the data already retrieved in the first call.

So actually I'm confused by your statement that data is passed the first time and not the second, because what we actually have is data being explicitly passed the second time, and only implicitly passed the first time.

rvlenth · 2023-03-27T23:11:17Z

OK, my bad! It turns out that if rgargs is a list and data is a data frame with variables x and y, then c(rgargs, data = data) is a list with additional elements data.x and data.y, rather than a single data element. So I put in an additional line of code to add data itself to the list, and confirmed in debug mode that the right stuff is being passed. You can install from GitHub and see if it works right now.
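
For anyone following along, the splicing behavior described here can be reproduced in base R. This is only a minimal sketch: `rgargs` and `dat` below are made-up stand-ins, not the actual objects inside emtrends(), and the `list()` wrapper is one possible way to keep the data frame intact, not necessarily the line that was added to emmeans.

```r
# Hypothetical stand-ins for the argument list and the recovered data
rgargs <- list(object = "fit", lavaan.DV = "mpg")
dat <- data.frame(x = 1:3, y = 4:6)

# c() treats the data frame as a list, so its columns are spliced in
# one by one and named data.x and data.y
flat <- c(rgargs, data = dat)
names(flat)

# Wrapping the data frame in list() keeps it as a single element
kept <- c(rgargs, list(data = dat))
names(kept)
is.data.frame(kept$data)
```

With `flat`, a later `do.call()` would receive arguments called `data.x` and `data.y` and no `data` argument at all, which is consistent with the user's data seemingly not being passed on the second call.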

mattansb · 2023-03-28T05:59:21Z

Hey, this almost fixes the issue.
I now get a new error:

(emT <- emtrends(fit, ~ drat, var = "hp",
                 lavaan.DV = "mpg",
                 data = raw_mtcars))
#> Error in lav_data_full(data = data, group = group, cluster = cluster,  : 
#>   lavaan ERROR: some (observed) variables specified in the model are not found in the dataset: mpg

This is because the data being passed to recover_data() the second time only has the data for the predictors (from the first pass of recover_data()), but lavaan needs the full multivariate/multivariable dataset.

Can we not simply pass the original data= argument the second time as well?

rvlenth · 2023-03-28T15:01:05Z

You can use the addl.vars argument, e.g., addl.vars = "mpg"

rvlenth · 2023-03-28T23:17:12Z

By the way, in your emmeans support code for lavaan, since you need the response variable, I recommend you retrieve its name from the response part of the model formula, and include that as addl.vars in the call to recover_data(). Then you won't have to rely on the user providing that in their call. See the help page for emmeans::recover_data.
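
As a rough illustration of the first half of that suggestion, the response name can be pulled from a formula with base R alone. This sketch assumes the single-regression model string from the example at the top of the thread, which happens to parse as an ordinary R formula; lavaan syntax in general (multiple equations, `=~`, and so on) does not, so real support code would more likely need to take the name from the fitted object. The variable names `f` and `resp` are illustrative only.

```r
# Model string from the reprex above; a one-line regression parses as a formula
model <- " mpg ~ hp + drat + hp:drat "
f <- as.formula(model)

# The left-hand side of a two-sided formula is its second element
resp <- all.vars(f[[2]])
resp

# resp could then be supplied as addl.vars in the call to recover_data()
# (not run here, since that call depends on the lavaan support code)
```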

patc3 · 2023-06-19T20:45:54Z

Any update on this issue? Has this been added to simsem?

rvlenth · 2023-06-19T21:00:33Z

@patc3 No additional updates from me (emmeans) since my last comment. My repairs to recover_data are in the latest CRAN version and AFAIK, the additional notes (e.g., using addl.vars) will provide access to all the needed variables.

mattansb · 2023-06-20T13:08:59Z

Sorry @patc3 - I haven't found the time to get back to this just yet.

mattansb added 2 commits March 26, 2023 14:24

fix missing data issue

3f9960a

final fix

6c2979e
