Does 2l.pmm bias correlations downwards? #464

skramer1958 · 2022-01-24T16:30:28Z

skramer1958
Jan 24, 2022

I'm trying to impute data on a multilevel data set (students in classrooms in schools). I'm using 2l.pmm to account for school-level correlations, since school was the unit of assignment in my experimental study.

However, I notice that the 2l.pmm method tends to impute variables with much lower correlations than were in the original data set. The problem gets worse the more variables I include in the multiple imputation.
For example, students took three science unit tests over two years, two in sixth grade (units called DL and WW) and one in seventh grade (unit called EH). There is lots of missing data for each test. Here are the original correlations for a subset of the data (control group in cohort 1 in one particular state):
Correlation between test DL and test WW : 0.555
Correlation between DL and EH: 0.534
Correlation between WW and EH: 0.520

Using a large data set and a bunch of other relevant variables many of which had missing data (race, gender, disadvantaged status, school mean on minority and disadvantaged, fourth and fifth grade math and reading scores, classroom averages on most of these variables), I imputed a data set using pmm. The first imputation gives a good idea of the results. Here were the correlations on the first imputation:
Correlation between test DL and test WW : 0.547
Correlation between DL and EH: 0.500
Correlation between WW and EH: 0.542

But when I used 2l.pmm to compute these three and other variables with missing data (school as cluster variable) the correlations were much lower.
Correlation between test DL and test WW : 0.421
Correlation between DL and EH: 0.331
Correlation between WW and EH: 0.250
Now it is reasonable that correlations might be a bit attenuated in the full data set, if students with missing data for example tend to be lower scorers who have lower inter-correlations. But the results above don't seem to meet the "sniff test". The attenuation is too much. Plus, these are results I obtained after dropping a bunch of variables from the imputation model, because the more variables I add the lower the correlations get.

Note that these results are for a "subset of the data", but I get the same thing on all of the subsets. If I don't subset and instead include interactions in the model, the correlations get even lower for 2l.pmm. Meanwhile, the pmm approach continues to reproduce correlations very near those in the original data set.

How can I know if 2l.pmm is worth using, i.e., diagnose whether the imputed data sets are reasonable; and whether the 2-level imputation is worse (more biased) than single-level imputation or perhaps even worse (more biased) than listwise deleting missing data?

hanneoberman · 2022-01-24T17:51:51Z

hanneoberman
Jan 24, 2022
Maintainer

Hi @skramer1958, would it be possible to post a reproducible example for this issue? See e.g. https://reprex.tidyverse.org/articles/articles/learn-reprex.html. FYI, another useful resource is the mice vignette about imputing multilevel data: https://www.gerkovink.com/miceVignettes/Multi_level/Multi_level_data.html.

0 replies

skramer1958 · 2022-01-24T18:07:00Z

skramer1958
Jan 24, 2022
Author

Hanne, This may be a foolish question, but the webinar https://reprex.tidyverse.org/articles/articles/learn-reprex.html is using something called R Studio, which is not the same as R. Do I need to do tutorials on R studio and download and start using that before I can begin making a reproducible example? Steve From: Hanne Oberman ***@***.***> Sent: Monday, January 24, 2022 12:52 PM To: amices/mice ***@***.***> Cc: Steven Kramer ***@***.***>; Mention ***@***.***> Subject: Re: [amices/mice] Does 2l.pmm bias correlations downwards? (Issue #462) Hi @skramer1958<https://github.com/skramer1958>, would it be possible to post a reproducible example for this issue? See e.g. https://reprex.tidyverse.org/articles/articles/learn-reprex.html. FYI, another useful resource is the mice vignette about imputing multilevel data: https://www.gerkovink.com/miceVignettes/Multi_level/Multi_level_data.html. — Reply to this email directly, view it on GitHub<#462 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AXOICQ5WXGKP5NNJ7ZNVUPLUXWGUNANCNFSM5MV3HEOQ>. Triage notifications on the go with GitHub Mobile for iOS<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>. You are receiving this because you were mentioned.Message ID: ***@***.******@***.***>>

0 replies

hanneoberman · 2022-01-25T10:04:14Z

hanneoberman
Jan 25, 2022
Maintainer

Hi Steve,
No, you don't need RStudio for {reprex} to work. RStudio is just a sort of user interface to run R (among other things). On the package's GitHub repo, https://github.com/tidyverse/reprex, it tells you that the reprex() function will render html for you, which you can open in your browser.

0 replies

stefvanbuuren · 2022-02-02T17:00:53Z

stefvanbuuren
Feb 2, 2022
Maintainer

Thanks for your question. Some thoughts:

I moved this to Q&A since it seems more a question about interpretation than about mice functionality;
mice.impute.2l.pmm() is part of the miceadds package, not mice, so for functionality go over there;
A reproducible example would help to articulate your question.

2 replies

skramer1958 Feb 3, 2022
Author

To create a reproducible example I will need to include a data set. I have been reading excel .csv files off my computer to enter the data, but that won't work for a reprex. How do I save the data so you all can look at it?

hanneoberman Feb 4, 2022
Maintainer

You could use this or illustrate the behavior with one of the datasets included in mice (for example tot mice::popmis data, which consists of pupils clustered within classes)?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Does 2l.pmm bias correlations downwards? #464

{{title}}

Replies: 4 comments 2 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Does 2l.pmm bias correlations downwards? #464

skramer1958 Jan 24, 2022

Replies: 4 comments · 2 replies

hanneoberman Jan 24, 2022 Maintainer

skramer1958 Jan 24, 2022 Author

hanneoberman Jan 25, 2022 Maintainer

stefvanbuuren Feb 2, 2022 Maintainer

skramer1958 Feb 3, 2022 Author

hanneoberman Feb 4, 2022 Maintainer

skramer1958
Jan 24, 2022

Replies: 4 comments 2 replies

hanneoberman
Jan 24, 2022
Maintainer

skramer1958
Jan 24, 2022
Author

hanneoberman
Jan 25, 2022
Maintainer

stefvanbuuren
Feb 2, 2022
Maintainer

skramer1958 Feb 3, 2022
Author

hanneoberman Feb 4, 2022
Maintainer