controlling population stratification #2

Open
blueskypie opened this issue Oct 2, 2018 · 1 comment

@blueskypie

Hi,
Thank you for the paper and the step-by-step tutorial! Those are of great help!

May I ask a question regarding your paper? It says the following:

Individuals who are outliers based on the MDS analysis should be removed from further analyses. After the exclusion of these individuals, a new MDS analysis must be conducted, and its main components need to be used as covariates in the association tests in order to correct for any remaining population stratification within the population.

Does this mean the MDS components should be added as covariates regardless of whether they are significantly associated with the trait? If so, could you explain why?

@ttbek

ttbek commented Aug 6, 2019

Yes. They correct for systematic bias independently of the trait, in the sense that whether or not the trait is associated with them has no bearing on the MDS calculation. One could substitute principal component analysis (PCA) for MDS at this stage; as I understand it, the only difference is in the projection to 2D. If the components aren't associated with the trait, including them shouldn't change the outcome much anyway, but they should be included as a matter of principle. Still, statistical significance is probably too high a bar to set here, since there are likely effects that are noteworthy despite not being significant.

Perhaps a better test of whether to add them is the relative effect of each subsequent component, e.g. comparing 10 MDS components vs. 20. At some point there is usually a large fall-off (at least if it behaves like PCA, which I'm pretty sure it does), and it isn't really useful to include components beyond that point.
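To make the "large fall-off" idea concrete, here is a minimal sketch of one way to pick a cutoff: keep the components up to the largest drop between consecutive eigenvalues. This is only a rough heuristic, not something taken from the paper or from PLINK. The function names and the eigenvalues in the example are made up for illustration, and it assumes you already have one eigenvalue per MDS/PCA component.

```python
def proportion_explained(eigenvalues):
    """Return each eigenvalue's share of the total (assumes non-negative values)."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def elbow_component_count(eigenvalues):
    """Number of leading components to keep, cutting at the largest
    drop between consecutive eigenvalues (a simple elbow heuristic)."""
    if len(eigenvalues) < 2:
        return len(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    # drops[i] is the fall-off from component i+1 to component i+2
    drops = [ordered[i] - ordered[i + 1] for i in range(len(ordered) - 1)]
    largest = max(range(len(drops)), key=drops.__getitem__)
    return largest + 1


# Hypothetical eigenvalues: three strong components, then a flat tail.
example = [12.0, 9.5, 8.8, 2.1, 1.9, 1.8, 1.7]
keep = elbow_component_count(example)
shares = proportion_explained(example)
print(f"keep the first {keep} components")
```

With these made-up values the biggest fall-off is between the third and fourth component, so the first three would be kept as covariates. On real data it is worth eyeballing the eigenvalue plot as well rather than relying on a single rule.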

http://www.cog-genomics.org/plink/1.9/strat
