
Running Gnomix for 600K individuals #52

Open
vicbp1 opened this issue Nov 15, 2024 · 0 comments

vicbp1 commented Nov 15, 2024

Dear all,
I am facing memory issues when running Gnomix on 600K individuals (with no rephasing).
We were thinking of two strategies to deal with this, and I would like to know if they make sense.

Splitting chromosomes: We considered splitting each chromosome into two overlapping segments.
For example:
region 1: 1 – 20,000,000
region 2: 15,000,000 – 40,000,000
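The overlapping-segment idea generalizes to a sliding window. Below is a minimal sketch of how such window boundaries could be generated; the function name and the window/overlap sizes are illustrative, not something prescribed by Gnomix:

```python
def overlapping_windows(chrom_len, window, overlap):
    """Yield 1-based (start, end) windows covering a chromosome,
    where consecutive windows share `overlap` base pairs.
    Hypothetical helper for illustration only."""
    step = window - overlap  # how far each new window advances
    start = 1
    while start <= chrom_len:
        end = min(start + window - 1, chrom_len)
        yield (start, end)
        if end == chrom_len:  # last window reached the chromosome end
            break
        start += step

# Example: a 40 Mb chromosome cut into 25 Mb windows with 5 Mb overlap
for start, end in overlapping_windows(40_000_000, 25_000_000, 5_000_000):
    print(start, end)
```

Each resulting region could then be extracted with a tool such as `bcftools view -r` before being passed to Gnomix, and predictions in the overlap reconciled afterwards.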

Subsetting the dataset: splitting the samples into batches of 100K individuals each.
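For the second strategy, a simple sketch of splitting a sample list into fixed-size batches (the helper name and batch size are illustrative, not part of Gnomix):

```python
def sample_batches(samples, batch_size):
    """Split a list of sample IDs into consecutive batches of at most
    `batch_size` samples. Hypothetical helper for illustration only."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]

# 600K individuals in batches of 100K -> 6 batches
batches = sample_batches([f"sample_{i}" for i in range(600_000)], 100_000)
print(len(batches))  # 6
```

Each batch file could then be used with a subsetting tool such as `bcftools view -S`, running the same trained model on every batch.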

Which of these two strategies would you recommend?
Would subsetting the data but running the same trained model on all batches give the best results?
Would the first strategy have memory requirements similar to those of running the entire dataset?

Thank you for your time!

Vic
