Performance Optimization with expected wait times #1953

dmcgoldrick · 2021-02-04T17:50:40Z


I am not sure how long each step should take when uploading to an on-prem server. After upgrading to the latest pipeline runner container, I am getting a lot of output on stdout, and I am concerned about how long it should take to upload a trio VCF. What are the options for optimizing performance?

Describe the solution you'd like
I would like the documentation to include some information on the expected upload time for a typical WGS or WES sequencing project, along with suggestions for what to do if this becomes a long-running job (e.g. overnight). Should Hail conversion and uploads take overnight? I would also like a way to break up the long-running VCF-to-Hail conversion for on-prem jobs, and ways to control cost.

hanars · 2021-02-04T20:39:21Z

Maintainer

How long loading takes varies wildly depending on what hardware you are running on, how many cores you have available, and how big your data is. Our largest loading project, of almost 1000 genomes, typically takes us 3 days to load on pretty robust hardware. Our smaller exome projects usually take a couple of hours. @bw2 might have some more detailed ideas about how long things run with some of the recommended configurations in the local install readme, but there's so much variation that it's really hard to say anything with certainty.

There is a documented way to split the loading into 2 steps, if that is what you meant by breaking up long-running jobs. See the section immediately above this link: https://github.com/broadinstitute/seqr/blob/master/deploy/LOCAL_INSTALL.md#adding-a-loaded-dataset-to-a-seqr-project
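
Since load times depend so heavily on hardware and data size, one option is to time each loading step on your own server and work from those numbers. The following is only a minimal sketch of that idea, not part of seqr: `time_steps` and the step names are hypothetical, and the commands are placeholders to be swapped for the real commands from the local install readme.

```python
import subprocess
import sys
import time


def time_steps(steps):
    """Run (name, argv) pairs in order and return {name: elapsed_seconds}."""
    timings = {}
    for name, argv in steps:
        start = time.perf_counter()
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a failed step stops the run instead of being timed as a success.
        subprocess.run(argv, check=True)
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Placeholder commands (assumption): replace each argv list with the
    # actual command for that loading step in your installation.
    steps = [
        ("step_1", [sys.executable, "-c", "print('placeholder for step 1')"]),
        ("step_2", [sys.executable, "-c", "print('placeholder for step 2')"]),
    ]
    for name, seconds in time_steps(steps).items():
        print(f"{name}: {seconds / 60:.1f} min")
```

Timing the steps separately on a small test dataset first should show which step dominates on your hardware before you commit to an overnight run.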
