
Run the pipeline on a small / medium / large genome #11

Open
davidonlaptop opened this issue Feb 21, 2015 · 7 comments
@davidonlaptop (Member)

With updated versions of the BDGenomics pipeline (SNAP, ADAM, Avocado), use a small / medium / large genome to validate the new images and the orchestration scripts. We'll compare these results against those from S. Bonami and the BDGenomics papers.

References

TODO: Find the datasets used in the SNAP / ADAM / Avocado papers.

@flangelier (Contributor)

Avocado is still failing at the moment...

@davidonlaptop (Member, Author)

What's the error message?

@davidonlaptop (Member, Author)

(Let's keep comments here, for future tracking.)

Forwarded from François Langelier (2:35 PM, to David):
2015-03-22 17:29:25 ERROR Executor:96 - Exception in task 0.0 in stage 1.0 (TID 16)
java.lang.OutOfMemoryError: GC overhead limit exceeded
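
This error is the JVM reporting that it spent nearly all of its time in garbage collection while reclaiming almost no heap, which usually means the heap is too small for the working set. A minimal sketch of how to confirm that, assuming the failure is in the executor JVMs; spark.executor.extraJavaOptions is standard Spark, but the class name, jar, and arguments are placeholders:

# GC logging on the executors: a heap pinned near its maximum with
# back-to-back full collections points to heap sizing rather than a leak.
spark-submit \
  --conf spark.executor.extraJavaOptions="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  --class <main-class> <application-jar> <args>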

@davidonlaptop (Member, Author)

Potential solution:
https://plumbr.eu/outofmemoryerror/gc-overhead-limit-exceeded

@codingtony

If you can run it with jvisualvm attached, you will have a better idea of what is using the memory and what the heap usage looks like in general.

What JVM parameters are you currently using?

-tony
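
jvisualvm attaches to local JVMs by PID, but the executors here typically run on other nodes or containers, so the usual way in is a JMX connection. A minimal sketch, assuming one executor per node (a fixed port would clash otherwise) and a trusted network, since authentication and SSL are disabled; class name, jar, and arguments are placeholders:

# Expose an unauthenticated JMX port on each executor JVM, then connect
# jvisualvm to <executor-host>:9999 via File > Add JMX Connection.
spark-submit \
  --conf spark.executor.extraJavaOptions="-Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false" \
  --class <main-class> <application-jar> <args>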


@davidonlaptop (Member, Author)

Currently, it's running with the default settings. I think there is not enough memory allocated to the Spark workers.

François is writing up the step-by-step instructions to reproduce the problem.
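
For a standalone Spark cluster, worker memory is set in conf/spark-env.sh and per-application executor/driver memory on the spark-submit command line. A sketch only; the sizes are examples and must fit within the RAM each container actually has:

# conf/spark-env.sh on each worker node: total memory the worker
# may hand out to executors.
export SPARK_WORKER_MEMORY=4g

# Then, per application at submit time:
spark-submit --executor-memory 2g --driver-memory 2g --class <main-class> <application-jar> <args>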

@sebastienbonami (Member)

@flangelier I think I encountered the same problem as you with Avocado! Change the values in the two lines below in the file bin/avocado-submit. 4g is probably more memory than you have available, so the JVM can't start.

--conf spark.executor.memory=${AVOCADO_EXECUTOR_MEMORY:-4g} \
--driver-memory ${AVOCADO_DRIVER_MEMORY:-4g} \

See: https://github.com/bigdatagenomics/avocado/blob/master/bin/avocado-submit#L56-58
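
Since the script uses the ${VAR:-default} shell expansion, 4g is only a fallback: both sizes can be overridden per run through environment variables, without editing the script. A usage sketch (the 2g values and the trailing arguments are placeholders):

# Use 2g for this invocation only; the script falls back to 4g when unset.
AVOCADO_EXECUTOR_MEMORY=2g AVOCADO_DRIVER_MEMORY=2g bin/avocado-submit <args>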
