
Run the pipeline on a small / medium / large genome #11

Open
davidonlaptop opened this issue Feb 21, 2015 · 7 comments
@davidonlaptop (Member)

With updated versions of the BDGenomics pipeline (SNAP, ADAM, Avocado), use a small / medium / large genome to validate the new images and the orchestration scripts. We'll compare these results against those from S. Bonami and the BDGenomics papers.

References

TODO: Find the datasets used in the SNAP / ADAM / Avocado papers.

@flangelier (Contributor)

Avocado is still failing at the moment...

@davidonlaptop (Member, Author)

What's the error message?

@davidonlaptop (Member, Author)

(Let's keep comments here, for future tracking.)

Forwarded from François Langelier (2:35 PM, to David):
2015-03-22 17:29:25 ERROR Executor:96 - Exception in task 0.0 in stage 1.0 (TID 16)
java.lang.OutOfMemoryError: GC overhead limit exceeded
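
This error is the JVM reporting that it spent nearly all of its time in garbage collection while reclaiming almost no heap, which usually means the heap is too small for the working set. A minimal sketch of how to confirm that, assuming the failure is in the executor JVMs; spark.executor.extraJavaOptions is standard Spark, but the class name, jar, and arguments are placeholders:

# GC logging on the executors: a heap pinned near its maximum with
# back-to-back full collections points to heap sizing rather than a leak.
spark-submit \
  --conf spark.executor.extraJavaOptions="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  --class <main-class> <application-jar> <args>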

@davidonlaptop (Member, Author)

Potential solution:
https://plumbr.eu/outofmemoryerror/gc-overhead-limit-exceeded

@codingtony

If you can run it with jvisualvm attached, you will have a better idea of what is using the memory and what the heap usage looks like in general.

What JVM parameters are you currently using?

-tony
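
jvisualvm attaches to local JVMs by PID, but the executors here typically run on other nodes or containers, so the usual way in is a JMX connection. A minimal sketch, assuming one executor per node (a fixed port would clash otherwise) and a trusted network, since authentication and SSL are disabled; class name, jar, and arguments are placeholders:

# Expose an unauthenticated JMX port on each executor JVM, then connect
# jvisualvm to <executor-host>:9999 via File > Add JMX Connection.
spark-submit \
  --conf spark.executor.extraJavaOptions="-Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false" \
  --class <main-class> <application-jar> <args>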


@davidonlaptop (Member, Author)

Currently, it's running with the default settings. I think there is not enough memory allocated to the Spark workers.

François is writing up the step-by-step instructions to reproduce the problem.
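
For a standalone Spark cluster, worker memory is set in conf/spark-env.sh and per-application executor/driver memory on the spark-submit command line. A sketch only; the sizes are examples and must fit within the RAM each container actually has:

# conf/spark-env.sh on each worker node: total memory the worker
# may hand out to executors.
export SPARK_WORKER_MEMORY=4g

# Then, per application at submit time:
spark-submit --executor-memory 2g --driver-memory 2g --class <main-class> <application-jar> <args>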

@sebastienbonami (Member)

@flangelier I think I encountered the same problem as you with Avocado! Change the values in the two lines below in the file bin/avocado-submit. 4g is probably more memory than you have available, so the JVM can't start.

--conf spark.executor.memory=${AVOCADO_EXECUTOR_MEMORY:-4g} \
--driver-memory ${AVOCADO_DRIVER_MEMORY:-4g} \

See: https://github.com/bigdatagenomics/avocado/blob/master/bin/avocado-submit#L56-58
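
Since the script uses the ${VAR:-default} shell expansion, 4g is only a fallback: both sizes can be overridden per run through environment variables, without editing the script. A usage sketch (the 2g values and the trailing arguments are placeholders):

# Use 2g for this invocation only; the script falls back to 4g when unset.
AVOCADO_EXECUTOR_MEMORY=2g AVOCADO_DRIVER_MEMORY=2g bin/avocado-submit <args>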
