How to continue a crashed run #8

papaig · 2019-02-21T08:09:08Z

Dear developers,
I'm running Thunder on a CPU cluster. Despite using 8 nodes with 16 cores each, my run didn't finish in 14 days. Unfortunately, 14 days is the time limit for a run on this cluster, so the run was cancelled. I would like to continue it and I wonder how I should do it. If I put the last .thu file in the json as ".thu File Storing Paths and CTFs of Images", Thunder seems to restart the run from the beginning. I chose a new folder for the output so as not to overwrite the files from the previous, crashed run.
I would be grateful if you could tell me how to continue the run from where it crashed.
Thank you,
Gabor

thuem · 2019-03-23T07:36:45Z

I am so sorry for the delay; the e-mail system blocked the notification.

There are two situations.

First situation: the run ended during global search. In this case, just put the last .thu file as ".thu File Storing Paths and CTFs of Images", and change the initial model and initial resolution to the reference and resolution you reached in the last round of the previous run, respectively. That should work.

Second situation: the run ended after global search. In this case, in addition to the changes for the first situation, the "Global Search" option should be switched from true to false.
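The two situations can be sketched as a small Python helper that patches a copy of the parameter JSON. This is only an illustration under assumptions: the two key names quoted above are taken from this thread, while the key names for the initial model and initial resolution, the nesting of the JSON, and all file paths are placeholders that must be replaced with whatever your own parameter file uses.

```python
import copy
import json

# Key names quoted in this thread.
THU_KEY = ".thu File Storing Paths and CTFs of Images"
GLOBAL_SEARCH_KEY = "Global Search"


def set_key(node, key, value):
    """Set `key` wherever it occurs in nested dicts/lists; return how many times."""
    count = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                count += 1
            else:
                count += set_key(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            count += set_key(item, key, value)
    return count


def make_resume_config(config, last_thu, last_model, last_resolution,
                       ended_after_global_search, model_key, resolution_key):
    """Return a patched copy of `config` for continuing a stopped run.

    `model_key` and `resolution_key` are supplied by the caller because their
    exact spelling in the parameter file is not given in this thread.
    """
    new_config = copy.deepcopy(config)
    updates = {
        THU_KEY: last_thu,
        model_key: last_model,
        resolution_key: last_resolution,
    }
    if ended_after_global_search:
        # Second situation: also turn the global search off.
        updates[GLOBAL_SEARCH_KEY] = False
    for key, value in updates.items():
        if set_key(new_config, key, value) == 0:
            raise KeyError(f"key not found in config: {key!r}")
    return new_config


if __name__ == "__main__":
    # Hypothetical layout and values, for demonstration only.
    old = {
        "Basic": {
            "Initial Model": "initial.mrc",
            "Initial Resolution": 40,
            GLOBAL_SEARCH_KEY: True,
            THU_KEY: "start.thu",
        }
    }
    new = make_resume_config(
        old, "previous_run/last.thu", "previous_run/last_reference.mrc", 8.5,
        ended_after_global_search=True,
        model_key="Initial Model", resolution_key="Initial Resolution",
    )
    print(json.dumps(new, indent=2))
```

The helper raises a `KeyError` if a key is missing rather than silently adding it, so a misspelled key name is caught before the job is submitted.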

Moreover, THUNDER uses cluster resources differently from RELION. If you are using 8 nodes with 16 cores each, it is better to run 1 process with 16 threads on each node. Also, some job management systems, such as LSF, restrict the number of physical cores assigned to each process, so it is important to check the configuration of the job management system and make sure that each node runs one THUNDER process and that this process can use all of the node's CPU cores through threading. If this does not speed up your job, please contact us and send us your job information, such as the number of images, box size and symmetry. We will compare it with our benchmark.
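One way to check the point about the job management system is to run a tiny Python script inside the same job allocation and see how many cores a single process is allowed to use. This is a generic sketch, not part of THUNDER: `recommended_layout` and `visible_cores` are hypothetical helper names, and `os.sched_getaffinity` is only available on Linux.

```python
import os


def recommended_layout(nodes, cores_per_node):
    """One process per node, one thread per core, as suggested above."""
    return {"processes": nodes, "threads_per_process": cores_per_node}


def visible_cores():
    """Return (cores this process may run on, logical cores on the node)."""
    total = os.cpu_count() or 1
    if hasattr(os, "sched_getaffinity"):
        # Linux: the set of CPUs the scheduler lets this process use.
        allowed = len(os.sched_getaffinity(0))
    else:
        # Fallback where affinity cannot be queried.
        allowed = total
    return allowed, total


if __name__ == "__main__":
    layout = recommended_layout(nodes=8, cores_per_node=16)
    print("suggested layout:", layout)
    allowed, total = visible_cores()
    print(f"this process may use {allowed} of {total} logical cores")
    if allowed < layout["threads_per_process"]:
        print("warning: fewer cores available than threads requested")
```

If the script reports fewer usable cores than the number of threads requested per process, the job submission settings are likely limiting each process to a subset of the node.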

Best regards.
