I have suddenly started to experience unexpected crashes on betzy. Jobs that land on nodes b1373-b1375 or b1382 repeatedly fail with the type of traceback reproduced further down.
When I excluded these nodes from the submission, the model ran. I have notified Sigma2 about this.
For noresm2_5_alpha07, the easiest way to exclude nodes from a job is to edit your $SRCROOT/ccsm_config/machines/betzy/env_batch.xml and add the --exclude line marked below:
<directives>
  <directive> --ntasks={{ total_tasks }}</directive>
  <directive> --export=ALL</directive>
  <directive> --switches=1</directive>
  <directive> --exclude=b1373,b1374,b1375,b1382</directive> <!-- add this line -->
</directives>
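If you would rather script the change than edit the file by hand, something like the following might work. It is only a sketch: it operates on the fragment quoted above rather than on the real env_batch.xml (which contains more than this block), and the helper name add_exclude is made up for illustration. It appends the --exclude directive only if no exclude directive is present yet.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the post above; the real env_batch.xml contains
# more than this block, so treat this as an illustration only.
FRAGMENT = """<directives>
  <directive> --ntasks={{ total_tasks }}</directive>
  <directive> --export=ALL</directive>
  <directive> --switches=1</directive>
</directives>"""


def add_exclude(directives, nodes):
    """Append a Slurm --exclude directive unless one is already present.

    Hypothetical helper: returns True if a directive was added.
    """
    for d in directives.findall("directive"):
        if (d.text or "").strip().startswith("--exclude="):
            return False  # leave an existing exclude list untouched
    new = ET.SubElement(directives, "directive")
    new.text = " --exclude=" + ",".join(nodes)
    return True


root = ET.fromstring(FRAGMENT)
changed = add_exclude(root, ["b1373", "b1374", "b1375", "b1382"])
result = ET.tostring(root, encoding="unicode")
print(result)
```

Running the helper a second time leaves the block unchanged, so it should be safe to call repeatedly. To apply it to the actual file you would have to locate the right directives element first, and I have not tried that.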
Traceback excerpt from the failing jobs:
208: [b1374:545909:0:545909] ud_ep.c:278 Fatal: UD endpoint 0xaff0a40 to : unhandled timeout error
208: ==== backtrace (tid: 545909) ====
208: 0 0x000000000005e810 uct_ud_ep_deferred_timeout_handler() .....