Commit

Merge pull request #43 from williamFalcon/slurm_timer_fix
Fix for early slurm termination.
williamFalcon authored Nov 30, 2018
2 parents 5b49351 + 89f2c60 commit 6316ca1
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion test_tube/hpc.py
@@ -247,11 +247,14 @@ def __run_experiment(self, train_function):
 stop_in_n_seconds = max(stop_in_n_seconds, 10) # make sure we don't go below the 5 mins

 # schedule timer to interrupt training
-threading.Timer(stop_in_n_seconds, self.call_save).start()
+timer_instance = threading.Timer(stop_in_n_seconds, self.call_save)
+timer_instance.start()

 # run training
 train_function(self.hyperparam_optimizer, self, {})

+timer_instance.cancel()
+
 except Exception as e:
 print('Caught exception in worker thread', e)
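
The change above keeps a handle to the timer so it can be cancelled once training returns; otherwise the timer could still fire `call_save` after the training function had already finished. A minimal standalone sketch of that pattern follows. It is not the `test_tube` code: `run_with_deadline`, `on_timeout`, and the sleep-based stand-in for training are hypothetical names chosen for illustration, and the sketch cancels in a `finally` block (so the timer is also cancelled when training raises), which differs from the commit, where `cancel()` runs only after a normal return.

```python
import threading
import time


def run_with_deadline(train_function, stop_in_n_seconds, on_timeout):
    """Run train_function; on_timeout fires only if training is still running at the deadline."""
    # Keep a reference to the timer so it can be cancelled later.
    timer = threading.Timer(stop_in_n_seconds, on_timeout)
    timer.start()
    try:
        train_function()
    finally:
        # Without cancel(), the timer thread would stay alive and could
        # call on_timeout after training had already finished.
        timer.cancel()
        timer.join()


# Training finishes before the deadline: the callback must never run.
fired_short = []
run_with_deadline(lambda: time.sleep(0.05), 0.5, lambda: fired_short.append("save"))
time.sleep(0.6)  # wait past the original deadline to show nothing fires late

# Training outlasts the deadline: the callback runs once while training is in progress.
fired_long = []
run_with_deadline(lambda: time.sleep(0.5), 0.1, lambda: fired_long.append("save"))
```

Calling `cancel()` on a timer that has already fired is harmless, so the same cleanup works for both cases.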

