Bugfix in waitForFile where fn was full file path #274
95f27ca
The bug corrected by 95f27ca was identified while waiting for a worker to be provisioned after submitting jobs on a Slurm cluster. It occurs when waitForFile is called from readLog with fs.latency > 0. While waiting for the log file to appear, fn is a full file path, so it isn't found in the call to list.files without the full.names argument.
This solution handles fn as either a full file path or a file name, and provides the full file path in the timeout error message.
548129f
The first bugfix revealed that batchtools::getStatus was returning an incorrect 'expired' status for jobs during machine provisioning, which triggered future.batchtools::await to handle the 'expired' job and terminate early, described here. This was addressed, somewhat inelegantly, in 548129f. It would perhaps be better to update the log.file directly in the registry once the value is known, rather than adding it on the fly, but I don't know the full implications of doing this. I'm also not convinced that the specified timed.out is right: it won't terminate at the same time as the readLog -> waitForFile loop, and it may not take queuing into account, but it seems to work well in my environment.
The new 'provisioning' status is not strictly necessary, as preventing the 'expired' status would have the same effect on future.batchtools::await, but it feels more explicit.
I don't know about compatibility with other environments, but it seemed solid when tested across 500 jobs on a Slurm cluster starting at 0 nodes and limited to 20 nodes with a queue size of 50. Previously the same series of jobs simply wouldn't run unless nodes were persistent and pre-provisioned.
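For reviewers, here is a minimal sketch of the path handling behind the first fix (95f27ca). It is not the batchtools implementation: wait_for_file_sketch and its timeout and sleep arguments are hypothetical names chosen for illustration. It only shows why a full path never matches the output of list.files without full.names, and how splitting fn with dirname and basename covers both cases.

```r
# Hypothetical helper for illustration only; not the code in batchtools.
# Waits until `fn` appears in the listing of its parent directory.
wait_for_file_sketch <- function(fn, timeout = 5, sleep = 0.1) {
  # Split fn so both forms are handled: dirname("out.log") is ".",
  # so a bare file name is looked up in the working directory.
  path <- dirname(fn)
  name <- basename(fn)
  deadline <- Sys.time() + timeout
  repeat {
    # list.files() returns bare file names unless full.names = TRUE,
    # so compare against basename(fn) rather than fn itself.
    if (name %in% list.files(path, all.files = TRUE)) {
      return(invisible(TRUE))
    }
    if (Sys.time() > deadline) {
      # Report the full path so the message is unambiguous.
      stop(sprintf("Timeout while waiting for file '%s'",
                   file.path(path, name)))
    }
    Sys.sleep(sleep)
  }
}
```

With tmp <- tempfile(fileext = ".log") and file.create(tmp), the call wait_for_file_sketch(tmp) returns immediately, whereas tmp %in% list.files(dirname(tmp)) is FALSE, which is the mismatch described above.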