Add depend_failed keyword argument #51

Open
MikeDacre opened this issue Nov 18, 2016 · 4 comments

@MikeDacre
Owner

No description provided.

@MikeDacre MikeDacre added this to the 0.6.2 milestone Nov 18, 2016
@MikeDacre MikeDacre self-assigned this Nov 18, 2016
@MikeDacre MikeDacre modified the milestones: 0.7.2, 0.6.2 Aug 3, 2017
@gportella

Hi Mike,

I'm really enjoying your tool!
I was wondering if you managed to make any progress related to this issue.

I'm missing a way to detect jobs that will never run because their dependency was never satisfied. As far as I can tell, the nodes list in the Queue objects is either empty (which I guess implies pending) or contains the list of nodes. I cannot find any way to differentiate between regular pending jobs and jobs with NODELIST(REASON) DependencyNeverSatisfied (in slurm, at least).

I guess one possible solution, besides implementing a keyword, would be to include the (REASON) string in the node list so that the user can find such jobs, perhaps by adapting your parse_queue() in batch_systems/slurm.py.
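
As a rough illustration of that idea, here is a minimal sketch that reads the reason field from squeue output and picks out the stuck jobs. It assumes the text comes from `squeue -h -o "%i|%T|%R"`; the function name `find_never_satisfied` is hypothetical and is not part of fyrd or of its parse_queue().

```python
def find_never_satisfied(squeue_text):
    """Return IDs of pending jobs whose dependency can never be satisfied.

    Assumes one job per line in the form produced by:
        squeue -h -o "%i|%T|%R"
    i.e. job ID, state, and NODELIST(REASON), separated by "|".
    """
    stuck = []
    for line in squeue_text.splitlines():
        line = line.strip()
        if not line:
            continue
        job_id, state, nodes_or_reason = line.split("|", 2)
        # Pending jobs show the reason wrapped in parentheses.
        reason = nodes_or_reason.strip("()")
        if state == "PENDING" and reason == "DependencyNeverSatisfied":
            stuck.append(job_id)
    return stuck


# Made-up sample output, for illustration only.
sample = """\
101|RUNNING|node01
102|PENDING|(Priority)
103|PENDING|(DependencyNeverSatisfied)
"""
print(find_never_satisfied(sample))
```

Only job 103 in the sample is reported, since job 102 is pending for an ordinary reason.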

best

Guillem

@MikeDacre
Owner Author

Hi Guillem,

Thanks for the comment. Unfortunately, I left my job and went to medical school, so now my time to work on fyrd is limited. I agree with you that adding a REASON value to the node list is probably a good way to go. I didn't do that because Torque handles reasons differently.

Another solution would be to add a new return value for the job (in the Queue object). Currently, that includes things like 'pending', 'running', and 'completed'; you could add a 'depend_failed' value as well and then add that to the list of failed keywords.
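
A minimal sketch of what that could look like, assuming the queue parser has both the state and the reason string available. The names `FAILED_STATES`, `normalize_state`, and `is_failed` are hypothetical, and the set of failed keywords is an assumption; fyrd's actual state names and constants may differ.

```python
# Hypothetical list of states treated as failures; 'depend_failed' is the new one.
FAILED_STATES = {"failed", "cancelled", "timeout", "depend_failed"}


def normalize_state(state, reason=""):
    """Map a raw scheduler state to a job state, adding 'depend_failed'."""
    state = state.lower()
    # A pending job whose dependency can never be met will not run at all.
    if state == "pending" and reason.strip("()") == "DependencyNeverSatisfied":
        return "depend_failed"
    return state


def is_failed(state, reason=""):
    """Return True if the normalized state counts as a failure."""
    return normalize_state(state, reason) in FAILED_STATES


print(normalize_state("PENDING", "(DependencyNeverSatisfied)"))
print(is_failed("PENDING", "(Priority)"))
```

The reason check shown here is Slurm-specific; a Torque backend would need its own way to decide when to return 'depend_failed'.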

If you would be willing, I would suggest trying to make the edits yourself and then I can review the pull request. It should be a relatively quick change, but it is unlikely I will have the time for several weeks at least.

Thanks,

Mike

@gportella

gportella commented Nov 1, 2018

Hi Mike,

I bet medical school is very demanding, so thanks for taking the time to reply.

I sort of found a way around it, at least for my needs. Slurm accepts a kill-on-invalid-dep switch, which kills a job's dependents as soon as the dependency fails. I had written my own class for submitting jobs (admittedly less polished than yours), and I just include this switch in kwargs.
After that, jobs of this type show up as failed when using fyrd, as they should, and I can take it from there.
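
For reference, a minimal sketch of how such a submission command could be assembled. The helper `build_sbatch_command` is hypothetical (it is neither fyrd code nor the class mentioned above); the `--dependency=afterok:...` and `--kill-on-invalid-dep=yes` options are standard sbatch flags.

```python
def build_sbatch_command(script, depends_on=None, kill_on_invalid_dep=True):
    """Build an sbatch argument list for a job with optional dependencies."""
    cmd = ["sbatch"]
    if depends_on:
        # afterok: start only if all listed jobs finished successfully.
        job_ids = ":".join(str(job_id) for job_id in depends_on)
        cmd.append("--dependency=afterok:" + job_ids)
    if kill_on_invalid_dep:
        # Ask Slurm to terminate the job if its dependency can never be met.
        cmd.append("--kill-on-invalid-dep=yes")
    cmd.append(script)
    return cmd


print(build_sbatch_command("step2.sh", depends_on=[12345]))
```

The resulting list can be passed to `subprocess.run` on a machine where sbatch is available.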

Just by reading bits of your code and going over the documentation I could not see a way to pass kwargs to slurm. Is that possible? If so, I can remove my slurm class for job submission, since I already have your module as a dependency anyway.

I'm pretty busy myself too, but I'll send you a PR if I find the time to work on it. You know what would be cool? Async/await for jobs. I tried to combine your library and the multiprocessing module to get the output of the jobs, but somehow it crashes. Anyway, I didn't spend too much time on it, and that's another topic...

best,

Guillem

@MikeDacre
Owner Author

Hi Guillem,

The way to do it using the 'API' in fyrd is to add it to the parse_strange_options function at the bottom of the fyrd/batch_systems/slurm.py file. Otherwise, you need to implement something in the primary option parsing that makes sense for both slurm and torque.
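
The general shape of such a handler might look like the sketch below. This is only an illustration of the idea: the function name `handle_extra_slurm_options`, the `kill_on_invalid_dep` keyword, and the return value are all assumptions, and the real signature of parse_strange_options in fyrd may well be different.

```python
def handle_extra_slurm_options(kwargs):
    """Split out Slurm-only keywords and turn them into #SBATCH lines.

    Hypothetical helper: the name, keyword, and return value are
    assumptions for illustration, not fyrd's parse_strange_options.
    """
    remaining = dict(kwargs)  # copy so the caller's dict is not modified
    directives = []
    if remaining.pop("kill_on_invalid_dep", False):
        directives.append("#SBATCH --kill-on-invalid-dep=yes")
    return directives, remaining


directives, remaining = handle_extra_slurm_options(
    {"kill_on_invalid_dep": True, "cores": 4}
)
print(directives)
print(remaining)
```

Keywords that the handler does not recognise are passed through untouched, so the primary option parsing can still deal with them.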

Thanks!

-Mike
