workflow-run-manager: create new pending status #363
Closed · diegodelemos opened this issue Feb 23, 2021 · 0 comments · Fixed by reanahub/reana-db#123, reanahub/reana-server#350, #371, reanahub/reana-client#503 or reanahub/reana-ui#181
diegodelemos changed the title from "workflow-run-manager: workflow-engine doesn't start and workflow gets stuck in running state" to "workflow-run-manager: create new pending status" · Apr 20, 2021
audrium added a commit to audrium/reana-db that referenced this issue · Apr 21, 2021
audrium added a commit to audrium/reana-server that referenced this issue · Apr 21, 2021
audrium added a commit to audrium/reana-workflow-controller that referenced this issue · Apr 21, 2021
audrium added a commit to audrium/reana-client that referenced this issue · Apr 21, 2021
audrium added a commit to audrium/reana-db that referenced this issue · Apr 21, 2021
audrium added a commit to audrium/reana-workflow-controller that referenced this issue · Apr 21, 2021
audrium added a commit to audrium/reana-workflow-controller that referenced this issue · Apr 22, 2021
audrium added a commit to audrium/reana-server that referenced this issue · Apr 22, 2021
audrium added a commit to audrium/reana-workflow-controller that referenced this issue · Apr 22, 2021
audrium added a commit to audrium/reana-db that referenced this issue · Apr 22, 2021
audrium added a commit to audrium/reana-db that referenced this issue · Apr 23, 2021
audrium added a commit to audrium/reana-server that referenced this issue · Apr 23, 2021
audrium added a commit to audrium/reana-server that referenced this issue · Apr 26, 2021
audrium added a commit to audrium/reana-server that referenced this issue · Apr 26, 2021
audrium added a commit to audrium/reana-server that referenced this issue · Apr 26, 2021
audrium added a commit to audrium/reana-client that referenced this issue · Apr 27, 2021
audrium added a commit to audrium/reana-workflow-controller that referenced this issue · Apr 27, 2021
audrium added a commit to audrium/reana-workflow-controller that referenced this issue · Apr 27, 2021
audrium added a commit to audrium/reana-ui that referenced this issue · Apr 27, 2021
This was referenced Apr 27, 2021
audrium added a commit to audrium/reana-workflow-controller that referenced this issue · Apr 28, 2021
Issue
A general problem in REANA regarding workflows stuck in running state was fixed in reanahub/reana#478. However, there is still a corner case: there is no way to avoid stuck `running` workflows if the `workflow-engine` pod never starts. This is also a UX issue because the user gets a confusing message, workflow status `running`, when in reality this is not what is happening: the workflow is pending to be scheduled.
Cause
The root cause is that we set the status to running in RWC before the actual action takes place. The workflow is not guaranteed to start any time soon, or to start at all, because that depends on an external system (Kubernetes). And if the workflow doesn't start, there won't be any further status changes.
Solutions
A possible solution for this would be to introduce a new status (see current statuses), e.g. `pending` (TBD better name). This way the flow would look as follows:
- `queued` (set by REANA-Server just before actually queuing it)
- [...] `pending` and requests Kubernetes to start the workflow engine
- `workflow-engine` pod -> the workflow engine itself sets the status to `running` when it starts its execution (adding it to the factory `create_workflow_engine_command` so all workflow engines behave the same)
Edit: this has to be done per engine now, since the factory is not yet merged to the latest master. Dedicated issue.
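The proposed flow can be read as a small state machine. Below is a minimal sketch, assuming hypothetical names (`WorkflowStatus`, `ALLOWED_TRANSITIONS`, `transition`) that are not taken from the REANA codebase, and covering only the statuses discussed in this issue:

```python
from enum import Enum


class WorkflowStatus(Enum):
    # Hypothetical subset of statuses, for illustration only.
    QUEUED = "queued"
    PENDING = "pending"
    RUNNING = "running"
    DELETED = "deleted"


# Assumed transition table for the proposed flow; later statuses
# (e.g. what follows "running") are intentionally left out.
ALLOWED_TRANSITIONS = {
    WorkflowStatus.QUEUED: {WorkflowStatus.PENDING},
    # "pending" may move to "running" once the engine starts,
    # or be deleted if the engine pod never comes up.
    WorkflowStatus.PENDING: {WorkflowStatus.RUNNING, WorkflowStatus.DELETED},
    WorkflowStatus.RUNNING: set(),
    WorkflowStatus.DELETED: set(),
}


def transition(current, new):
    """Return the new status if the move is allowed, else raise ValueError."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new
```

With this table, a workflow can no longer jump from `queued` straight to `running`; only the workflow engine's own start-up would trigger the `pending` to `running` move.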
This way, if a workflow gets stuck in `pending` status, we could allow deletion for workflows in `pending` status. As for orphan workflow engine pods that could result from this, they could be garbage collected (e.g. for all `deleted` workflows, check if a workflow engine exists and clean it up).
Other considerations related to creating a new workflow status should be taken into account, for example:
- [...] `pending` state for a long time to also be stuck.
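The garbage-collection idea and the "stuck in `pending`" check could be sketched as two pure functions. This is only an illustration under stated assumptions: the function names, the plain-dict inputs, and the timeout parameter are hypothetical, and a real implementation would query the REANA database and Kubernetes instead.

```python
from datetime import datetime, timedelta


def find_orphan_engines(workflow_statuses, engine_workflow_ids):
    """Return ids of workflows that are deleted but still have an engine pod.

    workflow_statuses: dict mapping workflow id -> status string (assumed shape).
    engine_workflow_ids: ids of workflows that currently have an engine pod.
    """
    return sorted(
        wid for wid in engine_workflow_ids
        if workflow_statuses.get(wid) == "deleted"
    )


def find_stuck_pending(pending_since, now, max_pending):
    """Return ids of workflows that have been pending longer than max_pending.

    pending_since: dict mapping workflow id -> datetime the status was set.
    """
    return sorted(
        wid for wid, since in pending_since.items()
        if now - since > max_pending
    )
```

For example, `find_orphan_engines({"a": "deleted", "b": "running"}, {"a", "b"})` reports only workflow `a`, whose engine pod would then be a candidate for clean-up.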