feat(opensearch): capture logs from Dask cluster pods #616
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #616 +/- ##
==========================================
- Coverage 75.57% 75.44% -0.13%
==========================================
Files 17 17
Lines 1883 1898 +15
==========================================
+ Hits 1423 1432 +9
- Misses 460 466 +6
Force-pushed from 553cb69 to a3c6170
Force-pushed from a3c6170 to 7d9e4c5
Force-pushed from 7d9e4c5 to 0330a34
@@ -63,11 +63,13 @@ def __init__(
     os_client: OpenSearch | None = None,
     job_index: str = "fluentbit-job_log",
     workflow_index: str = "fluentbit-workflow_log",
+    dask_index: str = "fluentbit-dask_log",
todo: For the release notes, please amend the commit log headline scope to opensearch
and capitalise Dask:
feat(opensearch): capture logs from Dask cluster pods (#616)
def fetch_dask_worker_logs(self, workflow_id: str) -> str | None:
    """
    Fetch logs of the workers of a Dask cluster.
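The diff above adds a `dask_index` parameter alongside the existing job and workflow indices, plus a `fetch_dask_worker_logs` method. A hypothetical, dependency-free sketch of how such a fetch could work is shown below; the class name `DaskLogFetcher`, the document fields `workflow_id` and `log`, and the duck-typed client are all assumptions for illustration, not the PR's actual implementation (the real code type-hints an `opensearchpy.OpenSearch` client).

```python
from typing import Any, Optional


class DaskLogFetcher:
    """Hypothetical sketch of fetching Dask pod logs from an OpenSearch
    index. Field names and the class itself are assumptions, not the
    PR's actual code."""

    def __init__(self, os_client: Any, dask_index: str = "fluentbit-dask_log"):
        # ``os_client`` is duck-typed here: any object exposing
        # ``search(index=..., body=...)``, e.g. ``opensearchpy.OpenSearch``.
        self.os_client = os_client
        self.dask_index = dask_index

    def fetch_dask_worker_logs(self, workflow_id: str) -> Optional[str]:
        """Fetch logs of the workers of a Dask cluster."""
        # Assumed document schema: each log line is stored with the
        # owning workflow's id and the raw message under ``log``.
        query = {
            "query": {"match": {"workflow_id": workflow_id}},
            "sort": [{"@timestamp": {"order": "asc"}}],
        }
        response = self.os_client.search(index=self.dask_index, body=query)
        hits = response["hits"]["hits"]
        if not hits:
            return None
        return "\n".join(hit["_source"]["log"] for hit in hits)
```

Returning `None` when no hits are found matches the `str | None` return annotation visible in the diff.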
Perhaps we can document (here, elsewhere in the code documentation, or in the main commit log message body) that the logs collected from Dask workers are "propagated" to all REANA job logs, so that if there are N jobs using the same cluster, the Dask logs are currently duplicated and show under each job log. That way we don't forget this by-product of the current approach. (And we could open an issue about this later.)
Currently, the logs collected from Dask workers and scheduler are propagated to all REANA job logs. This is not ideal, since the Dask logs are duplicated across the different steps that use the same Dask cluster, and the duplicated logs are shown under each step, which might be confusing and not user-friendly. Fixing this requires a larger architectural change and is deferred to a future commit. Closes reanahub#610
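The duplication by-product discussed above can be illustrated with a tiny sketch; all names and data shapes here are made up for illustration and are not REANA's internals. With N jobs sharing one Dask cluster, the same cluster logs end up appended to every job's log:

```python
# Illustrative only: data shapes are assumptions, not REANA's internals.
dask_logs = "scheduler: started\nworker-0: task done"  # one shared cluster
job_logs = {"step-1": "job output A", "step-2": "job output B"}  # N jobs

# Current approach: the same Dask logs are appended to every job log,
# so they appear once per step that used the cluster.
combined = {job: log + "\n" + dask_logs for job, log in job_logs.items()}
```

Here the scheduler/worker lines appear twice in total, once under each step, which is exactly the "multiplication" the review thread describes.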
Force-pushed from 0330a34 to e9e272e
Currently, the logs collected from Dask workers and scheduler are propagated to all REANA job logs. This is not ideal, since the Dask logs are duplicated across the different steps that use the same Dask cluster, and the duplicated logs are shown under each step, which might be confusing and not user-friendly. Separating Dask logs from job logs requires a larger architectural change and is deferred to a future commit. Closes reanahub#610
Force-pushed from e9e272e to 16d9bc3
This commit collects logs from the Dask scheduler and workers and propagates them to all REANA jobs that are using the same Dask cluster. This is not ideal, since Dask logs can thus become duplicated across different steps of the workflow, which could be confusing for the user.

However, when a user uses Dask to parallelise the workflow jobs, the workflow steps are usually defined only within Dask, so this situation does not occur. Hence we can afford this in usual real-life conditions.

Separating Dask scheduler and worker logs from regular Kubernetes job logs would require a larger architectural change and is therefore deferred to a future commit. Closes reanahub#610
Force-pushed from 16d9bc3 to fc03fb9
Force-pushed from fc03fb9 to 51fad95
Works nicely 👍
This PR adds the collection of logs for Dask pods, namely scheduler and workers.
Closes #610