Refactoring of in-house scheduling of computational jobs to allow scaling of simcore services and reduce load on system #6666

sanderegg · 2024-11-05T11:34:17Z

The scheduling of tasks from the director-v2 has the following issues:

1. On each round, all the computational jobs are checked, creating a lot of network calls to the dask-schedulers.
2. Every time a new pipeline is added, step 1 is done again, even if it was just done.
3. Every time a pipeline is stopped, step 1 is done again.
4. It is not scalable across multiple director-v2 replicas.

--> A distributed lock shall be used to protect each pipeline separately, so that multiple replicas can take care of different pipelines.
After adding or stopping a pipeline, only that pipeline should be re-scheduled, not all of them.
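
The proposal above can be illustrated with a minimal sketch. It is not the code from the actual fix: `InMemoryLockBackend`, `schedule_pipeline`, the `pipeline-lock:` key prefix, and the replica names are all hypothetical, and the in-memory dictionary only stands in for a lock store that real replicas would have to share. It shows two replicas racing for one pipeline, where only one proceeds, while a different pipeline is scheduled independently.

```python
import asyncio


class InMemoryLockBackend:
    """Hypothetical stand-in for a distributed lock store shared by all replicas."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}

    def try_acquire(self, key: str, owner: str) -> bool:
        # Non-blocking: succeeds only if no one currently holds the key.
        if key in self._owners:
            return False
        self._owners[key] = owner
        return True

    def release(self, key: str, owner: str) -> None:
        # Only the current owner may release the lock.
        if self._owners.get(key) == owner:
            del self._owners[key]


async def schedule_pipeline(
    backend: InMemoryLockBackend,
    replica: str,
    pipeline_id: str,
    log: list[tuple[str, str]],
) -> bool:
    """Schedule a single pipeline if, and only if, this replica gets its lock."""
    key = f"pipeline-lock:{pipeline_id}"  # one lock per pipeline (assumed naming)
    if not backend.try_acquire(key, replica):
        return False  # another replica is already handling this pipeline
    try:
        # Placeholder for checking the tasks of this one pipeline only.
        await asyncio.sleep(0)
        log.append((replica, pipeline_id))
        return True
    finally:
        backend.release(key, replica)


async def demo() -> tuple[list[bool], list[tuple[str, str]]]:
    backend = InMemoryLockBackend()
    log: list[tuple[str, str]] = []
    # Two replicas race for pipeline "p1"; pipeline "p2" is handled separately.
    results = await asyncio.gather(
        schedule_pipeline(backend, "replica-a", "p1", log),
        schedule_pipeline(backend, "replica-b", "p1", log),
        schedule_pipeline(backend, "replica-b", "p2", log),
    )
    return list(results), log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Adding or stopping a pipeline would then trigger `schedule_pipeline` for that pipeline id alone, rather than a full pass over every pipeline.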


sanderegg · 2024-12-03T13:26:22Z

closed by #6736

sanderegg mentioned this issue Nov 5, 2024

Hardening the computational backend ITISFoundation/osparc-issues#1730

Open

2 tasks

sanderegg self-assigned this Nov 5, 2024

sanderegg transferred this issue from ITISFoundation/osparc-issues Nov 5, 2024

sanderegg added the a:director-v2 label (issue related with the director-v2 service) Nov 5, 2024

sanderegg added this to the Event Horizon milestone Dec 3, 2024

sanderegg closed this as completed Dec 3, 2024

This was referenced Dec 4, 2024

Make director-v2 scalable/restartable #4524

Open

computational scheduler: investigate using row lock mechanism and/or last_updated timestamp on rows to allow several dv-2 to schedule the pipelines #5631

Closed
