Autoscaler - General solution keep EBS for 1 day #6863

Open
5 tasks
Tracked by #6411
sanderegg opened this issue Nov 28, 2024 · 0 comments
sanderegg commented Nov 28, 2024

Goal: Achieve caching of data using EBS volumes

Current situation and usual scenario

  • users A, B and C each start a service (A, B and C respectively)
  • services A, B and C are pending, waiting for cluster resources
  • autoscaling picks this up and provisions EC2 instances (in sim4life.io that will be 1 per service; in osparc.io it will be based on the needed resources, and services might share machines)
  • once the machine(s) are up and running, the services are started there

Key takeaways

  • autoscaling does not know anything about the user, project or node
  • autoscaling cannot attach a specific EBS volume to a machine, because it does not control where a service will finally end up

Proposed changes

Option 1: Autoscaler takes care of it

Autoscaling service:

  • autoscaling can read from the docker service labels which user/project/node the service is needed for
  • when connecting the node, it can label the Docker node with the user/project/node ids
  • autoscaling can then attach an EBS volume to the machine (note that the EBS volume shall be tagged with the necessary user_id/project_id/node_id as well)
  • when the service is shut down, the autoscaling service shall remove the user/project/node labels
  • the autoscaling service shall detach the EBS volume
  • the autoscaling service shall delete the EBS volume if needed, or keep it until it is due for deletion
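
The attach step above could look roughly like the following. This is a minimal sketch, not the autoscaling service's actual code: `ec2` is assumed to be a boto3 EC2 client passed in by the caller, and the tag keys, the `gp3` volume type, the 100 GiB size, the `/dev/sdf` device name and the function names are all hypothetical choices made for illustration.

```python
from typing import Any


def tag_filters(tags: dict[str, str], availability_zone: str) -> list[dict[str, Any]]:
    """Build EC2 filters matching an unattached volume with exactly these tags."""
    filters: list[dict[str, Any]] = [
        {"Name": f"tag:{key}", "Values": [value]} for key, value in sorted(tags.items())
    ]
    # An EBS volume can only be attached to an instance in the same availability zone.
    filters.append({"Name": "availability-zone", "Values": [availability_zone]})
    # "available" means the volume exists and is not attached anywhere.
    filters.append({"Name": "status", "Values": ["available"]})
    return filters


def attach_cache_volume(
    ec2: Any,  # assumed: a boto3 EC2 client
    *,
    instance_id: str,
    availability_zone: str,
    tags: dict[str, str],  # assumed keys: user_id, project_id, node_id
    size_gib: int = 100,  # hypothetical default
    device: str = "/dev/sdf",  # hypothetical device name
) -> str:
    """Re-use a matching volume if one exists, otherwise create one, then attach it."""
    found = ec2.describe_volumes(Filters=tag_filters(tags, availability_zone))["Volumes"]
    if found:
        volume_id = found[0]["VolumeId"]
    else:
        created = ec2.create_volume(
            AvailabilityZone=availability_zone,
            Size=size_gib,
            VolumeType="gp3",
            TagSpecifications=[
                {
                    "ResourceType": "volume",
                    "Tags": [{"Key": k, "Value": v} for k, v in sorted(tags.items())],
                }
            ],
        )
        volume_id = created["VolumeId"]
        # A freshly created volume must reach the "available" state before attaching.
        ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(Device=device, InstanceId=instance_id, VolumeId=volume_id)
    return volume_id
```

Passing the client in, rather than creating it inside the function, keeps the sketch testable with a stub and leaves credential and region handling to the caller.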

Director-v2

  • when creating the docker services, there should be additional docker placement constraints such as node.labels.user_id==user_id, node.labels.project_id==project_id and/or node.labels.node_id==node_id
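
As an illustration of the constraints mentioned above, the helper below builds the constraint strings that would go into a Docker service spec under `TaskTemplate.Placement.Constraints`. It is only a sketch: the function name is hypothetical, and which of the three ids are actually used is left open, as in the proposal.

```python
from typing import Optional


def placement_constraints(
    *,
    user_id: Optional[str] = None,
    project_id: Optional[str] = None,
    node_id: Optional[str] = None,
) -> list[str]:
    """Return one 'node.labels.<key>==<value>' constraint per id that is provided."""
    labels = {"user_id": user_id, "project_id": project_id, "node_id": node_id}
    return [
        f"node.labels.{key}=={value}"
        for key, value in labels.items()
        if value is not None
    ]


# Fragment of a service spec as it could be sent when creating the service.
service_spec_fragment = {
    "TaskTemplate": {
        "Placement": {
            "Constraints": placement_constraints(user_id="42", project_id="abc"),
        }
    }
}
```

With such constraints, the Docker scheduler only places the service on a node that autoscaling has labelled with matching ids, which is what lets the right EBS volume be attached beforehand.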

Pros/Cons:

Option 2: Dynamic-sidecar takes care of it

Dynamic-sidecar:

  • before starting the service, it shall create or re-use an EBS volume and attach it to the EC2 instance it is running on
  • once the volume is attached, it needs to mount it on that machine
  • it possibly formats the drive, and mounts the drive in the user containers
  • when the service is shut down, the dynamic-sidecar shall delete or detach the drive depending on preferences
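
The "possibly format, then mount" steps could be sketched as follows. This assumes an ext4 filesystem, a Linux host with `blkid`, `mkfs.ext4` and `mount` available, and sufficient privileges; the function names and the example device and mount paths are hypothetical. The important detail is that the drive is formatted only when it carries no filesystem yet, otherwise a re-used volume would lose its cached data.

```python
import subprocess
from typing import Optional


def detect_fs_type(device: str) -> Optional[str]:
    """Return the filesystem type reported by blkid, or None if the device is blank."""
    result = subprocess.run(
        ["blkid", "-o", "value", "-s", "TYPE", device],
        capture_output=True,
        text=True,
        check=False,  # blkid exits non-zero when it finds no filesystem
    )
    return result.stdout.strip() or None


def mount_plan(device: str, mount_point: str, existing_fs: Optional[str]) -> list[list[str]]:
    """Return the commands to run, formatting only a drive that has no filesystem."""
    commands: list[list[str]] = []
    if existing_fs is None:
        # Brand-new volume: create the filesystem (assumption: ext4).
        commands.append(["mkfs.ext4", device])
    commands.append(["mkdir", "-p", mount_point])
    commands.append(["mount", device, mount_point])
    return commands


def mount_cache_volume(device: str, mount_point: str) -> None:
    """Run the plan; raises CalledProcessError if any step fails."""
    for command in mount_plan(device, mount_point, detect_fs_type(device)):
        subprocess.run(command, check=True)
```

Separating the plan from its execution makes the "never reformat a re-used volume" rule easy to check without touching a real device.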

Autoscaling:

  • shall manage the lifecycle of the EBS volumes and terminate the ones that are not needed anymore

Pros/Cons:

  • the handling of EBS caching is contained within the service that handles the user services (or a subsequent sidecar)
  • there is almost no change in the director-v2
  • there are still things to modify in autoscaling, so the complete lifecycle is divided between 2 services
  • this does not help with other GitHub issues for now

Changes

  • Add a groups_extra_properties setting to enable/disable the cache feature
  • Add a service label to indicate that the EBS volume shall be kept
  • The EBS storage shall not be deleted by the EC2 shutdown procedure
  • If an EBS storage is available for a service of UserX/ProjectY, it shall be re-used
  • The EBS volumes shall be auto-removed after X day(s)
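
The auto-removal rule could be expressed as a small selection function like the one below. It is a sketch under stated assumptions: EBS does not record when a volume was detached, so a hypothetical `detached_at` tag holding an ISO 8601 timestamp is assumed to be written at detach time; the input is assumed to have the shape of the `Volumes` list returned by boto3's `describe_volumes`; and the 1-day default retention comes from the issue title.

```python
from datetime import datetime, timedelta, timezone
from typing import Any


def expired_volume_ids(
    volumes: list[dict[str, Any]],
    *,
    now: datetime,
    retention: timedelta = timedelta(days=1),
    detached_at_tag: str = "detached_at",  # hypothetical tag key
) -> list[str]:
    """Return the ids of unattached volumes whose retention period has elapsed."""
    expired = []
    for volume in volumes:
        # Never touch a volume that is attached or still being created/deleted.
        if volume.get("State") != "available":
            continue
        tags = {tag["Key"]: tag["Value"] for tag in volume.get("Tags", [])}
        stamp = tags.get(detached_at_tag)
        if stamp is None:
            # No timestamp: not managed by this mechanism, leave it alone.
            continue
        if now - datetime.fromisoformat(stamp) >= retention:
            expired.append(volume["VolumeId"])
    return expired


if __name__ == "__main__":
    example = [
        {
            "VolumeId": "vol-a",
            "State": "available",
            "Tags": [{"Key": "detached_at", "Value": "2024-11-28T09:00:00+00:00"}],
        }
    ]
    print(expired_volume_ids(example, now=datetime.now(timezone.utc)))
```

The returned ids would then be passed one by one to the EC2 `delete_volume` call by whichever service owns the clean-up.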
@sanderegg sanderegg self-assigned this Nov 28, 2024
@sanderegg sanderegg added the a:autoscaling autoscaling service in simcore's stack label Nov 28, 2024
@sanderegg sanderegg added this to the Event Horizon milestone Nov 28, 2024