At present, the Jupyter image is very large, so when users deploy the Jupyter service on a new k8s cluster or node, there is a long wait while the image is pulled.
This issue is mainly to discuss the design of a CRD-based operator that can connect with the existing Submarine services and provides a certain degree of controllability / predictability. Based on a new CRD, we can automatically trigger the image pull on every suitable node before the Jupyter service is deployed, so that every node in k8s already has the corresponding image.
To do this, we need to create a CRD which contains the list of images to be pulled, the refresh interval, and the pull secret of each image (if necessary). An example of the CRD is as follows:
```yaml
apiVersion: org.apache.submarine/v1
kind: JupyterImagePuller
metadata:
  name: example-image-puller
  namespace: submarine
spec:
  images: # the list of images to pre-pull
  - name: jupyter # environment name
    image: apache/submarine:jupyter-notebook-0.7.0 # image name
  - name: jupyter-gpu
    image: xxx.harbor.com/5000/apache/submarine:jupyter-notebook-gpu-0.7.0
    auth: # docker registry authentication
      username: xxxx
      password: xxxx
      email: [email protected] # Optional
  - name: jupyter
    image: apache/submarine:jupyter-notebook-0.7.0-chinese
    auth:
      secret: xxxx # If there is already a specified secret, we can fill in the secret name
  nodeSelector: {} # node selector applied to pods created by the daemonset
  refreshHours: '2' # number of hours between health checks
status:
  images:
  - name: apache/submarine:jupyter-notebook-0.7.0
    state: success/failure/pulling
    message: Reasons for pull failure ...
    digest: sha256:f04468d5ec5bdcda7a6ebdd65b20a7b363f348f1caef915df4a6cc8d1eb09029
    nodes:
    - worker1.xxxx.com
```
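For reference, a minimal sketch of the CustomResourceDefinition that could back this resource, assuming the group, version, and kind shown in the example above; the open schema (preserve-unknown-fields) is only illustrative, not a final definition:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # name must be <plural>.<group>
  name: jupyterimagepullers.org.apache.submarine
spec:
  group: org.apache.submarine
  scope: Namespaced
  names:
    kind: JupyterImagePuller
    plural: jupyterimagepullers
    singular: jupyterimagepuller
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            x-kubernetes-preserve-unknown-fields: true # images / nodeSelector / refreshHours
          status:
            type: object
            x-kubernetes-preserve-unknown-fields: true # per-image pull state written by the operator
    subresources:
      status: {} # lets the operator update status separately from spec
```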
Every time Submarine updates the environments, it will update the image list in the CRD. After reading the spec of the CRD and detecting the addition / modification, the operator creates a DaemonSet in the specified namespace (with the nodeSelector). The DaemonSet will contain N containers (one per entry in the images list), each of which pulls one image listed in the CRD.
The containers only override the entrypoint script of the docker image with a command that outputs something like "Pulling complete", so it is a lightweight task.
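A minimal sketch of the DaemonSet the operator might generate for the example CR above; the object name, labels, and the echo/sleep command are illustrative assumptions, not the final implementation:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-image-puller-ds   # hypothetical name derived from the CR
  namespace: submarine
spec:
  selector:
    matchLabels:
      app: example-image-puller-ds
  template:
    metadata:
      labels:
        app: example-image-puller-ds
    spec:
      nodeSelector: {}             # copied from spec.nodeSelector of the CR
      imagePullSecrets:
      - name: xxxx                 # generated or referenced from spec.images[].auth
      containers:
      # one container per entry in spec.images; pulling the image is the real work,
      # the overridden entrypoint only reports completion and keeps the pod running
      - name: jupyter
        image: apache/submarine:jupyter-notebook-0.7.0
        command: ["/bin/sh", "-c", "echo 'Pulling complete' && sleep infinity"]
      - name: jupyter-gpu
        image: xxx.harbor.com/5000/apache/submarine:jupyter-notebook-gpu-0.7.0
        command: ["/bin/sh", "-c", "echo 'Pulling complete' && sleep infinity"]
```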
Docker image registry authorization
Docker registry authentication should be provided in the environment. We should also consider private clouds and private image registries (such as Harbor); in some cases the registry requires authentication before an image can be downloaded.
We can allow users either to enter the username and password directly or to use authentication information that already exists in a k8s Secret.
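As a sketch of the first path, when a username / password is entered directly the operator could create a standard kubernetes.io/dockerconfigjson Secret and reference it from the DaemonSet via imagePullSecrets; when spec.images[].auth.secret is set, the existing Secret would be referenced as-is. The Secret name below is a hypothetical generated name:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: example-image-puller-jupyter-gpu-auth  # hypothetical generated name
  namespace: submarine
type: kubernetes.io/dockerconfigjson
data:
  # base64-encoded docker config built from auth.username / auth.password / auth.email
  .dockerconfigjson: <base64-encoded docker config>
```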
Image nodeSelector
At present, this section mainly considers GPU images. We need to consider that some k8s GPU resources may only exist on a few dedicated nodes. Therefore, we need to add a nodeSelector to the pods created for image pulling, and the environment also needs to carry a nodeSelector. In this way, the GPU image pods can be scheduled onto the correct nodes.
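For example, the CR's spec.nodeSelector could be set as follows and copied into the generated DaemonSet pod template; the label key and value are assumptions that depend on how the cluster labels its GPU nodes:

```yaml
spec:
  nodeSelector:
    # hypothetical label; use whatever label marks GPU nodes in the cluster
    accelerator: nvidia-gpu
```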
Docker image version update strategy
Build a ConfigMap to save the image and tag information.
When the refresh interval (refreshHours) is reached, the recorded image is compared with the latest tag / digest to determine whether the image has been updated. If there is an update, a new pull operation is triggered automatically.
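A minimal sketch of what such a ConfigMap could look like; the name and key layout are assumptions, it simply records the last-seen tag and digest per image so the operator can compare them against the registry on each refresh:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-image-puller-state   # hypothetical name derived from the CR
  namespace: submarine
data:
  # one entry per image in spec.images: last pulled tag and digest
  jupyter: |
    image: apache/submarine:jupyter-notebook-0.7.0
    digest: sha256:f04468d5ec5bdcda7a6ebdd65b20a7b363f348f1caef915df4a6cc8d1eb09029
  jupyter-gpu: |
    image: xxx.harbor.com/5000/apache/submarine:jupyter-notebook-gpu-0.7.0
    digest: <digest reported by the registry>
```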
It should be noted that each tenant may have its own jupyter and jupyter-gpu images. Since repeated image pull operations do not add much extra burden, we allow different tenants to reference the same image resources.
There are still some details to be designed, which will be explained later.
TODO 1: How to initialize / replace the docker image names and authorization when deploying a new Submarine service.