Image change trigger plugin exclude image field from reconciliation #133

@shalberd shalberd commented Jul 25, 2023

fixes kubeflow/notebooks#98

to be tested in conjunction with ODH dashboard PR-800

Looking for feedback before thinking of merging. AFAIK, this should be pretty close to being mergeable, though.

It seems to build fine locally, except that it is unclear to me how to update common/components, where reconcilehelper lives.
I got an error that is probably easy to resolve and depends on the build chain:

kubeflow/kubeflow#13 51.49 controllers/notebook_controller.go:213:58: too many arguments in call to "github.com/kubeflow/kubeflow/components/common/reconcilehelper".CopyStatefulSetFields                                                                                                     
kubeflow/kubeflow#13 51.49 	have (*"k8s.io/api/apps/v1".StatefulSet, *"k8s.io/api/apps/v1".StatefulSet, bool, []string, logr.Logger)
kubeflow/kubeflow#13 51.49 	want (*"k8s.io/api/apps/v1".StatefulSet, *"k8s.io/api/apps/v1".StatefulSet)

That problem was resolved by referencing our version of the reconcilehelper utilities module under /components/common in the notebook controller go.mod:

// use our version of kubeflow/components/common module for reconcilehelper utility differences related to image-field reconciliation and affected containers exclude
replace (
	github.com/kubeflow/kubeflow/components/common => ../common
)

Key idea:

Update the notebook controller logic and the reconcilehelper utility logic to take into account the optional OpenShift image policy admission plug-in image-change trigger annotation. The annotation lists one or more containers and their related imagestreams (as a JSON array) and leads to an update of the container image field. The goal is to keep the notebook controller from overwriting that updated image-field value.

Discussed the problem with @bparees and received input from @VaishnaviHire, member of the Open Data Hub (a Red Hat project for OpenShift) team. Also discussed with Ben how to resolve the issue in the Notebook reconciler.

Context: Notebook yaml contains the (optional) metadata annotation

 annotations:
    image.openshift.io/triggers: '[{"from":{"kind":"ImageStreamTag","name":"s2i-generic-data-science-notebook:v0.0.5", "namespace":"odhsven"},"fieldPath":"spec.template.spec.containers[?(@.name==\"jupyter-nb-sven\")].image"}]'

The annotation references the imagestream name and tag, the imagestream namespace, and, via fieldPath, the image value to be replaced by the image policy admission plug-in for a given container name. It can reference one or more containers, in the form of a JSON array. The image-name resolution by the image policy admission plug-in happens in basic Kubernetes objects: not in the Notebook yaml itself, but in the StatefulSet derived from it. The resolved image name is set shortly after the StatefulSet is applied to the server.

Or, referencing the openshift documentation:

When one of the core Kubernetes resources contains both a pod template and this annotation, OpenShift Container Platform attempts to update the object by using the image currently associated with the image stream tag that is referenced by trigger. The update is performed against the fieldPath specified.
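To make the annotation format concrete, here is a minimal sketch of how the container names could be read out of the annotation value. The type `imageTrigger`, the helper `triggerContainerNames`, and the regular expression are assumptions for illustration only, not the code in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// imageTrigger mirrors one entry of the image.openshift.io/triggers JSON array.
// Hypothetical type, modeled on the annotation value shown above.
type imageTrigger struct {
	From struct {
		Kind      string `json:"kind"`
		Name      string `json:"name"`
		Namespace string `json:"namespace"`
	} `json:"from"`
	FieldPath string `json:"fieldPath"`
}

// containerNameRe extracts the container name from a fieldPath such as
// spec.template.spec.containers[?(@.name=="jupyter-nb-sven")].image
var containerNameRe = regexp.MustCompile(`containers\[\?\(@\.name=="([^"]+)"\)\]\.image$`)

// triggerContainerNames (hypothetical helper) returns the container names
// referenced by the trigger annotation, or nil if the annotation is absent.
func triggerContainerNames(annotations map[string]string) ([]string, error) {
	raw, ok := annotations["image.openshift.io/triggers"]
	if !ok {
		return nil, nil // the annotation is optional
	}
	var triggers []imageTrigger
	if err := json.Unmarshal([]byte(raw), &triggers); err != nil {
		return nil, fmt.Errorf("decoding image.openshift.io/triggers: %w", err)
	}
	var names []string
	for _, t := range triggers {
		if m := containerNameRe.FindStringSubmatch(t.FieldPath); m != nil {
			names = append(names, m[1])
		}
	}
	return names, nil
}

func main() {
	annotations := map[string]string{
		"image.openshift.io/triggers": `[{"from":{"kind":"ImageStreamTag","name":"s2i-generic-data-science-notebook:v0.0.5","namespace":"odhsven"},"fieldPath":"spec.template.spec.containers[?(@.name==\"jupyter-nb-sven\")].image"}]`,
	}
	names, err := triggerContainerNames(annotations)
	fmt.Println(names, err)
}
```

Returning nil without an error when the annotation is missing keeps the custom logic inactive for notebooks that do not use the trigger mechanism.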

To ensure that image fields referenced via this annotation in the StatefulSet are not overwritten by the kubeflow notebook controller reconcile process, we set the image-field values before and after the change to be equal for the referenced containers. DeepEqual then no longer reports a difference for those image fields, even when the image-field value has been changed by the plug-in.
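The idea can be sketched as follows. This is a simplified illustration, not the reconcilehelper code from this PR: `container`, `podSpec`, and `copyPodSpec` are hypothetical stand-ins for the Kubernetes types and for CopyStatefulSetFields, and the `Memory` field only exists to show that other changes are still detected.

```go
package main

import (
	"fmt"
	"reflect"
)

// container and podSpec are simplified, hypothetical stand-ins for the
// Kubernetes pod template types.
type container struct {
	Name   string
	Image  string
	Memory string
}

type podSpec struct {
	Containers []container
}

// copyPodSpec copies the desired spec (from the Notebook) onto the live spec
// (the StatefulSet). For containers listed in keepImageFor, the live image
// value is kept, so the comparison sees no difference in that field.
// It reports whether the live object needs an update.
func copyPodSpec(desired podSpec, live *podSpec, keepImageFor []string) bool {
	keep := make(map[string]bool, len(keepImageFor))
	for _, name := range keepImageFor {
		keep[name] = true
	}
	liveImages := make(map[string]string, len(live.Containers))
	for _, c := range live.Containers {
		liveImages[c.Name] = c.Image
	}
	// Work on a copy so the caller's desired spec is left untouched.
	merged := make([]container, len(desired.Containers))
	copy(merged, desired.Containers)
	for i := range merged {
		if img, ok := liveImages[merged[i].Name]; ok && keep[merged[i].Name] {
			merged[i].Image = img // make both sides equal for the image field
		}
	}
	requireUpdate := !reflect.DeepEqual(merged, live.Containers)
	live.Containers = merged
	return requireUpdate
}

func main() {
	const resolved = "quay.io/opendatahub/workbench-images@sha256:95a3de2b2412679afd39e756a18586ac1c3fa7c2e4769df7f76e9d0338138577"
	desired := podSpec{Containers: []container{{Name: "jupyter-nb-sven", Image: "jupyter-nb-sven", Memory: "8Gi"}}}
	live := &podSpec{Containers: []container{{Name: "jupyter-nb-sven", Image: resolved, Memory: "8Gi"}}}

	fmt.Println(copyPodSpec(desired, live, []string{"jupyter-nb-sven"}))
	fmt.Println(live.Containers[0].Image)
}
```

With the container name in the exclude list, the placeholder from the Notebook does not replace the resolved image, while a change to any other field would still trigger an update.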

Because the annotation is optional, checks are introduced in the code to ensure any custom logic is only done when the annotation is present.

Background:

The goal is to make the image-field value independent of the internal OpenShift registry and to support ImageContentSourcePolicy for OpenShift environments with a third-party Docker repo, like VMware Harbor. This is achieved by using the imagestream abstraction: referencing imagestream name and tag instead of image-digest values directly.

See what happens to the image field as a result of the image policy admission plug-in doing its work:

  • Without OpenShift registry:

<image stream name>:<image stream tag> references in the image field or annotation of a Kubernetes object (Deployment, StatefulSet, Pod etc.) always resolve to the external image reference from imagestream spec.tags[tagname].from.name, regardless of whether spec.tags[tagname].referencePolicy.type is set to Local or Source in the image stream. This is perfect for air-gapped and ImageContentSourcePolicy use, or if one just wants to pull directly from the external location when there is no internal OpenShift registry.

containers:
    - name: testissource
      image: >-
         quay.io/thoth-station/s2i-generic-data-science-notebook@sha256:3f619c61501218f03d39d97541336dee024f446e64f3a47e2bc7e62cddeb2e58

  • With OpenShift registry:

If spec.tags[tagname].referencePolicy.type of the imagestream is set to Source, then <image stream name>:<image stream tag> references in the image field or annotation of a Kubernetes object (Deployment, StatefulSet, Pod etc.) resolve to the remote repo location:

containers:
    - name: testissource
      image: >-
         quay.io/thoth-station/s2i-generic-data-science-notebook@sha256:3f619c61501218f03d39d97541336dee024f446e64f3a47e2bc7e62cddeb2e58

If spec.tags[tagname].referencePolicy.type of the imagestream is set to Local, then <image stream name>:<image stream tag> references in the image field or annotation of a Kubernetes object (Deployment, StatefulSet, Pod etc.) resolve to the namespace-specific location of the image in the OpenShift-internal registry:

containers:
    - name: testisreflocal
      image: >-
        image-registry.openshift-image-registry.svc:5000/testis/jupyter-tensorflow-notebook@sha256:fc52e4fbc8c1c70dfa22dbfe6b0353f5165c507c125df4438fca6a3f31fe976e
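The three cases above can be summarized as a small decision function. This is only a sketch of the behavior described in this PR text, not OpenShift's implementation; `imageStreamTag`, its fields, and `resolvedImage` are hypothetical names.

```go
package main

import "fmt"

// imageStreamTag is a hypothetical, reduced view of one imagestream tag.
type imageStreamTag struct {
	ExternalRef     string // reference derived from spec.tags[tagname].from.name
	LocalRef        string // location in the internal OpenShift registry, if any
	ReferencePolicy string // spec.tags[tagname].referencePolicy.type: "Source" or "Local"
}

// resolvedImage returns the value that ends up in the container image field,
// following the cases described above.
func resolvedImage(tag imageStreamTag, hasInternalRegistry bool) string {
	if hasInternalRegistry && tag.ReferencePolicy == "Local" {
		return tag.LocalRef
	}
	// No internal registry, or referencePolicy Source: external location.
	return tag.ExternalRef
}

func main() {
	tag := imageStreamTag{
		ExternalRef:     "quay.io/thoth-station/s2i-generic-data-science-notebook@sha256:3f619c61501218f03d39d97541336dee024f446e64f3a47e2bc7e62cddeb2e58",
		LocalRef:        "image-registry.openshift-image-registry.svc:5000/testis/s2i-generic-data-science-notebook@sha256:3f619c61501218f03d39d97541336dee024f446e64f3a47e2bc7e62cddeb2e58",
		ReferencePolicy: "Local",
	}
	fmt.Println(resolvedImage(tag, false)) // external reference, no internal registry
	fmt.Println(resolvedImage(tag, true))  // internal registry location
}
```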

@openshift-ci

openshift-ci bot commented Jul 25, 2023

Hi @shalberd. Thanks for your PR.

I'm waiting for a opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci

openshift-ci bot commented Jul 25, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign harshad16 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@shalberd
Author

@harshad16 copy / cherry-pick of changes from v1.6 branch in my fork, based on PR-72, this time pull request to v1.7

@atheo89
Member

atheo89 commented Jul 25, 2023

/ok-to-test

@atheo89
Member

atheo89 commented Jul 25, 2023

Hi @shalberd, thanks for opening this PR against v1.7-branch.

I reviewed the functionality of this PR using the quay.io/opendatahub/kubeflow-notebook-controller:pr-133 for the notebook controller and the quay.io/opendatahub/odh-dashboard:pr-800 image for the dashboard.

I found the following insights:

When you update the tag in the PodSpec

image: >-
   image-registry.openshift-image-registry.svc:5000/redhat-ods-applications/s2i-minimal-notebook:2023.1a

The annotation on the CR isn't reconciling properly and still shows the old one.

metadata:
  annotations:
    image.openshift.io/triggers: >-
      [{"from":{"kind":"ImageStreamTag","name":"s2i-minimal-notebook:2023.1",
      "namespace":"rhods-notebooks"},"fieldPath":"spec.template.spec.containers[?(@.name==\"jupyter-nb-user-2dadri\")].image"}]

The same for the JUPYTER_IMAGE

            - name: JUPYTER_IMAGE
              value: 's2i-minimal-notebook:2023.1'

However, in the notebook-controller logs, you can see that the trigger identifies the update:

INFO controllers.Notebook Image Change Trigger Annotation is set  {"notebook": "rhods-notebooks/jupyter-nb-user-2dadri", "excluding new container image-field value": "image-registry.openshift-image-registry.svc:5000/redhat-ods-applications/s2i-minimal-notebook:2023.1a", "from DeepEqual and making it the single version of truth by making from/image equal to to/image for container name": "jupyter-nb-user-2dadri"}

So, I think something is missing in PR 800 when you update the CR.

Another thing I found: when you trigger a notebook via a Data Science Project, the image.openshift.io/triggers annotation does not exist on the StatefulSet, and sometimes the notebook is not created at all.

@shalberd
Author

shalberd commented Jul 25, 2023

@atheo89 tested behavior with Notebook CR assembled by dashboard PR 800, notice the image trigger annotation pointing to imagestream name and tag and the notebook container as target / fieldPath.

Regarding dashboard PR 800, notice the env var JUPYTER_IMAGE now contains
imagestreamname:imagestreamtag,
as compared to beforehand internalopenshiftrepo:5000/mynamespace/imagestreamname:imagestreamtag

Also notice that in the Notebook CR, the image-field still just contains a placeholder,

image: jupyter-nb-sven

to be filled in later in the StatefulSet generated by the kubeflow notebook controller from the Notebook CR, with the help of the OpenShift image change trigger plugin.

apiVersion: kubeflow.org/v1
kind: Notebook
metadata:
  annotations:
    image.openshift.io/triggers: >-
      [{"from":{"kind":"ImageStreamTag","name":"jupyter-datascience-notebook:2023.1",
      "namespace":"tst-analytics"},"fieldPath":"spec.template.spec.containers[?(@.name==\"jupyter-nb-sven\")].image"}]
    notebooks.opendatahub.io/inject-oauth: 'true'
    notebooks.opendatahub.io/last-image-selection: 'jupyter-datascience-notebook:2023.1'
    notebooks.opendatahub.io/last-size-selection: Small
    notebooks.opendatahub.io/oauth-logout-url: >-
      https://odh-dashboard-tst-analytics.apps.ocp4test.infra.com/notebookController/sven/home
    opendatahub.io/link: >-
      https://jupyter-nb-sven-tst-analytics.apps.ocp4test.infra.com/notebook/tst-analytics/jupyter-nb-sven
    opendatahub.io/username: sven
...
          env:
            - name: JUPYTER_IMAGE
              value: 'jupyter-datascience-notebook:2023.1'
            - name: PIP_CERT
              value: /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
            - name: REQUESTS_CA_BUNDLE
              value: /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
          ports:
            - containerPort: 8888
              name: notebook-port
              protocol: TCP
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /opt/app-root/src
              name: jupyterhub-nb-sven-pvc
            - mountPath: /etc/pki/ca-trust/extracted/pem
              name: trusted-ca
              readOnly: true
          image: jupyter-nb-sven

Generated StatefulSet, this is where it gets interesting:

kind: StatefulSet
apiVersion: apps/v1
metadata:
  annotations:
    image.openshift.io/triggers: >-
      [{"from":{"kind":"ImageStreamTag","name":"jupyter-datascience-notebook:2023.1",
      "namespace":"tst-analytics"},"fieldPath":"spec.template.spec.containers[?(@.name==\"jupyter-nb-sven\")].image"}]
  resourceVersion: '836755140'
  name: jupyter-nb-sven
...     
          env:
            - name: JUPYTER_IMAGE
              value: 'jupyter-datascience-notebook:2023.1'
            - name: PIP_CERT
              value: /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
            - name: REQUESTS_CA_BUNDLE
              value: /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
            - name: NB_PREFIX
              value: /notebook/tst-analytics/jupyter-nb-sven
          ports:
            - name: notebook-port
              containerPort: 8888
              protocol: TCP
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: jupyterhub-nb-sven-pvc
              mountPath: /opt/app-root/src
            - name: trusted-ca
              readOnly: true
              mountPath: /etc/pki/ca-trust/extracted/pem
            - name: shm
              mountPath: /dev/shm
          terminationMessagePolicy: File
          image: >-
            quay.io/opendatahub/workbench-images@sha256:95a3de2b2412679afd39e756a18586ac1c3fa7c2e4769df7f76e9d0338138577
          workingDir: /opt/app-root/src

Notice how the image-field gets auto-completed and replaced with the value of tag.from.name of that imagestream tag:

quay.io/opendatahub/workbench-images@sha256:95a3de2b2412679afd39e756a18586ac1c3fa7c2e4769df7f76e9d0338138577

This is because I do not have an internal OpenShift registry on that cluster. If there were one, the value put in there would either be the same (if spec.tags[tagname].referencePolicy.type is Source) or the internal registry location (if spec.tags[tagname].referencePolicy.type is Local); see the main body of this PR.

What is important to realize: the kubeflow notebook controller creates a StatefulSet based on the Notebook CR, but the image-field lookup only happens at the StatefulSet level. The reconciliation exclude logic for all containers mentioned in the image trigger annotation prevents the real image value in the StatefulSet, quay.io/opendatahub/workbench-images@sha256:95a3de2b2412679afd39e756a18586ac1c3fa7c2e4769df7f76e9d0338138577, from being replaced again by the placeholder from the Notebook CR, image: jupyter-nb-sven. That way the OpenShift image change trigger plugin can do its work and its change stays in place.

@atheo89
Member

atheo89 commented Jul 25, 2023

@harshad16 PTAL

@shalberd
Author

notebook-controller-deployment manager container log showing that in statefulset reconciliation, image field of container name jupyter-nb-sven is excluded from reconciliation with original notebook image field value.

This is really in essence what this PR is about: when such an image change trigger annotation is present, the logic ensures that image field replacement by openshift mechanism stays current, regardless of what the initial image field value of container in notebook CR was. No more overwriting the change on reconciliation.

1.6902889798891373e+09	INFO	controllers.Notebook	Reconciliation loop started	{"notebook": "tst-analytix/jupyter-nb-sven"}
1.690288979889268e+09	INFO	controllers.Notebook	Image Change Trigger Annotation is set	{"notebook": "tst-analytics/jupyter-nb-sven", "excluding new container image-field value": "quay.io/opendatahub/workbench-images@sha256:95a3de2b2412679afd39e756a18586ac1c3fa7c2e4769df7f76e9d0338138577", "from DeepEqual and making it the single version of truth by making from/image equal to to/image for container name": "jupyter-nb-sven"}
1.6902889798893013e+09	INFO	controllers.Notebook	Updating StatefulSet	{"notebook": "tst-analytics/jupyter-nb-sven", "namespace": "tst-analytics", "name": "jupyter-nb-sven"}
1.6902889798994193e+09	INFO	controllers.Notebook	Initializing Notebook CR Status	{"notebook": "tst-analytics/jupyter-nb-sven"}
1.6902889798994467e+09	INFO	controllers.Notebook	Calculating Notebook's  containerState	{"notebook": "tst-analytics/jupyter-nb-sven"}
1.6902889798994508e+09	INFO	controllers.Notebook	Updating Notebook CR state: 	{"notebook": "tst-analytics/jupyter-nb-sven", "state": {"running":{"startedAt":"2023-07-25T12:42:47Z"}}}
1.6902889798994691e+09	INFO	controllers.Notebook	Calculating Notebook's Conditions	{"notebook": "tst-analytics/jupyter-nb-sven"}
1.6902889798994737e+09	INFO	controllers.Notebook	Updating Notebook CR Status	{"notebook": "tst-analytics/jupyter-nb-sven", "status": {"conditions":[{"type":"Initialized","status":"True","lastProbeTime":"2023-07-25T12:42:59Z","lastTransitionTime":"2023-07-25T12:42:41Z"},{"type":"Ready","status":"True","lastProbeTime":"2023-07-25T12:42:59Z","lastTransitionTime":"2023-07-25T12:42:59Z"},{"type":"ContainersReady","status":"True","lastProbeTime":"2023-07-25T12:42:59Z","lastTransitionTime":"2023-07-25T12:42:59Z"},{"type":"PodScheduled","status":"True","lastProbeTime":"2023-07-25T12:42:59Z","lastTransitionTime":"2023-07-25T12:42:41Z"}],"readyReplicas":1,"containerState":{"running":{"startedAt":"2023-07-25T12:42:47Z"}}}}

@shalberd
Author

shalberd commented Jul 25, 2023

@atheo89 I am looking at your observations, thank you. Can you post your StatefulSet, similar to what I did? Your input is valuable because you have the internal openshift registry enabled, which I do not.

Ah, I see what you are doing, you mean the scenario of a notebook tag change, ok.

When you update the tag into PodSec
The annotation on the CR isn't reconciling properly and still shows the old one.

If you want to refer to a new imagestream tag, change that in the Notebook CR annotation, not in the image field of the container directly and save the notebook CR.

The annotation is basically the source of truth, not the image field. The annotation tells the OpenShift image change trigger plugin to fill in the correct container image-field value based on the imagestream name and tag in the annotation.
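As an illustration of switching to a new imagestream tag through the annotation, here is a minimal sketch that builds the annotation value for a given imagestream tag, namespace, and container name. The helper `triggerAnnotationValue` and the `trigger` type are assumptions for this example, not dashboard or controller code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// trigger is a hypothetical type mirroring one entry of the
// image.openshift.io/triggers JSON array.
type trigger struct {
	From struct {
		Kind      string `json:"kind"`
		Name      string `json:"name"`
		Namespace string `json:"namespace"`
	} `json:"from"`
	FieldPath string `json:"fieldPath"`
}

// triggerAnnotationValue (hypothetical helper) renders the annotation value
// that points the given container at imageStreamTag, e.g.
// "jupyter-datascience-notebook:2023.1".
func triggerAnnotationValue(imageStreamTag, namespace, containerName string) (string, error) {
	var t trigger
	t.From.Kind = "ImageStreamTag"
	t.From.Name = imageStreamTag
	t.From.Namespace = namespace
	t.FieldPath = `spec.template.spec.containers[?(@.name=="` + containerName + `")].image`
	out, err := json.Marshal([]trigger{t})
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	value, err := triggerAnnotationValue("jupyter-datascience-notebook:2023.1", "tst-analytics", "jupyter-nb-sven")
	if err != nil {
		panic(err)
	}
	// This string would be stored under metadata.annotations["image.openshift.io/triggers"].
	fmt.Println(value)
}
```

Changing only the first argument and saving the Notebook CR is the kind of update described above; the container image field itself stays a placeholder.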

Bildschirmfoto 2023-07-26 um 10 22 23

Let me have a look at dashboard PR 800 code, could be it's not up to date with main dashboard branch.
How did you change the imagestream tag used for the notebook? Via dashboard GUI?

My pod jupyter-nb-sven is running fine and the notebook started from dashboard.

I will have a look at the Data Science Projects workbench; so far I have only created a Notebook CR from the Jupyter tile, the traditional way outside DSP.

@shalberd
Author

shalberd commented Jul 25, 2023

updating PR-800 branch with latest code from odh-dashboard

Bildschirmfoto 2023-07-25 um 15 48 41

the old image pr-800 is too old

Bildschirmfoto 2023-07-25 um 15 49 55

I am trying again once it is ready at

https://quay.io/repository/opendatahub/odh-dashboard?tab=tags

@atheo89 25 July 23:45 CET: new odh-dashboard image built

Digest: sha256:2c08e88759e7a4ed9a66e0d65a60025d0dee5eddd282c6827fe37b4ea73f7be0
Status: Downloaded newer image for quay.io/opendatahub/odh-dashboard:pr-800

With that latest build of odh-dashboard, I have also been able to create a workbench and run it in a Data Science Project namespace. The Notebook CR and the related StatefulSet were created correctly, too.

Bildschirmfoto 2023-07-26 um 08 58 02

Bildschirmfoto 2023-07-26 um 10 13 48

I am now trying to update an existing workbench in DSP to a new imagestream tag, checking if it updates the annotation in the Notebook CR correctly. Edit Workbench is what that is called, I think.

@shalberd
Author

shalberd commented Jul 26, 2023

Confirmed working:

When I edit a Data Science Projects workbench and select a different tag for a given imagestream (it could also have been an entirely different imagestream name and tag), the annotation in the Notebook CR gets updated with the new imagestream tag. The image change trigger annotation on the StatefulSet is updated as well, and the image field of the notebook container gets populated with the correct new value for the new imagestream tag.

Bildschirmfoto 2023-07-26 um 09 03 07

Bildschirmfoto 2023-07-26 um 09 03 22

Bildschirmfoto 2023-07-26 um 10 15 38

Bildschirmfoto 2023-07-26 um 09 04 43

notebook controller log in notebook-controller-deployment:

1.6903550669246228e+09	INFO	controllers.Notebook	Reconciliation loop started	{"notebook": "tst-dsp/testdatasciencedsp"}
1.6903550669247572e+09	INFO	controllers.Notebook	Image Change Trigger Annotation is set	{"notebook": "tst-dsp/testdatasciencedsp", "excluding new container image-field value": "quay.io/opendatahub/notebooks@sha256:5df71f5542d2e0161f0f4342aa9a390679d72dc6fae192fd8da1e5671b27e8d4", "from DeepEqual and making it the single version of truth by making from/image equal to to/image for container name": "testdatasciencedsp"}
1.690355066924785e+09	INFO	controllers.Notebook	Updating StatefulSet	{"notebook": "tst-dsp/testdatasciencedsp", "namespace": "tst-dsp", "name": "testdatasciencedsp"}
1.6903550669940288e+09	INFO	controllers.Notebook	Initializing Notebook CR Status	{"notebook": "tst-dsp/testdatasciencedsp"}
1.6903550669940553e+09	INFO	controllers.Notebook	Updating Notebook CR state: 	{"notebook": "tst-dsp/testdatasciencedsp", "state": {"running":{"startedAt":"2023-07-26T07:04:16Z"}}}
1.6903550669940724e+09	INFO	controllers.Notebook	Calculating Notebook's Conditions	{"notebook": "tst-dsp/testdatasciencedsp"}

So to summarize, I see quay.io/opendatahub/kubeflow-notebook-controller:pr-133 and newest quay.io/opendatahub/odh-dashboard:pr-800 working well both in the jupyter tile as well as in a data science project.

@shalberd
Author

shalberd commented Jul 26, 2023

It'd be interesting to see how this combo behaves when on an environment with the internal openshift docker registry, as I think all your environments are.

Also, I have not tested this yet with what you all call Bring Your Own Notebook. As long as a valid imagestream is behind that BYON, that, too, should be ok.

@shalberd
Author

shalberd commented Jul 26, 2023

Regarding Notebook-to-StatefulSet reconciliation:

things like updating other aspects of a workbench podspec, e.g. requests and limits for CPU and memory via GUI work, too.

The Notebook CR is updated in the podSpec, the StatefulSet takes the new requests and limits from the notebook, while keeping the container image-field as-is.
Bildschirmfoto 2023-07-26 um 09 30 43

Bildschirmfoto 2023-07-26 um 09 33 49

@atheo89
Member

atheo89 commented Jul 26, 2023

Thank you for making it work with DSP as well. I can confirm that I now see the annotation on the DSP statefulset.

However, today my cluster is experiencing issues, and I'm unable to spawn a notebook via the Jupyter tile. Yesterday, I checked, and it was working as expected. You can refer to my comment here for more details.

@shalberd
Author

shalberd commented Jul 26, 2023

@atheo89 confirms that on an environment with the OpenShift internal registry, for an imagestream tag whose referencePolicy is set to Local, the OpenShift registry location is filled into the container image field. Very good to see that working as described above in the main body of the PR.

Bildschirmfoto 2023-07-26 um 11 12 18

@atheo89
Member

atheo89 commented Jul 27, 2023

This PR works together with the changes on the dashboard side introduced here -> opendatahub-io/odh-dashboard#800
/lgtm

@atheo89
Member

atheo89 commented Jul 27, 2023

Hey @shalberd, would you mind squashing the commits, please? It would be a great help! Thanks a bunch! 😊

@shalberd shalberd force-pushed the image_change_trigger_plugin_exclude_image_field_from_reconciliation branch from 797172e to fe7f6e6 Compare July 27, 2023 11:23
@openshift-ci openshift-ci bot removed the lgtm label Jul 27, 2023
@shalberd
Author

shalberd commented Jul 27, 2023

Hi @atheo89 yup, just thought of the same myself, down to two commits, with meaningful messages in themselves. Good point, thank you.

Additional point regarding the built docker images and referencing them in manifests:

opendatahub-io/odh-dashboard#800 (comment)

@atheo89
Member

atheo89 commented Jul 31, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Jul 31, 2023
@shalberd
Author

shalberd commented Sep 5, 2023

Hi @harshad16, can you do a final review, please? See Adriana's and my history above. The dashboard team cannot proceed until this is formally approved. Thank you.

@shalberd shalberd force-pushed the image_change_trigger_plugin_exclude_image_field_from_reconciliation branch from fe7f6e6 to c19a109 Compare October 19, 2023 12:08
@openshift-ci openshift-ci bot removed the lgtm label Oct 19, 2023
@openshift-ci

openshift-ci bot commented Oct 19, 2023

New changes are detected. LGTM label has been removed.

@shalberd shalberd force-pushed the image_change_trigger_plugin_exclude_image_field_from_reconciliation branch from ee0cfd5 to 5144d1b Compare October 24, 2023 07:46
@shalberd
Author

@harshad16 unit tests working again, e2e, too, all jobs succeeded, thank you for your deep dive into openshift CI.

@shalberd shalberd force-pushed the image_change_trigger_plugin_exclude_image_field_from_reconciliation branch 2 times, most recently from 252f06a to 10e7c77 Compare October 24, 2023 18:32
@shalberd shalberd force-pushed the image_change_trigger_plugin_exclude_image_field_from_reconciliation branch 2 times, most recently from fdce016 to 97f256e Compare October 27, 2023 08:14
…tatefulset if present in notebook, to exclude image-fields from compare, to keep image-field to new value during reconcile.

use our version of kubeflow/components/common module for reconcilehelper utility differences related to image-field reconciliation and affected containers exclude
@shalberd shalberd force-pushed the image_change_trigger_plugin_exclude_image_field_from_reconciliation branch from 784bf70 to d02cf71 Compare October 27, 2023 08:18
@shalberd
Author

shalberd commented Oct 27, 2023

latest image working. Test setup manifests with built images described here: opendatahub-io/odh-dashboard#800 (comment)

1.6983970511161025e+09 INFO controllers.Notebook Image Change Trigger Annotation is set {"notebook": "datascienceproject/visualstudio", "excluding new container image-field value": "registry.mycloud.com/analytics/workbench-images@sha256:75d4623d1af82361eab8be7ea6b318989828ff887a295a36abdc47eb1845c027", "from DeepEqual and making it the single version of truth by making from/image equal to to/image for container name": "visualstudio"}
1.698397051116143e+09 INFO controllers.Notebook Updating StatefulSet {"notebook": "datascienceproject/visualstudio", "namespace": "datascienceproject", "name": "visualstudio"}
1.6983970511257172e+09 INFO controllers.Notebook Initializing Notebook CR Status {"notebook": "datascienceproject/visualstudio"}

If I manually create a Notebook CR with a malformed JSON value in the annotation,
Bildschirmfoto 2023-10-27 um 11 12 29

Bildschirmfoto 2023-10-27 um 11 01 48

it gets logged, too:

1.6983973229052248e+09	ERROR	controllers.Notebook	Notebook image change trigger annotation image.openshift.io/triggers JSON array decode error, check for correct annotation value JSON format : [{"from":{"kind":"ImageStreamTag","name":"code-server-notebook:2023c", "namespace":"opendatahub"},"fieldPath":spec.template.spec.containers[?(@.name==\"visualstudio\")].image"}]	{"notebook": "datascienceproject/visualstudiowrongannotation", "error": "invalid character 's' looking for beginning of value"}
github.com/kubeflow/kubeflow/components/notebook-controller/controllers.getImageChangeTriggerReferencedContainerNames
	/workspace/notebook-controller/controllers/notebook_controller.go:93
github.com/kubeflow/kubeflow/components/notebook-controller/controllers.(*NotebookReconciler).Reconcile
	/workspace/notebook-controller/controllers/notebook_controller.go:210
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227

so that is ok too
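For reference, the decode error in the log can be reproduced with plain encoding/json. The sketch below uses a hypothetical `validateTriggers` helper and a shortened annotation value in which the opening quote of the fieldPath value is missing, as in the manually created CR above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// validateTriggers (hypothetical helper) checks that the annotation value
// decodes as a JSON array of objects and returns the decode error otherwise.
func validateTriggers(raw string) error {
	var triggers []map[string]interface{}
	return json.Unmarshal([]byte(raw), &triggers)
}

func main() {
	// Shortened example: the opening quote before spec.template... is missing.
	bad := `[{"from":{"kind":"ImageStreamTag","name":"code-server-notebook:2023c","namespace":"opendatahub"},"fieldPath":spec.template.spec.containers}]`
	good := `[{"from":{"kind":"ImageStreamTag","name":"code-server-notebook:2023c","namespace":"opendatahub"},"fieldPath":"spec.template.spec.containers"}]`

	fmt.Println(validateTriggers(bad))  // invalid character 's' looking for beginning of value
	fmt.Println(validateTriggers(good)) // <nil>
}
```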

…lugin_exclude_image_field_from_reconciliation
@shalberd
Author

shalberd commented Jun 4, 2024

@harshad16 @jiridanek @jstourac @atheo89 @lucferbux

The notebook container image-field value, which was only updated in the StatefulSet, no longer needs to be excluded from Notebook-to-StatefulSet reconciliation. We decided not to use the image change trigger annotation mechanism.

Closing this PR and dashboard PR 800 in favor of an alternative approach: the odh notebook controller does the notebook image-field lookup directly in the Notebook image-field spec in all cases (with or without the OpenShift internal registry), based on imagestream and tag status fields: #336 and #329.

There are also dashboard changes that remove the dependency on the internal registry and support both use cases, internal and external registry: opendatahub-io/odh-dashboard#2867

Thank you all very much, neat work

Minor remaining item: changing the notebook container imagePullPolicy from the old Always to IfNotPresent in the Notebook CR podSpec,

as a nice-to-have feature for external registries and large images with a unique hash. It needs to be validated whether that works with the internal OpenShift registry, too. It saves time on workbench startup.

@shalberd shalberd closed this Jun 4, 2024
@atheo89
Member

atheo89 commented Jun 5, 2024

Thank you, Sven! Your ideas really helped us out and pointed us in a better direction. We appreciate the effort you put in and the fresh perspective you brought to the project.

We will definitely consider your suggestions for upcoming releases. 🙂

atheo89 pushed a commit to atheo89/kubeflow that referenced this pull request Nov 7, 2024
…te-renovate

Update renovate.json for 2.16