-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Image change trigger plugin exclude image field from reconciliation #133
Conversation
Hi @shalberd. Thanks for your PR. I'm waiting for a opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@harshad16 copy / cherry-pick of changes from v1.6 branch in my fork, based on PR-72, this time pull request to v1.7 |
/ok-to-test |
Hi @shalberd, thanks for opening this PR against I reviewed the functionality of this PR using the I found the following insights: When you update the tag into PodSec
The annotation on the CR isn't reconciling properly and still shows the old one.
The same for the
However, into notebook-controller logs, you will see that the trigger identifies the update:
So, I think something is missing on the 800 PR when you update the CR. Another thing that I found is that when you trigger a notebook via Data Science Project there is no existence of the |
@atheo89 tested behavior with Notebook CR assembled by dashboard PR 800, notice the image trigger annotation pointing to imagestream name and tag and the notebook container as target / fieldPath. Regarding dashboard PR 800, notice the env var JUPYTER_IMAGE now contains Also notice that in the Notebook CR, the image-field still just contains a placeholder,
be be filled in later in the StatefulSet generated by kubeflow notebook controller from Notebook CR by the help of the Openshift image change trigger plugin.
Generated StatefulSet, this is where it gets interesting:
Notice how the image-field gets auto-completed and replaced with the value of tag.from.name of that imagestream tag:
This is because I do not have an internal openshift registry on that cluster. If there were an internal openshift registry, the value put in there is either the same if spec.tags[tagname].referencePolicy.type is Source or the internal registry location if spec.tags[tagname].referencePolicy.type were Local, see main body of this PR. what is important to realize: the kubeflow notebook controller makes a statefulset based on notebook CR. But the image field lookup only happens at level StatefulSet. And the reconcilation exclude logic for all containers mentioned in the image trigger annotation prevents that the real image-value in the StatefulSet |
@harshad16 PTAL |
notebook-controller-deployment manager container log showing that in statefulset reconciliation, image field of container name jupyter-nb-sven is excluded from reconciliation with original notebook image field value. This is really in essence what this PR is about: when such an image change trigger annotation is present, the logic ensures that image field replacement by openshift mechanism stays current, regardless of what the initial image field value of container in notebook CR was. No more overwriting the change on reconciliation.
|
@atheo89 I am looking at your observations, thank you. Can you post your StatefulSet, similar to what I did? Your input is valuable because you have the internal openshift registry enabled, which I do not. Ah, I see what you are doing, you mean the scenario of a notebook tag change, ok.
If you want to refer to a new imagestream tag, change that in the Notebook CR annotation, not in the image field of the container directly and save the notebook CR. The annotation is the source of truth basically, not the image-field. The annotation tells openshift image trigger change plugin to fill-in the correct container image field value based on the imagestream name and tag in the annotation. Let me have a look at dashboard PR 800 code, could be it's not up to date with main dashboard branch. My pod jupyter-nb-sven is running fine and the notebook started from dashboard. I will have a look at Data Science Projects Worbench, so far only created a Notebook CR from Jupyter tile traditional outside DSP. |
updating PR-800 branch with latest code from odh-dashboard the old image pr-800 is too old I am trying again once it is ready at https://quay.io/repository/opendatahub/odh-dashboard?tab=tags @atheo89 25 July 23:45 CET: new odh-dashboard image built Digest: sha256:2c08e88759e7a4ed9a66e0d65a60025d0dee5eddd282c6827fe37b4ea73f7be0 With that latest build of odh-dashboard, I have also been able to create a workbench and run it in Data Science Project namespace. The Notebook CR and related StatefulSet was created correctly, too. I am now trying to update an existing workbench in DSP to a new imagestream tag, checking if it updates the annotation in the Notebook CR correctly. Edit Workbench is what that is called, I think. |
Confirmed working: When I edit a data science projects workbench, selecting a different tag for a given imagestream (could also have been an entirely different imagestream name and tag), the Notebook CR gets updated in the annotation with the new imagestream tag, the StatefulSet CR image change trigger annotation, too, and the image-field of the notebook container gets populated with the correct new value for the new imagestream tag. notebook controller log in notebook-controller-deployment:
So to summarize, I see |
It'd be interesting to see how this combo behaves when on an environment with the internal openshift docker registry, as I think all your environments are. Also, I have not tested this yet with what you all call Bring Your Own Notebook. As long as a valid imagestream is behind that BYON, that, too, should be ok. |
Thank you for making it work with DSP as well. I can confirm that I now see the annotation on the DSP statefulset. However, today my cluster is experiencing issues, and I'm unable to spawn a notebook via the Jupyter tile. Yesterday, I checked, and it was working as expected. You can refer to my comment here for more details. |
@atheo89 confims that on an environment with openshift internal registry, for an imagestream tag whose referencePolicy is set to Local, the openshift registry location is filled in the container image field, very good to see that working as described above in the main body of the PR |
This PR works together with the changes on the dashboard side introduced here -> opendatahub-io/odh-dashboard#800 |
Hey @shalberd, would you mind squashing the commits, please? It would be a great help! Thanks a bunch! 😊 |
797172e
to
fe7f6e6
Compare
Hi @atheo89 yup, just thought of the same myself, down to two commits, with meaningful messages in themselves. Good point, thank you. Additional point regarding the built docker images and referencing them in manifests: |
/lgtm |
Hi @harshad16 can you do a final review, please? See Adrianas and my history above. Dashboard team cannot proceed until this is formally approved. Thank you. |
fe7f6e6
to
c19a109
Compare
New changes are detected. LGTM label has been removed. |
ee0cfd5
to
5144d1b
Compare
@harshad16 unit tests working again, e2e, too, all jobs succeeded, thank you for your deep dive into openshift CI. |
252f06a
to
10e7c77
Compare
components/notebook-controller/controllers/notebook_controller.go
Outdated
Show resolved
Hide resolved
fdce016
to
97f256e
Compare
…tatefulset if present in notebook, to exclude image-fields from compare, to keep image-field to new value during reconcile. use our version of kubeflow/components/common module for reconcilehelper utility differences related to image-field reconciliation and affected containers exclude
784bf70
to
d02cf71
Compare
latest image working. Test setup manifests with built images described here: opendatahub-io/odh-dashboard#800 (comment)
if I manually create a notebook CR with wrong json annotation value / format error it gets logged, too:
so that is ok too |
…lugin_exclude_image_field_from_reconciliation
@harshad16 @jiridanek @jstourac @atheo89 @lucferbux notebook container image field value, as only updated in StatefulSet CRD, does not need to be excluded from reconciliation Notebook to StatefulSet anymore. We decided not to use the image change trigger annotation mechanism. closing this PR here and the dashboard PR 800 in favor of an alternative approach: odh odh notebook controller doing the notebook image field lookup in all cases directly in notebook image field spec (no openshift internal registry, openshift internal registry): #336 and #329 based on imagestream and tag status fields. as well as dashboard changes which remove dependency on internal registry, supporting both use cases internal and external registry opendatahub-io/odh-dashboard#2867 Thank you all very much, neat work minor remaining: changing notebook container imagePullPolicy: IfNotPresent from old imagePullPolicy: Always in a notebook CR podspec notebook container as a nice-to-have feature for external registry and large images with a unique hash. Needs to be validated if that works with internal openshift registry, too. Saves times on workbench startup. |
Thank you, Sven! Your ideas really helped us out and pointed us in a better direction. We appreciate the effort you put in and the fresh perspective you brought to the project. We will definitely consider your suggestions for upcoming releases. 🙂 |
…te-renovate Update renovate.json for 2.16
fixes kubeflow/notebooks#98
to be tested in conjunction with ODH dashboard PR-800
Looking for feedback before thinking of merging. AFIK, this should be pretty close to being mergeable, though.
seems to build alright locally, except unclear to me how to update the common/components, where reconcilehelper lies.
Got an error that is probably easy to resolve and dependent on the build chain:
that Problem was resolved by referencing our version of the reconcilehelper utilities-module under /components/common in notebook controlller go.mod:
Key idea:
Updating notebook controller logic and reconcilehelper util logic to take into account optional Openshift image policy admission plug-in image-change trigger annotation with one or more containers and their related imagestreams (json array) that leads to container image-field update and to avoid overwriting the updated image-field value by the notebook controller.
Discussed problem with @bparees , received input from @VaishnaviHire, member of Open Data Hub (a RedHat project for Openshift) team. Also discussed how to resolve the issue in Notebook reconciler with Ben.
Context: Notebook yaml contains the (optional) metadata annotation
referencing imagestream name and tag, imagestream namespace, and fieldPath reference to the image-value to be replaced by the image policy admission plug-in for a given container name. This annotation can reference one or more containers, in the form of a json array. This image-name resolve by the image policy admission plug-in happens in basic Kubernetes objects, not in the Notebook yaml itself, but in the StatefulSet derived from the Notebook yaml. The resolved image-name is set briefly after application of the StatefulSet to the server.
Or, referencing the openshift documentation:
When one of the core Kubernetes resources contains both a pod template and this annotation, OpenShift Container Platform attempts to update the object by using the image currently associated with the image stream tag that is referenced by trigger. The update is performed against the fieldPath specified.
To ensure that image-fields referenced via this annotation in the StatefulSet are not overwritten by the kubeflow notebook controller reconcile process, we set image-fields-values before change and after change to be equal, thus leading to DeepEqual returning false even when the image-field value changes.
Because the annotation is optional, checks are introduced in the code to ensure any custom logic is only done when the annotation is present.
Background:
making image-field value independent of internal openshift registry and supporting ImageContentSourcePolicy for Openshift environments with a third-party Docker repo, like VMWare Harbor. This is archieved by using the imagestream abstraction. Referencing imagestream name and tag instead of image-digest-values directly. Imagestreams are kind of an abstraction in openshift.
See what happens to the image-field cause of the image policy admission plug-in doing its work:
<image stream name>:<image stream tag>
references in image-field or annotation of a Kubernetes object (Deployment, StatefulSet, Pod etc.) always resolve to external image reference from imagestream spec.tags[tagname].from.name regardless of whether spec.tags[tagname].referencePolicy.type is set to Local or Source in image stream--> perfect for air gapped and ImageContentSourcePolicy use or if one just wants to pull directly from the external location when there is no internal openshift registry.if spec.tags[tagname].referencePolicy.type of the imagestream is set to Source then
<image stream name>:<image stream tag>
references in image-field or annotation of a Kubernetes object (Deployment, StatefulSet, Pod etc.) resolves to the remote repo location:if spec.tags[tagname].referencePolicy.type of the imagestream is set to Local then
<image stream name>:<image stream tag>
references in image-field or annotation of a Kubernetes object (Deployment, StatefulSet, Pod etc.) resolves to the namespace-specific location of the image in the open shift-internal registry: