RHOAIENG-11155: Better explanation of 'Authorize Access' UI #449

Open
wants to merge 1 commit into
base: main
from

Conversation

daniellutz

@daniellutz daniellutz commented Nov 14, 2024

This change improves the user experience: instead of a page listing the requested OAuth scopes, the user sees a simple login confirmation page. See https://issues.redhat.com/browse/RHOAIENG-11155

To test this PR, params.env has been changed to point to another generated image, so that it picks up the changes made here. Testing with the devFlags is recommended, rather than pointing directly to the image generated by the PR.

Description

The OAuth scope can be injected into the proxy sidecar container, so that the user only needs to confirm their login instead of being shown a page of confusing permissions, which is a bad user experience.

Besides passing the OAuth scope, a volume also needs to be mounted to provide the OAuth client secret, so that the application knows who is authenticating.
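
As a rough illustration of the description above, the extra sidecar arguments could be assembled as in the sketch below. The flag names are taken from the commit message and the proxy log quoted in this conversation; the helper name `oauthProxyArgs`, the client ID and the scope value are hypothetical placeholders, not the values used by this PR.

```go
package main

import "fmt"

// oauthProxyArgs builds the extra arguments for an oauth-proxy sidecar.
// The helper name and all argument values are assumptions for illustration.
func oauthProxyArgs(clientID, secretFile, scope string) []string {
	return []string{
		"--client-id=" + clientID,
		// The secret is read from a file, which is why a volume must be mounted.
		"--client-secret-file=" + secretFile,
		"--scope=" + scope,
	}
}

func main() {
	// Placeholder values; the real ones come from the notebook and its secret.
	args := oauthProxyArgs("example-notebook-oauth-client", "/etc/oauth/client/secret", "user:full")
	for _, a := range args {
		fmt.Println(a)
	}
}
```

The file path matches the one the proxy reports in the log further down; everything else would need to be checked against the actual controller code.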

How Has This Been Tested?

Manual tests were executed using devFlags, with a clean test run in an OpenShift Local environment.

  • quay.io/opendatahub/kubeflow-notebook-controller:pr-449
  • quay.io/opendatahub/odh-notebook-controller:pr-449

It would be helpful to provide the ready-to-use devFlags here.


Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@daniellutz daniellutz self-assigned this Nov 14, 2024
@openshift-ci openshift-ci bot requested review from caponetto and paulovmr November 14, 2024 00:16
@daniellutz daniellutz changed the title Better explanation of 'Authorize Access' UI RHOAIENG-11155: Better explanation of 'Authorize Access' UI Nov 14, 2024
@atheo89
Member

atheo89 commented Nov 14, 2024

Thanks for opening this PR!
It seems that on my end there is no oauth-client data to mount, so the oauth proxy cannot start properly.

Steps to reproduce:

  1. Replace the odh-notebook-controller image with the one generated from this PR.

  2. Create a new notebook

  3. Then I got:
    image

  4. Logs from the oauth container show:

2024/11/14 08:31:30 provider.go:120: Defaulting client-id to system:serviceaccount:test:q2
2024/11/14 08:31:30 provider.go:125: Defaulting client-secret to service account token /var/run/secrets/kubernetes.io/serviceaccount/token
2024/11/14 08:31:30 main.go:140: Invalid configuration:
  cannot read client-secret-file: open /etc/oauth/client/secret: no such file or directory
atheodor@fedora:~-$ oc get secrets
NAME                       TYPE                      DATA   AGE
builder-dockercfg-jkztm    kubernetes.io/dockercfg   1      7d20h
default-dockercfg-knsk6    kubernetes.io/dockercfg   1      7d20h
deployer-dockercfg-9ngd6   kubernetes.io/dockercfg   1      7d20h
pipeline-dockercfg-5fn4m   kubernetes.io/dockercfg   1      7d20h
q2-dockercfg-2x9gd         kubernetes.io/dockercfg   1      35m
q2-oauth-client            Opaque                    0      35m      <---- Problematic secret
q2-oauth-config            Opaque                    1      35m
q2-tls                     kubernetes.io/tls         2      35m

  5. The oauth-client secret was not populated with data, and that is why it failed.

What did I miss?
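
The symptom above (a secret that exists but has `DATA 0`) can be expressed as a small check. This is only a sketch: `missingKeys` is a hypothetical helper, and the key name `secret` is inferred from the mount path `/etc/oauth/client/secret` in the log, not confirmed by the PR.

```go
package main

import "fmt"

// missingKeys returns the required keys that are absent or empty in a
// Secret's data map. The helper is hypothetical, not part of the controller.
func missingKeys(data map[string][]byte, required ...string) []string {
	var missing []string
	for _, k := range required {
		if len(data[k]) == 0 {
			missing = append(missing, k)
		}
	}
	return missing
}

func main() {
	// An empty map models the q2-oauth-client secret listed with DATA 0.
	empty := map[string][]byte{}
	fmt.Println("missing keys:", missingKeys(empty, "secret"))
}
```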

@@ -1 +1 @@
odh-notebook-controller-image=quay.io/opendatahub/odh-notebook-controller:main-3f931d2
odh-notebook-controller-image=quay.io/dlutz/odh-notebook-controller:authorize-access
Member

testing purposes? where? what?

@openshift-merge-robot openshift-merge-robot added needs-rebase The PR needs a rebase or there are conflicts and removed needs-rebase The PR needs a rebase or there are conflicts labels Nov 14, 2024
VolumeSource: corev1.VolumeSource{
Secret: &corev1.SecretVolumeSource{
SecretName: Name + "-oauth-client-generated",
DefaultMode: pointer.Int32Ptr(420),
Member

@daniellutz the linter only runs on new code, which is why it is not flagging the existing usages of pointer.Int32Ptr. Simply use ptr.To here instead, and it will work just fine. ptr.To is a better replacement for the old functions; it could not be used before because it requires Go 1.18+ features (generics). https://pkg.go.dev/k8s.io/utils/ptr#section-readme
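
To show why one generic helper can replace the per-type functions, here is a minimal sketch. The local `To` function is a stand-in with the same shape as `ptr.To` from k8s.io/utils/ptr, defined here only so that the example runs without external modules.

```go
package main

import "fmt"

// To returns a pointer to a copy of v. It is a local stand-in for ptr.To
// and relies on type parameters, hence the Go 1.18+ requirement.
func To[T any](v T) *T {
	return &v
}

func main() {
	// Replaces pointer.Int32Ptr(420); 420 decimal is file mode 0644 octal.
	mode := To(int32(420))
	fmt.Println(*mode)

	// The same helper works for any other type without a dedicated function.
	optional := To(true)
	fmt.Println(*optional)
}
```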

@jiridanek
Member

jiridanek commented Nov 15, 2024

@daniellutz regarding ODH Notebook Controller Integration Test / build (pull_request) Failing after 14m

  Warning  FailedMount  43s                kubelet            Unable to attach or mount volumes: unmounted volumes=[oauth-client], unattached volumes=[oauth-client oauth-config tls-certificates kube-api-access-mn78k]: timed out waiting for the condition

That's never going to pass, because the test runs in a KinD cluster (https://kind.sigs.k8s.io/) where we only have the notebook controller and no other components of rhoai. Most importantly, we don't have rhods-operator, which would create your oauth secret for the pod. So the solution should be to create the secret in the test setup. I'll take a look right now.

edit: got it resolved, but when I copied an actual random secret from my cluster, the prodsec code scanning tool yelled at me for leaking secrets into GitHub. I need to get an example secret that will not trigger their bots.

"notebook-name": notebook.Name,
},
Annotations: map[string]string{
"secret-generator.opendatahub.io/name": "secret",
Member

@atheo89 atheo89 Nov 15, 2024

FYI: This specific annotation is used by rhoai-operator to create a secret named <secret-name>-generated

https://github.com/opendatahub-io/opendatahub-operator/tree/84d22f35a620f43f0e8b397b1b45e2bdb25a8f46/controllers/secretgenerator#basic-usage

Member

@harshad16 harshad16 left a comment

Based on the discussion with the platform team, we learned that it would be better if the secret logic lived on our side, so let's adjust that.

@@ -209,6 +209,26 @@ func NewNotebookOAuthSecret(notebook *nbv1.Notebook) *corev1.Secret {
}
}

// NewNotebookOAuthClientSecret defines the desired OAuth client secret object
func NewNotebookOAuthClientSecret(notebook *nbv1.Notebook) *corev1.Secret {
Member

As we will not be using the opendatahub-operator secretgenerator,
let's change the logic here and write our own secret creation.

Please take this as a reference:

func NewNotebookOAuthSecret(notebook *nbv1.Notebook) *corev1.Secret {

and adjust this function. For the secret generation,
use the random-value logic from here: https://github.com/opendatahub-io/opendatahub-operator/blob/c1671ab5fd11baea814f8acdee1bc448d502fb1c/controllers/secretgenerator/secret.go#L91

Suggested change
func NewNotebookOAuthClientSecret(notebook *nbv1.Notebook) *corev1.Secret {
func NewNotebookOAuthClientSecret(notebook *nbv1.Notebook) (*corev1.Secret, error) {
	// Generate the client secret for the OAuth proxy.
	// letterRunes is assumed to be a []rune alphabet, as in the secretgenerator code linked above.
	randomValue := make([]rune, 32)
	for i := range randomValue {
		num, err := rand.Int(rand.Reader, big.NewInt(int64(len(letterRunes))))
		if err != nil {
			// The function must return an error value for this to compile.
			return nil, err
		}
		randomValue[i] = letterRunes[num.Int64()]
	}
	// Create a Kubernetes secret to store the client secret
	return &corev1.Secret{
		ObjectMeta: metav1.ObjectMeta{
			Name:      notebook.Name + "-oauth-client",
			Namespace: notebook.Namespace,
			Labels: map[string]string{
				"notebook-name": notebook.Name,
			},
		},
		StringData: map[string]string{
			"secret": string(randomValue),
		},
	}, nil
}

and adjust the oauth-proxy to directly pick value from this secret
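
The random-generation step of this suggestion can be tried in isolation with the standard library only. The sketch below is an assumption-laden standalone version: the function name `randomSecret` and the alphabet in `letterRunes` are chosen for illustration and may differ from what the operator code uses.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// letterRunes is an assumed alphabet; the referenced operator code may use another.
var letterRunes = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")

// randomSecret returns n characters drawn uniformly from letterRunes
// using a cryptographically secure source.
func randomSecret(n int) (string, error) {
	out := make([]rune, n)
	limit := big.NewInt(int64(len(letterRunes)))
	for i := range out {
		num, err := rand.Int(rand.Reader, limit)
		if err != nil {
			return "", err
		}
		out[i] = letterRunes[num.Int64()]
	}
	return string(out), nil
}

func main() {
	s, err := randomSecret(32)
	if err != nil {
		panic(err)
	}
	fmt.Println("generated a secret of length", len(s))
}
```

Returning the error instead of swallowing it lets the caller decide whether to requeue the reconcile.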

- Add volumes to store the oauth-client configuration;
- Add extra parameters to container creation, including --client-id, --client-secret and --scope;
- Add method to generate the secret randomly when authenticating using OAuth;
- Add link between the notebook's route, user's generated oauth secret and OAuthClient config;

openshift-ci bot commented Jan 13, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from harshad16. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@daniellutz
Author

I had to change the approach from what we were trying before, due to unplanned changes made by other teams to how namespace watching works for secret generation.

Now the script gets the notebook's route, automatically generates the secret, creates the OAuthClient that links the route and the secret, and enables access without the UI asking for permissions.

Reach out if you have any questions or suggestions.
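
The link between route, secret and OAuthClient described above might look roughly like the sketch below. The `oauthClient` struct is a local stand-in that mirrors a few fields of OpenShift's OAuthClient resource so that the example runs on its own; the helper name, the redirect URI shape and the use of the `auto` grant method are assumptions about how the prompt is skipped, not a copy of this PR's code.

```go
package main

import "fmt"

// oauthClient is a simplified stand-in for OpenShift's OAuthClient resource.
type oauthClient struct {
	Name         string
	Secret       string
	RedirectURIs []string
	GrantMethod  string
}

// newOAuthClient ties a notebook route host to a generated client secret.
// GrantMethod "auto" is assumed here to approve the grant without a prompt.
func newOAuthClient(name, routeHost, secret string) oauthClient {
	return oauthClient{
		Name:         name,
		Secret:       secret,
		RedirectURIs: []string{"https://" + routeHost},
		GrantMethod:  "auto",
	}
}

func main() {
	// Hypothetical names and host; the real values come from the cluster.
	c := newOAuthClient("example-notebook-oauth-client", "example-notebook.apps.example.com", "placeholder-secret")
	fmt.Printf("%+v\n", c)
}
```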


openshift-ci bot commented Jan 13, 2025

@daniellutz: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/odh-notebook-controller-e2e | 5ba8387 | link | true | /test odh-notebook-controller-e2e |

Full PR test history. Your PR dashboard.


@jiridanek
Member

Looks like the test failures are legit

--- FAIL: TestE2ENotebookController (394.17s)
    --- PASS: TestE2ENotebookController/validate_controllers (20.45s)
        --- PASS: TestE2ENotebookController/validate_controllers/Validate_Kubeflow_notebook_controller (10.40s)
        --- PASS: TestE2ENotebookController/validate_controllers/Validate_ODH_notebook_controller (10.04s)
    --- FAIL: TestE2ENotebookController/create (282.30s)
        --- FAIL: TestE2ENotebookController/create/thoth-minimal-oauth-notebook (282.30s)
            --- PASS: TestE2ENotebookController/create/thoth-minimal-oauth-notebook/Creation_of_Notebook_instance (10.26s)
            --- PASS: TestE2ENotebookController/create/thoth-minimal-oauth-notebook/Notebook_Route_Validation (10.08s)
            --- PASS: TestE2ENotebookController/create/thoth-minimal-oauth-notebook/Notebook_Network_Policies_Validation (20.08s)
            --- PASS: TestE2ENotebookController/create/thoth-minimal-oauth-notebook/Notebook_Statefulset_Validation (10.05s)
            --- PASS: TestE2ENotebookController/create/thoth-minimal-oauth-notebook/Notebook_OAuth_sidecar_Validation (0.04s)
            --- FAIL: TestE2ENotebookController/create/thoth-minimal-oauth-notebook/Verify_Notebook_Traffic (10.67s)
            --- PASS: TestE2ENotebookController/create/thoth-minimal-oauth-notebook/Verify_Notebook_Culling (221.13s)
    --- FAIL: TestE2ENotebookController/update (61.08s)
        --- FAIL: TestE2ENotebookController/update/thoth-minimal-oauth-notebook (61.07s)
            --- PASS: TestE2ENotebookController/update/thoth-minimal-oauth-notebook/Update_Notebook_instance (10.18s)
            --- PASS: TestE2ENotebookController/update/thoth-minimal-oauth-notebook/Notebook_Route_Validation_After_Update (10.08s)
            --- PASS: TestE2ENotebookController/update/thoth-minimal-oauth-notebook/Notebook_Network_Policies_Validation_After_Update (20.08s)
            --- PASS: TestE2ENotebookController/update/thoth-minimal-oauth-notebook/Notebook_Statefulset_Validation_After_Update (10.04s)
            --- PASS: TestE2ENotebookController/update/thoth-minimal-oauth-notebook/Notebook_OAuth_sidecar_Validation_After_Update (0.04s)
            --- FAIL: TestE2ENotebookController/update/thoth-minimal-oauth-notebook/Verify_Notebook_Traffic_After_Update (10.64s)
    --- PASS: TestE2ENotebookController/delete (30.34s)
        --- PASS: TestE2ENotebookController/delete/thoth-minimal-oauth-notebook (30.34s)
            --- PASS: TestE2ENotebookController/delete/thoth-minimal-oauth-notebook/Notebook_Deletion (0.13s)
            --- PASS: TestE2ENotebookController/delete/thoth-minimal-oauth-notebook/Dependent_Resource_Deletion (30.21s)
FAIL
FAIL	github.com/opendatahub-io/kubeflow/components/odh-notebook-controller/e2e	394.196s

https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/opendatahub-io_kubeflow/449/pull-ci-opendatahub-io-kubeflow-main-odh-notebook-controller-e2e/1878640725080412160

This is running only on openshift-ci, that's why gha is passing.


// Create the OAuth secret if it does not already exist
foundSecret := &corev1.Secret{}
func (r *OpenshiftNotebookReconciler) createSecret(notebook *nbv1.Notebook, ctx context.Context, desiredSecret *corev1.Secret) error {
Member

Suggested change
func (r *OpenshiftNotebookReconciler) createSecret(notebook *nbv1.Notebook, ctx context.Context, desiredSecret *corev1.Secret) error {
func (r *OpenshiftNotebookReconciler) reconcileSecret(notebook *nbv1.Notebook, ctx context.Context, desiredSecret *corev1.Secret) error {

The function creates the secret only if it does not already exist, so the name should reflect that.
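
The create-only-if-absent behaviour that motivates the rename can be shown with an in-memory stand-in. In this sketch `secretStore` replaces the Kubernetes client and the signature is simplified; it is not the controller's actual function.

```go
package main

import "fmt"

// secretStore is an in-memory stand-in for the cluster, keyed by secret name.
type secretStore map[string]string

// reconcileSecret creates the desired secret only when it is absent and
// reports whether it created anything. Existing secrets are left untouched.
func reconcileSecret(store secretStore, name, value string) bool {
	if _, found := store[name]; found {
		return false
	}
	store[name] = value
	return true
}

func main() {
	store := secretStore{}
	fmt.Println(reconcileSecret(store, "nb-oauth-client", "first"))  // creates
	fmt.Println(reconcileSecret(store, "nb-oauth-client", "second")) // no-op
	fmt.Println(store["nb-oauth-client"])
}
```

Because a second call is a no-op, the function is safe to run on every reconcile loop, which is what the name `reconcileSecret` communicates.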

func (r *OpenshiftNotebookReconciler) createOAuthClient(notebook *nbv1.Notebook, ctx context.Context) error {
log := logf.FromContext(ctx)

//
Member

do these empty comment lines mean anything? just asking

Member

@jiridanek jiridanek left a comment

lgtm, the only thing that really should be fixed is the ocp-ci e2e test, but other than that it's good to go imo

@jiridanek
Member

I'm trying the PR images

This is what happens with running notebook if I switch from regular controllers in 2.17 nightly to the pr ones

{"level":"info","ts":"2025-01-14T09:17:30Z","logger":"controllers.Notebook","msg":"Update blocked, notebook pod template would be changed by the webhook","notebook":"mywb2","namespace":"vscodedocs","diff":"{v1.PodSpec}.Volumes[5->?]: {Name:oauth-client VolumeSource:{HostPath:nil EmptyDir:nil GCEPersistentDisk:nil AWSElasticBlockStore:nil GitRepo:nil Secret:&SecretVolumeSource{SecretName:mywb2-oauth-client,Items:[]KeyToPath{},DefaultMode:*420,Optional:nil,} NFS:nil ISCSI:nil Glusterfs:nil PersistentVolumeClaim:nil RBD:nil FlexVolume:nil Cinder:nil CephFS:nil Flocker:nil DownwardAPI:nil FC:nil AzureFile:nil ConfigMap:nil VsphereVolume:nil Quobyte:nil AzureDisk:nil PhotonPersistentDisk:nil Projected:nil PortworxVolume:nil ScaleIO:nil StorageOS:nil CSI:nil Ephemeral:nil}} != <invalid reflect.Value>"}

It's not very readable, but I guess it's refusing to mount the oauth config as a volume, which is correct, to prevent a pod restart. And the pod was not restarted, so that's good.

One more thing: what happens if I start the pod with the old controller, then switch to the new controller images, and only then try to access the workbench (with the whole oauth flow)?

And that immediately started working for me, with no oauth dialog. Even without restarting the workbench container, apparently, the oauth prompt did not appear! I immediately got thrown into Jupyter. I'm shocked, will try this again.

Starting and opening a brand new workbench gave me

{"error":"unauthorized_client","error_description":"The client is not authorized to request a token using this method.","state":"8539f027f67f47bfd5efbb1ebd30259f:/notebook/vscodedocs/noreg1"}

FAIL!

5 participants