Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow proxy to forward requests to namespaces outside of workspace #471

Merged

Conversation

alexeykazakov
Copy link
Contributor

@alexeykazakov alexeykazakov commented Sep 30, 2024

Currently our proxy blocks any requests to namespaces which are outside of the target workspace. And because of this we can't use the proxy to target namespaces which do not belong to any workspace. Like some shared namespaces with resources shared with all authenticated users.

Our proxy do not have to enforce any additional RBAC besides what is already enforced by the target API Server.
The proxy goals are basically the following:

  • Determine what target cluster (API server) to forward the request to
  • What user to impersonate

The rest will be taken care by the API Server. If the user does not have permissions to access the namespace then the request will be rejected by the API Server. The Proxy should not be in the RBAC business unless there is a very good reason for this.

See this thread for more details #443 (comment)

Paired with codeready-toolchain/toolchain-e2e#1053

Copy link
Contributor

@mfrancisc mfrancisc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 👍
I think initially it was there for offloading the API server from the requests which can validated on our side.

I have only two comments/thoughts:

  1. I'm almost sure this doesn't break anything on the UI side, but maybe worth double checking

  2. Do we need to keep all the subtests in the proxy_communit_tests.go.?
    I mean these tests looks like they are testing the same path to me:

  • plain http request as owner to namespace outside of private workspace
  • plain http request as owner to namespace outside of community workspace
  • plain http request as community user to namespace outside of community workspace

it's basically checking that a user that has a valid usersignup and tries to access a namespace which is not in a community workspace or in the home workspace, which according to the new changes it is not being blocked at the proxy level anymore.

So maybe we can have just one "generic" test for verifying that requests to invalid namespaces are proxied?

@alexeykazakov
Copy link
Contributor Author

I'm almost sure this doesn't break anything on the UI side, but maybe worth double checking

This PR gets rid of some restrictions. So I would be very surprised if it broke something on the UI side. But you never know :)

Do we need to keep all the subtests in the proxy_communit_tests.go.?
it's basically checking that a user that has a valid usersignup and tries to access a namespace which is not in a community workspace or in the home workspace, which according to the new changes it is not being blocked at the proxy level anymore.

I thought about that. It checks all possible combinations (as you mentioned) that the outside-of-workspace namespace is still available. And if we consider it as a feature (that such namespaces are accessible) then I think it's good that we have all the possible cases covered.

@alexeykazakov
Copy link
Contributor Author

Paired with codeready-toolchain/toolchain-e2e#1053 now

/retest

@alexeykazakov
Copy link
Contributor Author

/retest

Copy link

codecov bot commented Oct 2, 2024

Codecov Report

Attention: Patch coverage is 33.33333% with 2 lines in your changes missing coverage. Please review.

Project coverage is 85.16%. Comparing base (3aaf51c) to head (1844bef).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
pkg/proxy/proxy.go 33.33% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #471      +/-   ##
==========================================
- Coverage   85.93%   85.16%   -0.77%     
==========================================
  Files          32       32              
  Lines        2488     3175     +687     
==========================================
+ Hits         2138     2704     +566     
- Misses        265      384     +119     
- Partials       85       87       +2     
Flag Coverage Δ
unittests 85.16% <33.33%> (-0.77%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@MatousJobanek
Copy link
Contributor

I agree with @mfrancisc on that - the tests look like a duplication of the same. Don't block access to a different namespace outside of the selected workspace, I'm not sure if we really need to distinguish between different cases of the workspace.
But TBH, the whole TestProxySuite is unmaintainable beast that executes more than 13k of test combinations. The number can be fine, but the problem is that when something fails there, it's extremely hard to

  • figure out which line failed and for which combination
  • debug it
  • find the root cause

I can feel the pain now as I got lost in the tests many times as part of the refactoring efforts.

This is caused by the way the test is written. The standard "testing" library doesn't nicely work with such combinations of the cases the test is executed with. Gomega does that a bit better, but in such a scale, it would be still very chaotic.
That being said, the whole TestProxySuite requires a deep refactoring anyway.

Copy link
Contributor

@MatousJobanek MatousJobanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OKish 👍 Thanks a lot for dropping the limitation. The tests could be simplified, but as I mentioned in the other comment, it will require a complete refactoring anyway, and this doesn't make the tests much worse.

@MatousJobanek
Copy link
Contributor

And BTW, as soon as it gets merged, it would be nice to advertise it to Konflux as they faced some issues with it before.

@mfrancisc
Copy link
Contributor

TBH I don't think we need all these combination of the tests, now that the behavior is for the proxy to foward the request to Kube API, it was more useful IMO when we needed/wanted to ensure that the requests are denied at the proxy level. However it's a minor since it's better to have more tests and less :) , and also the tests need to be refactored as @MatousJobanek . So let's maybe to that as part of a major rework of the tests 👍

Copy link
Contributor

@mfrancisc mfrancisc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great Job 👍

@filariow
Copy link
Contributor

filariow commented Oct 2, 2024

I'm still a bit concerned about removing this constraint. I feel provisionedNamespaces work well as workspace boundaries, meaning that users shouldn't be able to perform cross-workspaces requests.

I'm concerned that in an environment with a lot of shared workspaces, human errors or wrong configurations won't be warned nor blocked and, as a result, users could potentially harm any namespace they have access to by applying the wrong manifest. In this case debugging will be very hard.

In a multi-members setup, this would be even more puzzling to me; the user will be able to perform cross-workspace requests for different subsets of workspaces -impacting the proxy's transparency property.

because of this we can't use the proxy to target namespaces which do not belong to any workspace. Like some shared namespaces with resources shared with all authenticated users.

What about using the public-viewer feature for this requirement? We could create a shared workspace with these resources and flag it as community. Everyone is going to have access to the resources, we will just need to add the SpaceRole and SpaceRoleBinding for the kubesaw-authenticated user.

Alternatively, I'd propose to introduce a feature flag in ToolchainConfig to enable/disable this constraint instead of completely remove it.

@MatousJobanek
Copy link
Contributor

MatousJobanek commented Oct 2, 2024

I'm still a bit concerned about removing this constraint. I feel provisionedNamespaces work well as workspace boundaries, meaning that users shouldn't be able to perform cross-workspaces requests.

The requests can be, but doesn't have to be cross-workspace. The cluster usually contains more namespaces than those that are managed by KubeSaw.
Sure, users can technically try to access different workspaces, but this is not a problem if the user has already access to it, right? It all comes down to RBAC in the cluster. If the workspace has been shared with the user already, then the user already has permissions in there, so it doesn't bring any security risk. If it hasn't been shared, then the request will fail with an unauthorized error.
Let's try to keep the proxy simple - forward the request to the member cluster the workspace was provisioned in. The only check that is worth doing is the verification if the user has at least some access to the workspace (user is provisioned and there is corresponding SpaceBinding), just to be sure that it makes sense to try to forward the request to the member cluster. Let's delegate the rest of the authorization validations to RBAC.

I'm concerned that in an environment with a lot of shared workspaces, human errors or wrong configurations won't be warned nor blocked and, as a result, users could potentially harm any namespace they have access to by applying the wrong manifest. In this case debugging will be very hard.

Let's not try to fix or minimize human mistakes at the level of KubeSaw & Proxy. Let's only focus on authorization/authentication validations. Using a wrong kubeconfig, session, or a kubeconfig context are common mistakes you cannot address at the level of the proxy.

What about using the public-viewer feature for this requirement? We could create a shared workspace with these resources and flag it as community. Everyone is going to have access to the resources, we will just need to add the SpaceRole and SpaceRoleBinding for the kubesaw-authenticated user.

As I mentioned above, there are namespaces not managed by KubeSaw, and these namespaces should stay like that. Exposing some resources to the entire cluster via RBAC is pretty common thing in some operators (even though I personally don't agree with that approach). This is what Alexey meant by "shared namespaces"

@alexeykazakov
Copy link
Contributor Author

alexeykazakov commented Oct 2, 2024

Fully agree with @MatousJobanek on the proxy topic. I truly believe we should stick with our proxy providing the proxy functionality as much as possible. And not making it a RBAC enforcer. Let's leave RBAC to Kube. If RBAC is messed up on the cluster side it's not the Proxy business to fix it.

As I mentioned in the description our proxy should focus on:

  • Determine what target cluster (API server) to forward the request to
  • What user to impersonate

That's it.
It's important to make sure that proxy doesn't introduce any backdoors which allow to bypass Kube RBAC though. But this PR doesn't do this.

@alexeykazakov
Copy link
Contributor Author

alexeykazakov commented Oct 2, 2024

And regarding testing. I agree that the proxy test need some love in general (not related to this PR). But since the underlying proxy logic with all these combinations of shared/private/etc workspaces is quite complex I still think it's safer to keep the tests as is (until we do a deeper refactoring) rather than just dropping them all together or leaving only one combination in place.

@filariow
Copy link
Contributor

filariow commented Oct 2, 2024

Sure, users can technically try to access different workspaces, but this is not a problem if the user has already access to it, right?

From a UX point of view this behavior really feels strange to me.
As a user that's explicitly targeting one workspace, I'd be surprised if -just by changing the namespace- I can manipulate resources from all the namespaces I have access to in the same cluster where the targeted workspace is allocated.

Let's not try to fix or minimize human mistakes at the level of KubeSaw & Proxy. Let's only focus on authorization/authentication validations.

I agree, we shouldn't try to minimize human mistakes in KubeSaw and I agree we should defer the most of responsibilities to cluster's RBAC. However, dropping this workspace property doesn't feel right to me.
These changes also have an impact the proxy's property of transparency in a multi-cluster setup (users do not need to understand which cluster is hosting the namespace).
That would be nice to find a way of implementing this check outside the Proxy, but I can't think of any ATM.

Using a wrong kubeconfig, session, or a kubeconfig context are common mistakes

To me that's the most important reason for not dropping this constraint on Workspaces

As I mentioned above, there are namespaces not managed by KubeSaw, and these namespaces should stay like that.

Thanks for clarifying. If that's the case the public-viewer idea won't work. Maybe we can add a list of passthroughNamespaces for which the request should always be proxied? This would add even more "RBAC logic" into the Proxy, though.
Or else, we could check that the targeted namespace (in the cluster of the requested Workspace) is not provisioned by another Workspace. If it is, we can return a Bad Request/NotFound error; if it's not, we forward it and rely on cluster's RBAC. WDYT?

I truly believe we should stick with our proxy providing the proxy functionality as much as possible. And not making it a RBAC enforcer. Let's leave RBAC to Kube. If RBAC is messed up on the cluster side it's not the Proxy business to fix it.

I don't think it's an RBAC enforcer as it is implemented now, nor that it would be a question of RBAC messed up. The concept of Workspace is not taken in consideration by k8s RBAC, so enforcing Workspace boundaries via RBAC is not doable.
To me the discussion here is more about relaxing the "Workspaces boundaries property" we have right now or not.

My point is that, even though I'm not expecting these changes to have a catastrophic impact, I see they may be harmful in some scenarios.

@alexeykazakov
Copy link
Contributor Author

alexeykazakov commented Oct 2, 2024

I don't think it's an RBAC enforcer as it is implemented now, nor that it would be a question of RBAC messed up. The concept of Workspace is not taken in consideration by k8s RBAC, so enforcing Workspace boundaries via RBAC is not doable.

Currently we have a real problem. You can't access a namespace which doesn't belong to any workspace via proxy. It's not just a theoretical or potential problem. A few users already complained about this. IMO it's a significant proxy limitation.

Is the workspace context concept ideal? No, it's not. And I agree that providing a workspace context to access the namespace from another workspace or a non-workspace namespace does look weird, IMO, it's still lesser of two evils.
Basically we are trading a foolproof mechanism which force users to be more careful with workspace contexts for a significant functional improvement.

Let's get rid of this significant limitation now and work on how we could improve UX of accessing such namespaces later since it doesn't look like a trivial task.

@alexeykazakov
Copy link
Contributor Author

/retest

pkg/proxy/proxy.go Outdated Show resolved Hide resolved
Copy link
Contributor

@ranakan19 ranakan19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good
Left minor comments.
So there is no scenario where the ExpectedProxyResponseStatus is not OK, except for incorrect setup of proxy. Or maybe that there is no use case in proxy now where the expected response is forbidden, is that understanding correct?

// It's not up to the proxy to check permissions on the specific namespace.
// The target API server will reject the request if the user does not have permissions to access the namespace.
// Here the request is successful because the underlying mock target cluster API server returns OK
"plain http request as owner to namespace outside of private workspace": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused as to what does 'owner' refer to here in the test description. To me, there is not much difference in the two statements below, wrt to the test:
plain http request as owner to namespace outside of private workspace v/s
plain http request to namespace outside of private workspace.
So either I'm missing the significance of owner here and we should add it here (maybe in the comments?) or we can make the test case description simpler and remove owner.
Let me know what you think

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comments preceeding it explain it pretty well. But yeah the 'owner' thing in the test description threw me off a bit as well.

What about plain http request as permitted user to namespace outside of private workspace? Is that accurate?

Copy link
Contributor Author

@alexeykazakov alexeykazakov Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced owner by permitted user in c557b41

pkg/proxy/proxy_community_test.go Outdated Show resolved Hide resolved
Copy link

openshift-ci bot commented Oct 4, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alexeykazakov, MatousJobanek, mfrancisc, rajivnathan, ranakan19

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [MatousJobanek,alexeykazakov,mfrancisc,rajivnathan,ranakan19]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

sonarcloud bot commented Oct 29, 2024

@alexeykazakov alexeykazakov merged commit 795ca54 into codeready-toolchain:master Oct 29, 2024
9 of 11 checks passed
@alexeykazakov alexeykazakov deleted the proxy-namespace branch October 29, 2024 23:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants