Add row-level security filter in query #17564
public class RestrictedDataSourceTest
{
  @Rule
  public ExpectedException expectedException = ExpectedException.none();
Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation (Note, test): ExpectedException.none
…it as the default interface for checking permission
    resourceAction,
    authorizerMapper
);
if (!authResult.isAllowed()) {

authResult.getPermissionErrorMessage(true).ifPresent(error -> {
Check failure (Code scanning / CodeQL): User-controlled bypass of sensitive method (High): this condition, user-controlled value
This review includes some high level design comments.
I haven't checked the tests yet (will do so in a follow up review). The things I'd be looking for in the tests are:
- we should have tests for the resources, including negative tests for resources (such as the Dart resource) that don't support policies yet. The negative tests should verify that we get an error like "this endpoint doesn't support policies".
- we should also have tests for the lower level pieces like `QueryLifecycle`, the `DataSource` mapping, and `RestrictedSegment`.
server/src/main/java/org/apache/druid/server/security/AuthorizationResult.java
   * @return authorization result
   */
-  public Access authorize(HttpServletRequest req)
+  public AuthorizationResult authorize(HttpServletRequest req)
Because `AuthorizationResult` includes policies, this method signature makes it ambiguous as to whether the caller should apply policies.
In this case, the `QueryLifecycle` itself applies the policies, and the caller therefore doesn't need to. We should make that clear somehow. Javadoc could do it, or perhaps returning something other than `AuthorizationResult`.
Updated the javadoc here to indicate query/datasource is transformed here
import java.util.Set;

/**
 * Static utility functions for performing authorization checks.
 */
public class AuthorizationUtils
{
  static final ImmutableSet<String> RESTRICTION_APPLICABLE_RESOURCE_TYPES = ImmutableSet.of(
      ResourceType.DATASOURCE,
      ResourceType.VIEW
IMO, we should do datasource-only for now, since `VIEW` would be another bundle of stuff to think about: views are resolved in the SQL planner, so the restrictions would need to be applied in a different place.
This does bring up a question about the model though. If a user has restricted access to a `DATASOURCE`, should those restrictions apply when the datasource is accessed through a SQL view? My stance is "yes" and I think the way we're doing it will achieve that. Please include some tests for this just to be sure it works as expected.
Actually, that's the opposite of the status quo; see the `restrictedView` Calcite tests. The view is created on `forbiddenDatasource`, which users don't have access to, but they can query the restrictedView.
  private final boolean allowed;
  private final String message;
  // A row-level policy filter on top of table-level read access. It should be empty if there are no policy restrictions
  // or if access is requested for an action other than reading the table.
  private final Optional<DimFilter> rowFilter;
I feel like it would be good to keep the value as some more general class, like `Policies` rather than `Optional<DimFilter>`, so when we want to add other kinds of policies (such as column restrictions, possibly) they can fit right in without further changes to the `Access` object.
Added a `Policy` class, please review.
server/src/main/java/org/apache/druid/server/security/Access.java
processing/src/main/java/org/apache/druid/query/RestrictedDataSource.java
   * @param rowFilters a mapping of table names to row filters, every table in the datasource tree must have an entry
   * @return the updated datasource, with restrictions applied in the datasource tree
   */
  default DataSource mapWithRestriction(Map<String, Optional<DimFilter>> rowFilters, boolean enableStrictPolicyCheck)
We should design the strict check a bit differently. With this current design, a single literally TRUE filter would pass, but we don't want that. I think it would be better for this to not take `enableStrictPolicyCheck`, but instead for the strict check to happen in `QueryLifecycle` after the query is mapped. That would enable the check to be even stricter: it should really check not just that there is a filter, but also that the filter is actually doing something. To allow for the `druid_internal` or `admin` case, we can bypass the strict check if the user has permission for `STATE READ` (a broad administrative permission).
> To allow for the druid_internal or admin case, we can bypass the strict check if the user has permission for STATE READ (a broad administrative permission).

Actually upon further reflection this seems too complex. We don't want to have to consider both policies and STATE permissions. Instead, let's introduce a `Policy` that is of type `admin`. It doesn't apply any restrictions, but it's something an authorizer can return to signify that the user is OK to query unrestricted.
Btw, the strict check in `QueryLifecycle` would need to happen even if the authorizer returns `ALLOW`. (Strict check should fail in this case)
Yes, the strict check should happen. Btw, I added this flag because some view tests are failing, which I assume won't be easy to fix.
In theory, if there were no views, we could default to the strict check; it just needs every table to have an entry in policyMap (which could be `Optional.empty()` if the authorizer says there's no policy).
Ah, I see. The strict check should be even stricter: there should be a mode that requires all authorization results to have some non-empty set of policies. The idea with that check is it's a defense against the authorizer being mis-configured in such a way that policies aren't being reported properly.
Ideally, if we had a PolicyConfig class similar to AuthConfig, or put this in a policy context, it could be more flexible.
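The strict check discussed in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `Policy`, `Kind`, `validate`, and the string-based filter are hypothetical stand-ins, and detecting a trivial filter by comparing against the literal `TRUE` is a simplification.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; these are not Druid's real classes.
final class StrictPolicyCheckSketch
{
  /** A policy is either a row filter or an explicit "this user may query unrestricted" grant. */
  enum Kind { ROW_FILTER, NO_RESTRICTION }

  static final class Policy
  {
    final Kind kind;
    final String filter; // null when kind is NO_RESTRICTION

    private Policy(Kind kind, String filter)
    {
      this.kind = kind;
      this.filter = filter;
    }

    static Policy rowFilter(String filter)
    {
      return new Policy(Kind.ROW_FILTER, filter);
    }

    static Policy noRestriction()
    {
      return new Policy(Kind.NO_RESTRICTION, null);
    }

    /** A literal TRUE filter restricts nothing, so the strict check rejects it. */
    boolean passesStrictCheck()
    {
      return kind == Kind.NO_RESTRICTION || !"TRUE".equalsIgnoreCase(filter.trim());
    }
  }

  /**
   * Every table must have an entry. In strict mode the entry must also hold a
   * policy that either restricts rows or explicitly grants unrestricted access.
   */
  static void validate(Set<String> tables, Map<String, Optional<Policy>> policies, boolean strict)
  {
    for (String table : tables) {
      Optional<Policy> policy = policies.get(table);
      if (policy == null) {
        throw new IllegalStateException("No policy entry for table [" + table + "]");
      }
      if (strict && (!policy.isPresent() || !policy.get().passesStrictCheck())) {
        throw new IllegalStateException("Table [" + table + "] has no effective policy");
      }
    }
  }

  public static void main(String[] args)
  {
    Map<String, Optional<Policy>> policies = Map.of(
        "orders", Optional.of(Policy.rowFilter("company = 'acme'")),
        "internal_metrics", Optional.of(Policy.noRestriction())
    );
    validate(policies.keySet(), policies, true);
    System.out.println("strict check passed");
  }
}
```

Running the check after the query has been mapped, rather than inside the datasource mapping, keeps the `DataSource` interface free of the strictness flag, which is the direction suggested above.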
   * Returns an updated datasource based on the policy restrictions on tables. If this datasource contains no table, no
   * changes should occur.
   *
   * @param rowFilters a mapping of table names to row filters, every table in the datasource tree must have an entry
This should take `Map<String, Policies>` instead, so other types of policies can be applied in the future without changing the `DataSource` interface.
It takes `Map<String, Optional>` now. Since a policy is optional for Druid tables, people can configure any (or all) tables to be policy-restricted. It's a single policy, since the policy is returned from the authorizer as a merged result (of a policy rule or a policy template, which could be in a serialized format and support CRUD and so on).
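To make the datasource mapping in this thread concrete, here is a minimal sketch of walking a datasource tree and wrapping each table that has a policy. All names (`Source`, `Table`, `Restricted`, `Join`) are hypothetical stand-ins rather than Druid's classes, and a policy is modeled as a plain filter string.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; these are not Druid's real classes.
final class DataSourceTreeSketch
{
  interface Source
  {
    /** Returns a copy of this tree with policy-restricted tables wrapped. */
    Source mapWithRestriction(Map<String, Optional<String>> policies);

    String describe();
  }

  static final class Table implements Source
  {
    final String name;

    Table(String name)
    {
      this.name = name;
    }

    @Override
    public Source mapWithRestriction(Map<String, Optional<String>> policies)
    {
      Optional<String> policy = policies.get(name);
      if (policy == null) {
        // Every table in the tree must have an entry, even if it is empty.
        throw new IllegalStateException("Missing policy entry for table [" + name + "]");
      }
      if (policy.isPresent()) {
        return new Restricted(this, policy.get());
      }
      return this;
    }

    @Override
    public String describe()
    {
      return name;
    }
  }

  static final class Restricted implements Source
  {
    final Table base;
    final String filter;

    Restricted(Table base, String filter)
    {
      this.base = base;
      this.filter = filter;
    }

    @Override
    public Source mapWithRestriction(Map<String, Optional<String>> policies)
    {
      // Assumption for this sketch: an already restricted table is left as-is.
      return this;
    }

    @Override
    public String describe()
    {
      return "restrict(" + base.describe() + ", " + filter + ")";
    }
  }

  static final class Join implements Source
  {
    final Source left;
    final Source right;

    Join(Source left, Source right)
    {
      this.left = left;
      this.right = right;
    }

    @Override
    public Source mapWithRestriction(Map<String, Optional<String>> policies)
    {
      return new Join(left.mapWithRestriction(policies), right.mapWithRestriction(policies));
    }

    @Override
    public String describe()
    {
      return "join(" + left.describe() + ", " + right.describe() + ")";
    }
  }

  public static void main(String[] args)
  {
    Source query = new Join(new Table("orders"), new Table("countries"));
    Map<String, Optional<String>> policies = Map.of(
        "orders", Optional.of("company = 'acme'"),
        "countries", Optional.empty()
    );
    // prints: join(restrict(orders, company = 'acme'), countries)
    System.out.println(query.mapWithRestriction(policies).describe());
  }
}
```

Keeping the map value as an opaque policy type, rather than a filter, is what would let other kinds of policy be added later without touching the interface.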
import javax.annotation.Nullable;

public class RestrictedSegment extends WrappedSegmentReference
Please include javadoc describing the purpose of this class, and how its defensive mechanisms work.
The way they should work is something like: if you call `asCursorFactory` or `as(CursorFactory.class)` (plus perhaps some other small list), restrictions are handled automatically. But if you call `asQueryableIndex` or `as` for something other than that small list, the query gets the unrestricted internal object and it needs to call some method on the `RestrictedSegment` confirming that it applied the restrictions on its own.
> needs to call some method on the RestrictedSegment confirming that it applied the restrictions on its own

To make this robust, we should actually have a 1-1 relationship between `as` calls and "i know what i'm doing" calls. If a query engine calls `as` three different times it should call "i know what i'm doing" three times.
…AuthorizationResult class, dart sql, msq sql, fix bug, added restricted data source to calcite test data
…rityLevel enum in Policy class, updated a bunch of tests
…internal won't be restricted
    if (!(base instanceof TableDataSource)) {
      throw new IAE("Expected a TableDataSource, got [%s]", base.getClass());
    }
    if (Objects.isNull(policy)) {
      throw new IAE("Policy can't be null for RestrictedDataSource");
Passer-by nits:
- It is better to use DruidExceptions instead of IAE/ISE. That would inform the readers as to whom the error message is intended for. It also guides the wording of the error message. Currently, it is not actionable. It is fine if the check is defensive, but if it is a user-facing error it would be nice to word it in a way that's easier for users to understand/take action upon.
- The interpolations `[]` can be worded in a different manner, so that the exception is still understandable even if the interpolated phrases are removed, like: "Incorrect datasource [%s] received, expected a TableDataSource"
This PR adds the ability to attach row filters to a query, thereby restricting row-level data access for given users.
Description
A query follows these steps: initialize -> authorize -> execute. In the authorize step, the permissions are checked for all the required resources in the query. Before this PR, the authorize step only returned allow or deny for access to a table. Granting access to a table meant a user could see all data in that table. After this PR, the authorize step can return allow along with restrictions (i.e. a row filter that must be applied to the table), which restrict users' data access at the row level. For example, customers can only see rows relevant to their company.
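The authorize-then-execute flow described above can be illustrated with a small, self-contained sketch. It is not the implementation in this PR: `AuthOutcome`, `authorize`, and `execute` are hypothetical names, rows are modeled as plain maps, and the row filter is a `Predicate` rather than a `DimFilter`.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch; these are not Druid's real classes.
final class RowLevelAuthSketch
{
  /** Outcome of the authorize step: allow or deny, plus an optional row filter. */
  static final class AuthOutcome
  {
    final boolean allowed;
    final Optional<Predicate<Map<String, String>>> rowFilter;

    AuthOutcome(boolean allowed, Optional<Predicate<Map<String, String>>> rowFilter)
    {
      this.allowed = allowed;
      this.rowFilter = rowFilter;
    }
  }

  /** Authorize step: a known company may read the table, restricted to its own rows. */
  static AuthOutcome authorize(String userCompany)
  {
    if (userCompany == null) {
      return new AuthOutcome(false, Optional.empty());
    }
    Predicate<Map<String, String>> ownRows = row -> userCompany.equals(row.get("company"));
    return new AuthOutcome(true, Optional.of(ownRows));
  }

  /** Execute step: rows are returned only if access is allowed and the filter, if any, matches. */
  static List<Map<String, String>> execute(AuthOutcome outcome, List<Map<String, String>> table)
  {
    if (!outcome.allowed) {
      throw new SecurityException("Unauthorized");
    }
    Predicate<Map<String, String>> filter = outcome.rowFilter.orElse(row -> true);
    return table.stream().filter(filter).collect(Collectors.toList());
  }

  public static void main(String[] args)
  {
    List<Map<String, String>> table = List.of(
        Map.of("company", "acme", "value", "1"),
        Map.of("company", "globex", "value", "2")
    );
    // Only the row belonging to "acme" is visible to this user.
    System.out.println(execute(authorize("acme"), table));
  }
}
```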
The `authorizeAllResourceActions` method now returns an `AuthorizationResult` instead of `Access`; this class also replaces the `DruidPlanner.AuthResult` class. The main difference between `AuthorizationResult` and `Access` is that the former contains a map of tables to `DimFilter`. It can also have `ResourceAction` Iterables, which DruidPlanner cares about.

The authorize step of `QueryLifecycle` enforces the filters on tables in the datasource tree, transforming `TableDataSource` to `RestrictedDataSource`. In the execute step, filters are applied through `RestrictedSegment` and `RestrictedCursorFactory`.

Key changed/added classes in this PR

- `AuthorizationResult`. The class should be used for all the authorization calls, while the `Access` class is still used in the `Authorizer` interface. It has a static variable `ALLOW_ALL`, which should be used for all internal calls. `getPermissionErrorMessage(boolean policyFilterNotPermitted)` is called to get a failure message, which replaces `access.toString()`, `access.toMessage()`, and `access.getMessage()`. The class contains:
- `Access`. Added an `Optional<DimFilter> rowFilter` field, which represents a restriction returned from the `authorizer`. Also updated the constructor.
- `AbstractStatement`. Replaced `DruidPlanner.AuthResult` with `AuthorizationResult`.
- `AuthConfig`. Added a flag `enableStrictPolicyCheck`. When enabled, it checks that every table has a restriction in place, meaning it has an entry in the restrictions map (which could be `Optional.empty()`).
- `AuthorizationUtils`. It now consolidates all restrictions for authorizing resource actions into a restrictions map, which is included in `AuthorizationResult`. Also updated javadoc for all public methods.
- `RestrictedDataSource`, which wraps a `TableDataSource` with a DimFilter. If the filter is null, no filter is applied.
- `RestrictedSegment`, which represents a segment with a filter.
- `RestrictedCursorFactory`, which can be created by `RestrictedSegment.asCursorFactory` and enforces the DimFilter on `Cursor`.
- `DataSource` interface: added a sub type of `restrict` and a default method `mapWithRestriction`.
- `TableDataSource`: added the impl of `mapWithRestriction`.
- `JoinDataSource` can accept `RestrictedDataSource` as the left-hand side datasource.
- `Query` interface: added a default method `withPolicyRestrictions`.
- `SegmentMetadataQuery`: added the impl of `withDataSource`.
- `QueryLifecycle`: replaces `baseQuery` with `baseQuery.withPolicyRestrictions` if the authorization result is not `ALLOW_ALL` (calls from internal services).

Caveats

- `UnionDataSource` doesn't work with `RestrictedDataSource`; planning to fix that later.
- `DartQueryMaker` and `MSQTaskQueryMaker`: for now they throw an error if there are any policy restrictions.

This PR has: