-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize the Fls/Dls/FieldMasking data structure to only include the concrete indices from the current request #4706
Conversation
Signed-off-by: Craig Perkins <[email protected]>
… not fully resolve to concrete index for datastream Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
@@ -309,7 +309,7 @@ private void resolveIndexPatterns( | |||
final boolean isDebugEnabled = log.isDebugEnabled(); | |||
try { | |||
matchingAllIndices = Arrays.asList( | |||
resolver.concreteIndexNames(state, indicesOptions, localRequestedPatterns.toArray(new String[0])) | |||
resolver.concreteIndexNames(state, indicesOptions, true, localRequestedPatterns.toArray(new String[0])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is needed to ensure that the concrete index corresponding to any data streams in the request is resolved to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also want to set this as true in line 489/is there a reason this was not passed as true before?
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4706 +/- ##
==========================================
+ Coverage 65.20% 65.56% +0.35%
==========================================
Files 318 319 +1
Lines 22327 22457 +130
Branches 3591 3602 +11
==========================================
+ Hits 14558 14723 +165
+ Misses 5971 5924 -47
- Partials 1798 1810 +12
|
Thanks for the changes @cwperks ! I think this should help reduce consumed memory per request. @nibix in #3870 has proposed a different data-structure (pre-computed) per user and hence this computation would not be required per request. Also, we need to discuss special handling for requests with all indices wildcard - Can you check #3870 and if DLS/FLS changes can be tested? Can we run some micro-benchmark to confirm which is better? |
I think the changes in this PR and the pre-computed data structure can both provide performance benefits to the FLS/DLS/Field Masking implementation. The primary gain from this PR is changing the data structure that is set in the ThreadContext headers here. Instead of passing the full data structure (which can contain indices that are not in the request) it sets only the portion of the Fls/Dls/Field Masking data structure that is pertinent to the request. If we can also avoid computing the full EvaluatedDlsFlsConfig on every request, then that would be another good performance boost. |
How does this affect requests that do not carry index information (e.g. the scroll request)? So far, I believed we computed permissions for all indices because there are a couple of operations where you cannot tell on the |
threadContext.getHeader(ConfigConstants.OPENDISTRO_SECURITY_FLS_FIELDS_HEADER), | ||
threadContext.getTransient(ConfigConstants.USE_JDK_SERIALIZATION) | ||
); | ||
if (!isSubMap(flsFields, deserializedMap)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for clusters with mixed OpenSearch versions, right?
I think there's a catch: This will work in the case that a request first hits a node with an old OpenSearch version which then gets forwarded to a node with a newer OpenSearch version.
However, in the reverse case, there will be trouble: If a request first hits an OpenSearch version with this change, this node will calculate the filtered map and pass it on. If the request is then passed on to an older OpenSearch version, it will calculate a different map and throw an exception.
A different perspective: TBH, it is totally unclear to me what's actually the purpose of this equality check. So far, I could not think of any real purpose. I would have the tendency to say that this check could be removed alltogether. (Still, this would not fix the mixed cluster issue directly). That would also help performance as it avoids the back and forth serialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a check for min node version in. If the cluster has nodes <= 2.17 it will place in the full map as expected. If the min version is 2.18 or beyond it will place the filtered map in as headers.
Signed-off-by: Craig Perkins <[email protected]>
I added a test with scroll, by the time that Here is some of the debugging info I captured from the test:
I'm currently looking more into when |
…fore Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Closing in favor of #4380 which introduces a new DLS implementation and will not rely on the current per-request Data Structure for FLS/DLS/Field Masking implementation. |
Description
This PR is an optimization for DLS/FLS/Field Masking implementation in the security plugin. Currently, the security plugin computes a data structure called EvaluatedDlsFlsConfig which is a mapping of all concrete indices -> Set dls/fls/fieldmasking rules that are applicable to the concrete index.
The keys in this map are all of the concrete indices that can be resolved to from roles mapped to the user. For instance, if a user is mapped to a role that has
index_patterns: '*'
, then this data structure will always include all concrete indices in the cluster on every request, regardless if the request is only on a small set of indices.This PR optimizes this to include filtering. It takes the existing data structure, but filters it down to the concrete indices from the request. So if a search request is on a single index, this data structure will have a single key with the Dls/Fls/Field Masking rules that apply to that single index.
Example of the data structure
Enhancement
Issues Resolved
Related to: #4031
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.