-
Notifications
You must be signed in to change notification settings - Fork 282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Collections.synchronizedSet and Collections.synchronizedMap for r… #1970
Use Collections.synchronizedSet and Collections.synchronizedMap for r… #1970
Conversation
…oles, securityRoles and attributes in User Signed-off-by: Craig Perkins <[email protected]>
Good find @cwperks !! |
Codecov Report
@@ Coverage Diff @@
## main #1970 +/- ##
============================================
+ Coverage 61.04% 61.05% +0.01%
- Complexity 3233 3234 +1
============================================
Files 256 256
Lines 18085 18085
Branches 3222 3222
============================================
+ Hits 11040 11042 +2
+ Misses 5470 5467 -3
- Partials 1575 1576 +1
Help us with your feedback. Take ten seconds to tell us how you rate us. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These kinds of bugs are the worst thank you for digging in on this issue.
As there was a provable (albeit complex) reproduction shouldn't we be able to create a test case? I spent some time trying to author a test that would create same state as the much more convoluted repro to no avail. Any other test that involves more systems running in concern - concurrently means that CI wouldn't be as reliable if its running at lower clock speed that dev boxes.
What about updating Base64Helper to ensure that collections objects are synchronized? We cannot easily do this with reflection as the classes do not implement a clear interface, and the definitions themselves are private, see Collections.java#L2005
I'd recommend we take the fix as is, if other folks have ideas on how we can test it I would be interested.
@peternied I'm willing to test on my prod cluster if you can provide build artifacts. |
@msoler8785 Thanks for the generous offer. Unfortunately due to version number alignment you can only install plugins with the same version number as OpenSearch. We don't have a clean way to produce a build pointing to an older number. The next scheduled release is OpenSearch 2.2.0 which will include this fix. According to the schedule there should be a viable built by Aug 4th if not sooner. At your own risk; For a more immediate deployment, one could build the security plugin by checking out the version number of this plugin that corresponds to the branch of your cluster, cherry-pick this change, |
@peternied thanks to your guidance I was able to get the plugin built and installed on all my cluster nodes. Initial testing looks good and I am no longer encountering those errors. I'll review over the weekend and update if anything changes. Thanks @cwperks! |
@opensearch-project/security I'd like to see this merged for |
I agree |
…oles, securityRoles and attributes in User (#1970) Signed-off-by: Craig Perkins <[email protected]> (cherry picked from commit 50a94b4)
…oles, securityRoles and attributes in User (#1970) (#1983) Signed-off-by: Craig Perkins <[email protected]> (cherry picked from commit 50a94b4) Co-authored-by: Craig Perkins <[email protected]>
…oles, securityRoles and attributes in User (opensearch-project#1970) Signed-off-by: Craig Perkins <[email protected]>
…oles, securityRoles and attributes in User (opensearch-project#1970) Signed-off-by: Craig Perkins <[email protected]>
…oles, securityRoles and attributes in User (opensearch-project#1970) Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: Stephen Crawford <[email protected]>
…oles, securityRoles and attributes in User (opensearch-project#1970) (opensearch-project#1983) Signed-off-by: Craig Perkins <[email protected]> (cherry picked from commit 50a94b4) Co-authored-by: Craig Perkins <[email protected]>
Signed-off-by: Stephen Crawford <[email protected]>
#3725) ### Description This change backports #1970 in order to fix the OptionalDataException issue encountered on the 1.3.x line. The original change did not have any tests but #3637 added tests, so I backported those as well and the changes required to support them. This resulted in changes to the TestSecurityConfig.User class, the addition of a IndexStateIsEqualToMatcher and User.java. ### Issues Resolved - Resolves #3531 - Backports #1970 - Backports #3637 ### Check List - [X] New functionality includes testing - [ ] ~New functionality has been documented~ - [x] Commits are signed per the DCO using --signoff By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check [here](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin). --------- Signed-off-by: Stephen Crawford <[email protected]> Signed-off-by: Stephen Crawford <[email protected]> Signed-off-by: Peter Nied <[email protected]> Co-authored-by: Peter Nied <[email protected]>
…oles, securityRoles and attributes in User
Signed-off-by: Craig Perkins [email protected]
Description
Creating a draft PR to solicit feedback on a potential fix for #1961 and #1927.
Users are reporting that when the security cache expires that an intermittent
java.io.OptionalDataException
occurs.The error is thrown while trying to
readObject()
from aSafeObjectInputStream
while deserializinggetThreadContext().getHeader(ConfigConstants.OPENDISTRO_SECURITY_USER_HEADER)
.I got the idea for this fix from here and here. Since the error was occurring on deserialization of the User object, I turned the
HashSet
andHashMap
members ofUser
into their correspondingCollections.synchronizedSet
andCollections.synchronizedMap
respectively.Automated tests for this issue are being written.
Bug Fix
Old behavior:
java.io.OptionalDataException
exception is intermittently thrown on expiry ofplugins.security.cache.ttl_minutes
New behavior: No exceptions thrown, bulk api does not throw exception at the expiration of the security cache
Issues Resolved
Testing
Testing was performed by following the steps outlined here: #1961 (comment)
Before the change: Cluster would throw the exception within a minute
After the change: Cluster is stable and running without runtime exception for a long time
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.