-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crypto plugin abstractions, encrypted repositories and metadata mgmt #7284
Crypto plugin abstractions, encrypted repositories and metadata mgmt #7284
Conversation
8b7121c
to
ff21b8a
Compare
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
ff21b8a
to
c6d4735
Compare
Gradle Check (Jenkins) Run Completed with:
|
c6d4735
to
e9aa5e1
Compare
Gradle Check (Jenkins) Run Completed with:
|
e9aa5e1
to
0f14430
Compare
20d2261
to
fb3912b
Compare
Gradle Check (Jenkins) Run Completed with:
|
fb3912b
to
cfb2b32
Compare
Gradle Check (Jenkins) Run Completed with:
|
cfb2b32
to
c395256
Compare
Gradle Check (Jenkins) Run Completed with:
|
c395256
to
aff2fb8
Compare
Gradle Check (Jenkins) Run Completed with:
|
aff2fb8
to
6c263a6
Compare
Gradle Check (Jenkins) Run Completed with:
|
Signed-off-by: Vikas Bansal <[email protected]>
6c263a6
to
66d34f3
Compare
Gradle Check (Jenkins) Run Completed with:
|
public PutRepositoryRequest(StreamInput in) throws IOException { | ||
super(in); | ||
name = in.readString(); | ||
type = in.readString(); | ||
settings = readSettingsFromStream(in); | ||
verify = in.readBoolean(); | ||
|
||
if (in.getVersion().onOrAfter(Version.V_3_0_0)) { | ||
encrypted = in.readOptionalBoolean(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this an optional boolean? An optional boolean has three states, so what does null
mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is marked as optional to ensure backward compatibility with existing integrations. We don't want existing repository registrations to start failing from this version. We can treat a null encrypted field as a non encrypted repo. If we move it to readBoolean
then it will become a breaking change. To add more info, stream based constructors are typically used in node to node communication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional doesn't give you backwards compatibility; when invoking writeOptionalBoolean
you're still unconditionally writing a byte to the stream and the reader must unconditionally read this byte as well. The onOrAfter
version check gives you backwards compatibility. I think you can just write a regular boolean and default the field to false if not explicitly set when constructing an instance. I think you can also use a primitive boolean type in the class and avoid the null
case all together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't referring to version backwards compatibility but the ability for an existing api to work without any change with the new field introduction during let's say upgrading the cluster. Let me check this further if I can move it to non optional field without breaking anything and without enforcing the user to pass encrypted field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into this and following are the two cases :
- We keep encrypted field
Boolean
and usewriteBoolean
orreadBoolean
methods instead of their optional counterparts. In this case lets say you have 2 nodes cluster. You trigger a put repository request on non active cluster manager node (say node2) without encrypted field. Node2 will try to form the request for cluster manager node usingwriteBoolean
and it will fail with NPE as it expects a non-null boolean value. If we had used an optional Boolean then it would have done a null check first, written a byte value of 2 and sent request to cluster manager node. On cluster manager node, again it would require optional method to read since it needs to handle the possibility of null value. Basically, to keep a non-primitive field and to use non-optional methods, users will have to change their contracts. - We keep primitive version this field. If you do this then you don't have a null case anymore which means that in all the get calls users will start seeing encrypted field. Again, if there are strict contracts on user side to read these responses, then they will start failing as well because of the unknown field. By keeping non-primitive fields users can selectively change contracts for encryption and other users continue to stay unchanged.
Correct me if I am missing something here which you are trying to point!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, you're right that in order to preserve the "not set" state in the repository settings then we'll need the optional boolean to communicate that third possible state.
if (repository.getMetadata() != null | ||
&& (repository.getMetadata().encrypted() != null || repository.getMetadata().cryptoMetadata() != null)) { | ||
listener.onFailure( | ||
new RepositoryException(repository.getMetadata().name(), "Snapshot creation for an encrypted repository is not supported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit strange to me. This PR introduces client side encryption as a property of the repository, so shouldn't everything that interacts with a repository (via the repository interface) use that encryption? Looking at the next PR (#7337), the logic for using the CryptoPlugin is baked into the remote store business logic, which means that all features using the repository interface will have to do the same integration. It also looks like data is conditionally encrypted in that PR (metadata is left in plain text). That also seems strange given that encryption is a property of the repository. Shouldn't all data in an "encrypted" repository be encrypted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, with the approach shown in #7337, I think "encryption" there is a property of the remote store feature. Then it could make sense that the remote store feature would decide which data is encrypted and which is left in plain text. If encryption is a property of the repository, then I think all data in that repository should be encrypted, and also the encryption should be transparent to all users of that repository (i.e. baked into the implementation of the repository interface).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- In this PR, we don't intend to enable encryption for a snapshot repository. To avoid giving an impression of an encrypted repository during a snapshot repo registration, this validation is explicitly failing such request. We will plan to put this into OS documentation as well. Enabling encryption for snapshot repos will require decent amount of changes. We can surely take it up as a separate PR though.
- Encryption is a heavy compute operation and has a direct impact on the overall performance. Metadata encryption was intentionally left out for this reason. In fact, I was wondering if we can skip checkpoint encryption as well. Customer data is only exposed in translog and segment files so it makes sense to secure those instead of encrypting everything. Basically, we will take a good hit on performance if we want to ensure transparency. We can probably add this in our OS documentation as well to make it known what exactly is being encrypted (secured).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why I didn't put it under remote_store block is because enabling encryption requires few key provider settings. Crypto clients are created using those settings. Now, if we move it to remote_store block, every time we create an index, we have to explicitly pass those settings. Unless we have a use case to encrypt one index on one repo and keep another index in plain text on same repo, this seems a duplicated effort. If user wants to reuse same s3 bucket for instance, we anyways have this abstraction of repository where we can create a new repo out of the same bucket.
So, now the only option left for us is to do it on a repository. Otherwise we have to expose another REST api to configure encryption and then use its identifier/name during index creation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tagging @Bukhtawar @sachinpkale @gbbafna as well for their views on point 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The decision is data driven only. We have already established that encryption takes huge amount of CPU. Adding more such compute will naturally affect performance. The path we should in fact take is to minimize these computes. We have carried out multiple perf runs and scenarios where CPU is already at its peak, adding any compute had a visible impact on throughput.
Point of adding "encrypted" at repository level is exactly that. Every feature using repository must check if the repository is encrypted or not. Having said that, today we don't have encryption enabled snapshots. With this we will have a way to build or forced to build encryption on top of snapshots as well. Repository is an abstraction and remote store and snapshots are two of its stakeholders. Adding encryption context on top of repository makes sure that any new feature building on top of repository makes sure that it accounts for encryption and if it doesn't support it then it clearly calls out that it produces insecure data. The point of argument here is that whether we should encrypt entire data in specific case of remote store.
I believe that we should be highly cognizant when we think of encrypting data. There are specialized hardware instances available just to minimize the impact of encryption.
We can think of it like this that when a repository used in conjunction with remote store, encrypts only user data. Snapshot has its own way of using a repository. When a file exceeds certain capacity, it doesn't maintain a single file on repository and instead maintains smaller blocks of it. So, the only guy who can make sense of these files is snapshot. Similarly, remote store can take the decision of what should be encrypted.
Why should encryption affect the behavior of BlobContainer? Both are two completely independent tasks. You can take a look at the second PR. There is no change in BlobContainer interface. As proposed in RFC also, I have built it as a loosely coupled module which can be tied to any type of content. It is not just limited to repository features. We are starting to use it for repository but it can be extended to any other content type as well.
Proposal I am looking for is a place to maintain the metadata for encryption. Also, this metadata will go in cluster state so wherever you propose to put it, should have a respective cluster manager action (existing or new) responsible for maintaining it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm advocating for implementing client-side repository encryption at the repository layer so that features that use the repository do not need to change based on whether the repository is using client-side encryption or not. I'm not suggesting that we change any interfaces or tightly couple the encryption logic to any repository implementation. I believe this could be implemented as wrapper of BlobStore and/or BlobContainer that does the encryption and decryption logic before delegating to a concrete repository implementation. The only reason I can see for not doing that is so that features can exempt certain files from encryption for performance reasons. I don't believe it is sufficient to make this call by saying that encryption is CPU intensive. It matters how much this impacts performance. How much overhead does encrypting the metadata add?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason I can see for not doing that is so that features can exempt certain files from encryption for performance reasons.
Just want to make sure my assumption here is correct. Is exempting metadata from encryption the only reason not to make encryption transparent to the users of the repository layer? PR #7337 shows all the undifferentiated work that each feature will have to do regarding wiring the CryptoClient, creating the encrypting stream, etc and it just seems to me that it would be a big win to do this once instead of per-feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good analogy to this is dm-crypt
that encrypts the block device and all files using the device get to be encrypted as against fs-crypt
that allows encryption at a filesystem level meaning each file can be encrypted differently and also choose not be encrypted at all.
So, now we can argue in favour of more flexibility and control, to leave the choice to the consumers of the repo, however we also need to understand if that is something that is needed now or can we build it later based on specific use cases, while keeping the contracts open. As @andrross pointed, this doesn't come for free, the changes on the integration PRs are too deep integrations and would need significant effort for consumers to integrate. Having said that if you can ensure that the integrations are light-weight and the new interfaces do most of the heavy lifting, I would still be glad considering moving on this approach. Essentially what I am looking for is moving what we call the CryptoClient
to still be bootstrapped in the repository module/service. Then at the BlobContainer
provide mechanisms to encypt and controls to opt-out as needed.
Having said that, I usually would advocate for simplicity and maintainability wherever possible. I am not sure if security/compliance needs metadata to be encrypted as well. Given metadata is usually really small, I doubt that there would be a substantial CPU implication. Even if there is 1-2% moderate CPU increase, it might still be worth the encryption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can run some perfs to record the difference between selective encryption and repo level encryption. Depending upon the results we can take a call.
Then at the BlobContainer provide mechanisms to encypt and controls to opt-out as needed.
I would like to bring into notice few things so that we can make an informed decision. Repository and encryption have very different behaviors. For e.g. methods like delete, listBlobs, blobExists and similar methods have no relation with encryption. Also, read and writes cannot be used as they are and would require an offset adjustment and file header reads for ops acting on partial content. On top of that, it would require a completely new plugin to be built (encrypted repo plugin) if we push it at repository level. This is because the current layout of repository implementation is RepositoryPlugin -> Repository -> BlobStore -> BlobContainer. For adding crypto around BlobContainer, it would take a completely new delegation layer right to the top till RepositoryPlugin which will force us to build a new plugin unless we think of a different repository design. Please note that the new plugin we are currently building here has no relation with repository as of now. Check this PR.
If we are saying that when we talk about data in OpenSearch, we are always going to associate it with repository then I too agree with the whole approach and in this case we will have to modify existing repository abstractions to accommodate partial encryption and decryption use cases. Otherwise, we will not be able to account for non-repository encryption use cases. For e.g., if tomorrow we want to provide encryption and decryption support for local indexes as well then we will need a non-repo flavor of crypto. This means we would need both encrypted repository plugin to support repository encryption and a separate crypto plugin to support local index encryption. Crypto plugin can probably be forked out for reusability and encrypted repository plugin can depend on crypto plugin. Again, I still think that we are trying to fit two incompatible pieces together where repository is a storage layer and encryption is a transformation layer.
Still, keeping an open mind to hear your thoughts around this. Nevertheless, I think PRs #8465 and #8466 would still be required irrespective of any route we take.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR introduces client side encryption as a property of the repository
Enabling encryption for snapshot repos will require decent amount of changes. We can surely take it up as a separate PR though.
+1 what @andrross said. In general the whole design is confusing. The feature proposal and design proposal work backwards from" ...an encryption plugin which can provide capabilities to support encryption/decryption". But this PR is all about encrypting repository data.
I honestly can't grok what the overall objective is? It seems you are focused on encryption at rest? If so. What files are you wanting to encrypt? (feel free to enumerate here). Are you wanting encryption over the wire as well? This PR is just focuses on snapshots, yet it's introducing an entire plugin abstraction framework without a concrete application to anything other than snapshot files and no clear use cases or examples how a downstream plugin leverages this framework. Why introduce a whole new CryptoPlugin
and not just use SPI w/ a simple library interface for downstreams to define their encryption mechanism? It seems that makes more sense if the objective is to enable downstreams to have different encryption strategies.
/** | ||
* Crypto plugin interface used for encryption and decryption. | ||
*/ | ||
public interface CryptoClient extends RefCounted { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange naming. This isn't a client. It's an interface to provide concrete stream encryption methods? Why not put this in a :libs:opensearch-security
library instead of server?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will rename this! Implementation is there in libs, I can move this as well if it makes more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel more like a CryptoProvider
* @param repositoryMetadata repository metadata for which crypto client needs to be returned. | ||
* @return crypto client | ||
*/ | ||
public CryptoClient cryptoClient(RepositoryMetadata repositoryMetadata) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method isn't used anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is being used in other integration PR. Let me move this over there.
From this comment:
@reta would Aiven consider promoting encrypted-repository-opensearch as a core plugin to offer index/repository encryption as a default security option? If so, then I'd like to see us improve that to fill some of the random access encryption gaps needed by this proposal guided by the coming "OnThe Fly Encryption Feature Proposal" and PoC in #3469. Perhaps this PR just gets trimmed to providing the key registry? |
@nknize absolutely
Makes perfect sense, I will bring that to the team, thank you! |
@nknize Thanks for the review! If you look at other PRs, there is no context of snapshot, remote store or s3 plugin. It also provides the Structure of the plugin is as follows :
Overall objective it to provide encryption/decryption support irrespective of the source and target location of the content. Repository encryption is just one of the use cases.
For remote backed indexes, plan is to encrypt only the user data - translog and segment files.
No, objective of this is to provide encryption of data at rest. I don't think encrypted-repository-opensearch can be extended as it is tightly coupled with repository plugin and doesn't provide partial encryption or decryption support. If we want to extend support for local disk as well, then we should be able to do that as well with current framework. |
This is the spi which downstream plugin will implement to provide custom key providers. |
Closing this as it has been crossed out on the meta issue. @vikasvb90 Please reopen if this shouldn't be closed. |
Description
This change is to add crypto plugin abstractions, add required fieds in put repository request and validations for the same and cluster state changes to manage crypto metadata against a repository.
Issues Resolved
This is one of the tasks of meta issue #7229
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.