Add first class support for client-side encryption of repositories #5800
Hi @nknize, @andrross, and @reta. I'm the author of the encrypted repository plugin discussed in this issue. All repository plugins, as you all know, just work with stream chunks, which the OpenSearch kernel sends to 3rd-party systems such as HDFS, file systems, clouds, etc. To avoid each plugin configuring encryption differently, I suggest using the KEK/DEK approach (sketched below).

What is DEK/KEK
- Data Encryption Key (DEK)
- Key Encryption Key (KEK)

KEK management rules

DEK management rules
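To make the KEK/DEK flow concrete, here is a minimal hedged sketch of envelope encryption. `RemoteKms` is a hypothetical stand-in for a 3rd-party KMS holding the KEK, and the chunk layout is illustrative only:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;

// Hypothetical stand-in for a 3rd-party KMS that holds the KEK.
interface RemoteKms {
    byte[] encrypt(byte[] plaintextDek); // wraps the DEK with the KEK
}

class EnvelopeEncryptionSketch {
    static final int GCM_IV_BYTES = 12;
    static final int GCM_TAG_BITS = 128;

    EncryptedChunk encrypt(byte[] plaintext, RemoteKms kms) throws Exception {
        // 1. Generate a fresh DEK locally -- no remote call needed.
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        SecretKey dek = generator.generateKey();

        // 2. Encrypt the chunk with the DEK.
        byte[] iv = new byte[GCM_IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, dek, new GCMParameterSpec(GCM_TAG_BITS, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);

        // 3. Wrap the DEK with the KEK; only the wrapped DEK is ever stored.
        byte[] wrappedDek = kms.encrypt(dek.getEncoded());
        return new EncryptedChunk(wrappedDek, iv, ciphertext);
    }

    record EncryptedChunk(byte[] wrappedDek, byte[] iv, byte[] ciphertext) {}
}
```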
For that, OpenSearch needs to have a Key Manager Service, which plugins will use to generate DEK keys.

Key Management Service

The key management service is a central service that OpenSearch provides to plugins for encryption. It can be integrated with a 3rd-party KMS, e.g. Amazon, Google, etc. The key management service has a single responsibility: to generate, encrypt, and decrypt DEK keys (or additional data) on demand. It can be configured to use different key management systems by name, so that plugins can choose which one to use. The POC interface:

```java
interface KeyManagerServiceProvider {
    default KeyManagerService keyManagerService() {
        return keyManagerService("default");
    }

    KeyManagerService keyManagerService(final String name);
}

interface KeyManagerService {
    Supplier<SecretKey> dekGenerator();

    byte[] decrypt(final byte[] data);

    byte[] encrypt(final byte[] data);
}
```

Key rotation:
Default Key Management Service Configuration: Key Store Settings

Key Management Service Configuration: Key Store Settings for 3rd-Party Providers
All management services can be configured independently.

How it will look for plugins

With the Key Management Service, all plugins will follow the same rules/recommendations for encryption. E.g., in the case of PUT _snapshot/repository_name:
```json
{
  "type": "encrypted",
  "settings": {
    "storage_type": "fs",
    "key_manager_name": "some_kms",
    "location": "/mount/backups/my_fs_backup_location"
  }
}
```
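For illustration only, a repository plugin might consume the POC interfaces above roughly like this (the `provider` injection and method naming around it are assumptions, not part of the proposal):

```java
import javax.crypto.SecretKey;

final class RepositoryEncryptionSketch {
    // `provider` would be handed to the plugin by the OpenSearch core (assumed).
    static void snapshotChunkFlow(KeyManagerServiceProvider provider) {
        KeyManagerService kms = provider.keyManagerService("some_kms"); // matches key_manager_name
        SecretKey dek = kms.dekGenerator().get();          // fresh DEK, generated locally
        byte[] wrappedDek = kms.encrypt(dek.getEncoded()); // DEK wrapped via the named KMS
        // ... encrypt the outgoing chunk with `dek`, store `wrappedDek` alongside it.
        // On restore: byte[] rawDek = kms.decrypt(wrappedDek);
    }
}
```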
and likewise during configuration of the repository. If I'm missing something, feel free to point it out.

P.S. In the end, I would like to address some comments I read in the PRs.
TBH, I do not understand what partial encryption is. If it is about block-based encryption, I am afraid it would be compromised; e.g., AES/CBC is not recommended for such encryption. Yes, it is possible to implement a custom solution for such a case, but it would again need to be proven stable and would end up looking like CBC. If it is about AES/GCM and the limitation of the stream size to 64 GB, due to the IV and the fact that it must be renewed: well, that is a natural limitation of AES/GCM. It is solved in the plugin via a hard chunk-size limit of 64 GB (sketched below); the default chunk size for all clouds is 5 TB, and past that limit the whole stream would be compromised.
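A hedged sketch of that hard limit, assuming the limit is enforced per chunk and a fresh IV (and ideally a fresh DEK) is derived for every chunk; the constant and names are illustrative, not the plugin's actual code:

```java
import java.security.SecureRandom;

final class GcmChunkLimits {
    // A single AES/GCM (key, IV) pair must stay below the ~64 GB bound.
    static final long MAX_GCM_CHUNK_BYTES = 64L * 1024 * 1024 * 1024;

    static byte[] ivForNextChunk(long chunkSize) {
        if (chunkSize > MAX_GCM_CHUNK_BYTES) {
            throw new IllegalArgumentException(
                "chunk exceeds the AES/GCM single-IV limit; split the stream further");
        }
        byte[] iv = new byte[12]; // fresh IV per chunk, never reused with the same key
        new SecureRandom().nextBytes(iv);
        return iv;
    }
}
```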
It does support chunking. The solution was designed this way since it is the responsibility of OpenSearch (in the past, ES) to split streams into chunks, which it does perfectly fine.
I do not understand what a custom wrapper means in this case. The plugin uses classical algorithms and cryptographic primitives, e.g. AES with 256-bit keys in Galois/Counter Mode, which have been proven stable for such tasks. The only reason BC (Bouncy Castle) was selected is a limitation of the JDK, which does not support streaming of more than 2 GB, and BC is the only well-known and trusted open-source solution for this. It is possible to use AWS or Google libraries instead, but I do not know how popular they are; that is up to the OpenSearch team. If it is about storing the IV and the encrypted key material at the beginning of the stream: again, it is just an optimization. In this case, you don't need to do an additional network round trip (see the sketch below).
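A minimal sketch of that header optimization, assuming a simple length-prefixed layout (the plugin's actual on-disk format may differ): the IV and the wrapped DEK travel at the head of the stream, so a reader needs no extra round trip for key material:

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class StreamHeader {
    // Writes [ivLen][iv][wrappedDekLen][wrappedDek]; ciphertext follows directly.
    static void write(OutputStream out, byte[] iv, byte[] wrappedDek) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(iv.length);
        data.write(iv);
        data.writeInt(wrappedDek.length);
        data.write(wrappedDek);
        data.flush();
    }
}
```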
Yes, and it is the only right solution, since the OpenSearch core controls when it wants to send data somewhere. Moving such CPU-bound tasks into the core would affect the performance of OpenSearch.
@willyborankin thanks a lot for the elaborate details.
It totally makes sense from the perfect-encryption standpoint. What information (if any) has to be stored with the snapshot (key provider? key name/id? nothing?) in order to decrypt it?
Hi @reta, I added the KEK/DEK management rules to the description.
Thanks @willyborankin!
@willyborankin I think the key point here is that
I'm also not clear here. Is this referring to the unimplemented ...? Tagging @vikasvb90 to share his thoughts here.
Hi @andrross,
I think the key manager service should be an internal OpenSearch service; sorry, my explanation was not clear.
Ahh, AFAIK AES/CTR supports random access, and you can encrypt and decrypt in parallel; GCM is about integrity and speed. This check was added because in ES it was used (I do not know how it looks now) for searchable snapshots.
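A hedged sketch of why AES/CTR allows that: the counter block for any position can be computed directly from the byte offset, so decryption can start at any 16-byte block boundary without touching earlier data (block-alignment handling is simplified here):

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.math.BigInteger;

final class CtrRandomAccess {
    // Returns a cipher positioned to decrypt starting at a block-aligned offset.
    static Cipher cipherAt(SecretKey key, byte[] baseCounter, long byteOffset) throws Exception {
        long blockOffset = byteOffset / 16; // AES block size in bytes
        // Treat the 16-byte counter as a big-endian integer and add the block offset.
        BigInteger counter = new BigInteger(1, baseCounter).add(BigInteger.valueOf(blockOffset));
        byte[] raw = counter.toByteArray();
        byte[] iv = new byte[16];
        int src = Math.max(0, raw.length - 16);
        int len = raw.length - src;
        System.arraycopy(raw, src, iv, 16 - len, len); // right-align into 16 bytes

        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher;
    }
}
```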
Thanks for writing this up @willyborankin, there is a lot to like with this approach. A couple of areas for your consideration:

Protect decryption?

I believe that in the OpenSearch plugin ecosystem at runtime, if you can find the data you can ask for the key and decrypt it with this interface. Could we protect the key generation process so the keys only match the same plugin, and would that be useful?

Static declarations vs dynamic providers?

As outlined, OpenSearch configuration settings know how to find the KMS providers. What would you think about having KMS providers be brought on/offline dynamically? Having encryption/decryption fall out of sync between nodes could create some painful operator scenarios; could this be mitigated?
@willyborankin
This is not about using block-based algorithms or specifically limiting this to any deprecated algorithm that supports it. All standard committing algorithms used in the AWS Encryption SDK, like AES/GCM, can be used for this. As @andrross pointed out, this is about randomly encrypting or decrypting a part of the content.
You can look at this PR. It uses frame-based encryption. A frame is the smallest unit of encryption or decryption, with a default size of 4 KB. Frame-based encryption is the default mode for encryption with committing algorithms like AES/GCM; both ALG_AES_256_GCM_HKDF_SHA512_COMMIT_KEY and ALG_AES_256_GCM_HKDF_SHA512_COMMIT_KEY_ECDSA_P384 work. Other non-committing algorithms, like plain IV-based algorithms, are not recommended for use, which I think is what you are also trying to point out. We cannot disable random encryption or decryption, as it is required for features like searchable snapshots, so we do need a solution for this. It can be achieved in frame encryption by ensuring that encrypt and decrypt operations always happen along frame boundaries, that the metadata headers containing crypto materials are added only while encrypting the first frame of the content, and that frame numbers are appropriately assigned. This is what the PR above ensures. As long as these conditions are met, you don't have to encrypt or decrypt the entire content just to access a small part of it (a simplified frame-alignment sketch follows).
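For illustration, frame alignment for a random read might look like this hedged sketch (4 KB default frame size as mentioned above; the actual logic lives in the linked PR):

```java
final class FrameAlignment {
    static final int FRAME_SIZE = 4096; // default frame size

    // Widens a requested byte range [start, end) to whole-frame boundaries so
    // each frame can be authenticated and decrypted independently.
    static long[] alignToFrames(long start, long end) {
        long alignedStart = (start / FRAME_SIZE) * FRAME_SIZE;
        long alignedEnd = ((end + FRAME_SIZE - 1) / FRAME_SIZE) * FRAME_SIZE;
        return new long[] { alignedStart, alignedEnd };
    }
}
```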
Now, coming to your repository-based design, I have a number of concerns with this approach. I am not sure how tight coupling ensures less CPU usage; it solves a specific design issue of encrypting whole content and being agnostic of it during transfers. In fact, it has additional performance overhead, since there is a forced need to encrypt everything, which is not there in the decoupled flavor where you can selectively encrypt or decrypt. Maybe you are referring to encryption being applied at the file-system level, where local IO is more frequent (e.g. segment merging, tlog/ckp files, etc.) compared to snapshot transfers. If so, then again, I am not suggesting it be built here. In my opinion, it should not be tied to any storage system, local or remote, and should be used as a tool to just transform data as needed. It should be agnostic of the location of the source or target content and of how we plan to use the resultant content. Also, I don't think it is a good idea to build KMS as a service within OpenSearch; it should be an extensible plugin, where the community is free to implement their own versions of KMS. I also don't see the need for adding key stores within the cluster state. That would mean that every key rotation requires a cluster state update. In my opinion, the cluster state can just keep immutable key provider settings.
About generating DEKs locally: these are contradictory statements, unless I am missing something. And you mentioned ...
Thank you for the answer.
By random encryption, I think you mean random access to a file, right?
It was done this way just to avoid changing the existing repositories: S3, HDFS, GCS, Azure, and FS. In your solution, you need to change all existing repository plugins, and each needs to be configured to use a different key store management with its own key store. Such a solution is very difficult to support from the OPS point of view.
I got your point about special fields, thank you. IMHO this is about CSE :-).
I checked my description one more time and did not find anything related to storing KMS data in the cluster state, sorry.
Do you mean the KEK, which a 3rd-party system stores? KEK/DEK assumes that you use the KEK for encryption of DEK keys. Getting the latest info about a KEK is a remote call, since that is the responsibility of 3rd-party systems, while a DEK is generated via whatever key generator you have in any library. Besides, storing DEKs together with KEKs is not a good idea.
Thank you,
Technically, it is possible, e.g. by creating a new KMS and changing its state via a listener if something has changed in the key store. AFAIK, OpenSearch core supports listeners, but I'm not sure about it.
Good point! Sometimes it is necessary to block something and after that rotate the keys.
Thanks for the reply @willyborankin
I have verified random access with AES/GCM committing algorithms and carried out a few perf runs as well. I have added a few tests to ensure the integrity. Are you suggesting verifying this against any other stable algorithm?
That is exactly why I am not inclined to tie it to the repository abstraction, and would keep it as a separate transformation layer. Since there is no straightforward way to accommodate it at the repository level, as you also mentioned, a feature will selectively encrypt or decrypt content before invoking repository transfers.
There are existing products which already offer these capabilities. Even if you feel that is not a good reason, we still have existing random encryption/decryption use cases, like searchable snapshots and multi-part parallel transfers, because of which we need to decouple the repository from crypto. This follows from the point you also mentioned: accommodating random access would require changes across all repository plugins, which may not be a practical approach.
This line mentions key rotation and, on that event, updating cluster settings. This is why I mentioned that we shouldn't be updating the cluster state or settings on rotation, as this is an expensive operation. And since we are caching encrypted or generated key pairs, it is OK to have individual nodes maintain their own data keys.
Got it! Just to further confirm: when you say DEK, are you referring to the raw version of the encrypted keys (KEK in your case)? I agree that we shouldn't be putting raw keys along with encrypted content; I am not suggesting this either, as it defeats the whole point of encryption. Now, I think you are suggesting using the encrypt abstraction of key providers, which basically encrypts the data keys, or using wrapping keys. Why not use generateDataPair itself, which provides both the data key and the encrypted key as a key pair and saves the overhead of generating keys locally (see the sketch below)? Locally generating keys is not sufficient anyway, since we still need to store the encrypted version of the key along with the content. encrypt is generally used for wrapping use cases where multiple keys are used for encryption/decryption.
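For reference, the generateDataPair idea maps to the AWS KMS GenerateDataKey call, which returns the plaintext data key and its encrypted form in a single round trip. The key ARN below is a placeholder and the surrounding class is a hedged sketch, not the proposed API:

```java
import software.amazon.awssdk.services.kms.KmsClient;
import software.amazon.awssdk.services.kms.model.DataKeySpec;
import software.amazon.awssdk.services.kms.model.GenerateDataKeyRequest;
import software.amazon.awssdk.services.kms.model.GenerateDataKeyResponse;

final class DataKeyPairSketch {
    static void generate() {
        try (KmsClient kms = KmsClient.create()) {
            GenerateDataKeyResponse response = kms.generateDataKey(
                GenerateDataKeyRequest.builder()
                    .keyId("arn:aws:kms:region:account:key/placeholder") // hypothetical ARN
                    .keySpec(DataKeySpec.AES_256)
                    .build());
            byte[] plaintextKey = response.plaintext().asByteArray();      // use in memory, never persist
            byte[] encryptedKey = response.ciphertextBlob().asByteArray(); // store next to the content
        }
    }
}
```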
@willyborankin This is the meta issue which has all the crypto PRs. Please take a look and let me know if anything is missing: #7229.
@vikasvb90 Can you give more detail here? Why do we have to reimplement the crypto integration logic in every feature that uses it in order to support these capabilities? Also, I'll say again that I'm strongly in favor of encrypting everything written to the repository, regardless of how the code is architected. If we exempt certain metadata files, for example, then every operator of OpenSearch that requires client-side encryption will have to do a thorough security audit to ensure the plain-text metadata doesn't expose any information they deem sensitive. Every update of OpenSearch may require such an audit as well, since the contents of the metadata may change over time. It seems much simpler to sidestep these questions by encrypting everything.
Thanks @vikasvb90. I will leave my comments in your PR during the week.
Just want to summarize some offline conversations here. I think we agree on the following architectural points:
We'll continue to iterate on the PRs that @vikasvb90 has open and work toward consensus here.
The repository interface in OpenSearch enables storing snapshot data in an external object store. It is implemented as a plugin interface, and the OpenSearch bundle by default ships with multiple plugin implementations (e.g. Google Cloud Storage, Amazon S3, Microsoft Azure). These object stores may offer server-side encryption, but OpenSearch itself offers no out-of-the-box mechanism to enable client-side encryption of the snapshot data. Both Amazon OpenSearch Service and Aiven have implemented client-side encryption, so there is clearly a need for this feature. The request here is to offer first class support for this feature within the OpenSearch Project.
We have multiple projects in flight (remote-backed index, searchable snapshots, etc) that are building on the repository interface. All of these stand to benefit from a client-side encryption feature.
Describe the solution you'd like
The Aiven implementation is an open source plugin that essentially wraps the repository interface, allowing users to compose the encryption plugin with the object store implementation of their choice (illustrated below). An obvious option here is to bring this plugin alongside the other object store plugins that are bundled in the OpenSearch distribution.
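For example, registering an encrypted repository that delegates to S3 might look roughly like the following PUT _snapshot request (setting names are illustrative, based on the wrapping approach, not the plugin's exact schema):

```json
{
  "type": "encrypted",
  "settings": {
    "storage_type": "s3",
    "bucket": "my-snapshot-bucket"
  }
}
```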
Describe alternatives you've considered
There might be benefits to building client-side encryption more natively into the repository interface. The composability of the Aiven implementation offers great flexibility, but it does mean that users must be aware of the plugin and configure it accordingly. Building it more directly into the interface could have usability benefits.