Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

At-rest encryption proposal to protect secrets in database #355

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

tsaarni
Copy link

@tsaarni tsaarni commented Sep 18, 2024

In the earlier Vault SPI design proposal, a second phase was proposed to introduce encryption and decryption capabilities. This pull request builds upon that and proposes adding optional at-rest encryption support to protect secrets while they are stored in the database.

@tsaarni
Copy link
Author

tsaarni commented Sep 18, 2024

I've created a discussion thread also here: keycloak/keycloak#33045

Copy link
Contributor

@stianst stianst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read it correctly the intent is to intercept specific values such as client secret, to then not store these in the Keycloak DB, but rather store these in the Encryption SPI provider?

If that's the case it might be better to extend on Vault SPI to allow write of secrets, and not just reads, rather than introduce another SPI completely. Especially considering the fact that you would need to read the secrets, which the Vault SPI already takes care of.

This can include fields like username, email, or other fields that can be considered as privacy-sensitive but need to be searchable.


### Performance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I very much doubt this would scale well for things like client secrets and OTP secrets.

For client secrets hashing may be a better option than storing in an external vault.

Copy link
Author

@tsaarni tsaarni Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are probably right that the performance would not be sufficient.

I came across the discussion you've had regarding hashing and it seems like a good option as well. My concern is that it only focuses on client secrets, and using encryption might allow us to avoid the backward compatibility issues caused by hashing.

I’ve been studying Kubernetes' KMSv2 implementation for securing Secrets. They've managed to create a highly efficient solution by minimizing calls to external encryption services, though it adds complexity on the Kubernetes side rather than in the KMS side. At a high level:

  • Each secret is encrypted locally using a data encryption key (DEK).
  • A new DEK is generated locally with every secret write.
  • DEK generation is based on a key derivation algorithm that uses a seed.
  • The seed is encrypted by the remote encryption service.

I believe this approach could work for us as well, and I can present a similar strategy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hashing does provide an extra layer of security in the fact that the original secret for a client can not be retrieved again through REST APIs. In general I think any secrets stored in a vault (or hashed or whatever) should not be readable through REST APIs.


> **TODO**: Q: Should Keycloak include default provider for Encryption SPI?
>
> Some possible alternatives:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having an SPI without an implementation is not very elegant; it would more or less make it useless to most folks. Additionally it would be impossible to test fully, as well as do any sort of benchmarking, which really would be required for something like this.

HashiCorp is a defacto choice, but less so in the open source now that it has changed its license. So, not really sure what a default implementation (or implementations) would be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your other comment we could in theory have two implementations; one that stores in DB directly as encrypted values (that should of course be able to receive its own encryption keys from a secret, or from HashiCorp, etc.); and another that stores in an external vault like HashiCorp.

* **Status**: Draft #1
* **Github**: TODO

## Motivation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see this proposal come with some details on best practices, and compare with DB encryption

@tsaarni
Copy link
Author

tsaarni commented Sep 20, 2024

Thanks for your response @stianst!

If I read it correctly the intent is to intercept specific values such as client secret, to then not store these in the Keycloak DB, but rather store these in the Encryption SPI provider?

No, what I meant is that the data is stored in a database, so the storage itself remains unchanged. The encryption SPI's role would be limited to encrypting the data before storage and decrypting it after retrieval, but before use.

Ideally, this should also simplify the administrator's task by eliminating the need to manage references or deal with configuring the remote vault (e.g., defining the schema for how individual Keycloak parameters are stored in key-value secret store of HashiCorp Vault).

@stianst
Copy link
Contributor

stianst commented Sep 23, 2024

Thanks for your response @stianst!

If I read it correctly the intent is to intercept specific values such as client secret, to then not store these in the Keycloak DB, but rather store these in the Encryption SPI provider?

No, what I meant is that the data is stored in a database, so the storage itself remains unchanged. The encryption SPI's role would be limited to encrypting the data before storage and decrypting it after retrieval, but before use.

Ideally, this should also simplify the administrator's task by eliminating the need to manage references or deal with configuring the remote vault (e.g., defining the schema for how individual Keycloak parameters are stored in key-value secret store of HashiCorp Vault).

Got it. That brings an interesting problem though when rotating the encryption keys. If the encrypted data is stored in the rows/columns where the secret would otherwise be, it would be more or less impossible to find all the encrypted values in order to re-encrypt them with new keys. An alternative to that may be a provider that uses a separate table to store encrypted values, with the encrypted value and a kid. That would allow supporting a passive key and an active key like we do for token signing for instance. Then the values can be re-encrypted with the new key either lazily or in a background task, and eventually the old key can be removed when all values are updated.

@tsaarni
Copy link
Author

tsaarni commented Oct 3, 2024

If the encrypted data is stored in the rows/columns where the secret would otherwise be, it would be more or less impossible to find all the encrypted values in order to re-encrypt them with new keys.

Unless metadata could be gathered, for example, from annotations on entity classes. I’m not sure if this would be feasible.

An alternative to that may be a provider that uses a separate table to store encrypted values, with the encrypted value and a kid. That would allow supporting a passive key and an active key like we do for token signing for instance. Then the values can be re-encrypted with the new key either lazily or in a background task, and eventually the old key can be removed when all values are updated.

An interesting alternative. How would referential integrity be maintained? When Hibernate executes a DELETE, how can we ensure that the corresponding secrets in the "encrypted data" table are also removed? I assume using a foreign key with cascading delete isn't feasible, as the "encrypted data" table would have multiple parent tables.

@tsaarni
Copy link
Author

tsaarni commented Oct 3, 2024

I've made a major update to the proposal.

To address the performance concerns it now recommends handling encryption logic locally, which could potentially become core Keycloak functionality. The key encryption key remains remote, requiring the introduction of an SPI and support for alternative providers.

@stianst wrote

I'd like to see this proposal come with some details on best practices, and compare with DB encryption

To address best practices, I’ve added a sections outlining use cases and encryption strategy. I’ve referenced techniques used in Kubernetes, since they have gone over several iterations to solve performance problems. I still need to add detailed crypto information, which I plan to include later with the help of a colleague who has strong expertise in this area. I removed the lazy encryption approach since, although simple, it doesn’t align with best practices and could leave users uncertain about the encryption status.

I’ve also included a chapter on encryption alternatives on the database-side.

One of the larger challenges to address is implementing batch operations like re-encryption, specifically solving how to locate secrets in the database. And should this be approached directly from an SQL perspective, or can it be done through the ORM?

@tsaarni tsaarni changed the title Initial version of encryption spi proposal At-rest encryption proposal to protect secrets in database Oct 3, 2024
@tsaarni
Copy link
Author

tsaarni commented Oct 14, 2024

Hi @stianst, just a gentle reminder to see if you've had a chance to review the update. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants