[DOCS] 1.0 guide for securely storing and accessing credentials and tokens (#10157)
1 parent dc0889e, commit 42ceec9
Showing 12 changed files with 432 additions and 7 deletions.
67 changes: 67 additions & 0 deletions
...core/configure_project_settings/access_secrets_managers/_aws_secrets_manager.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
import GxData from '../../_core_components/_data.jsx' | ||
import PreReqFileDataContext from '../../_core_components/prerequisites/_file_data_context.md' | ||
|
||
### Prerequisites | ||
|
||
- An AWS Secrets Manager instance. See [AWS Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/tutorials_basic.html). | ||
- The ability to install Python packages with `pip`. | ||
- <PreReqFileDataContext/>. | ||
|
||
### Procedure | ||
|
||
1. Set up AWS Secrets Manager support. | ||
|
||
To use the AWS Secrets Manager with {GxData.product_name} you will first need to install the `great_expectations` Python package with the `aws_secrets` requirement. To do this, run the following command: | ||
|
||
```bash title="Terminal" | ||
pip install 'great_expectations[aws_secrets]' | ||
``` | ||
|
||
2. Reference AWS Secrets Manager variables in `config_variables.yml`. | ||
|
||
By default, `config_variables.yml` is located at `gx/uncommitted/config_variables.yml` in your File Data Context. | ||
|
||
Values in `config_variables.yml` that start with `secret|arn:aws:secretsmanager` will be substituted with corresponding values from AWS Secrets Manager. However, if the keywords following `secret|arn:aws:secretsmanager` do not correspond to keywords in AWS Secrets Manager, no substitution will occur. | ||
|
||
You can reference other stored credentials within the keywords by wrapping their corresponding variable in `${` and `}`. When multiple references are present in a value, the secrets manager substitution takes place after all other substitutions have occurred. | ||
|
||
An entire connection string can be referenced from the secrets manager. In this example, `dev_db_credentials` is the Secret Name in AWS Secrets Manager, and `connection_string` is the Secret Key that corresponds to the value to be retrieved: | ||
|
||
```yaml title="config_variables.yml" | ||
my_aws_creds: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|connection_string | ||
``` | ||
Or each component of the connection string can be referenced separately. In these examples, `dev_db_credentials` remains the Secret Name in AWS Secrets Manager. However, rather than retrieving the value of the Secret Key `connection_string`, Secret Keys for individual parts of the connection string are provided for retrieval: | ||
|
||
```yaml title="config_variables.yml" | ||
drivername: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|drivername | ||
host: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|host | ||
port: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|port | ||
username: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|username | ||
password: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|password | ||
database: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|database | ||
``` | ||
|
||
Note that the last seven characters of an AWS Secrets Manager ARN are automatically generated by AWS and are not required to retrieve the secret. For example, the following two values retrieve the same secret: | ||
|
||
```yaml title="config_variables.yml" | ||
secret1: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:my_secret-1zAyu6 | ||
secret2: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:my_secret | ||
``` | ||
|
||
3. Optional. Reference versioned secrets. | ||
|
||
Unless otherwise specified, the latest version of the secret is returned. To retrieve a specific version of the secret, specify its version UUID. For example: | ||
|
||
```yaml title="config_variables.yml" | ||
versioned_secret: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:my_secret:00000000-0000-0000-0000-000000000000 | ||
``` | ||
|
||
4. Optional. Retrieve specific secrets from a JSON string. | ||
|
||
To retrieve a specific secret from a JSON string, include the JSON key after a pipe character `|` at the end of the secrets keywords. For example: | ||
|
||
```yaml title="config_variables.yml" | ||
json_secret: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:my_secret|<KEY> | ||
versioned_json_secret: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:my_secret:00000000-0000-0000-0000-000000000000|<KEY> | ||
``` |
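The reference format described in this procedure can be illustrated with a small parser. This is a minimal sketch under stated assumptions, not the code Great Expectations uses internally: `parse_aws_secret_reference` and `AwsSecretReference` are hypothetical names, and the sketch assumes any `${...}` variables in the value have already been substituted.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical helper for illustration only; not part of great_expectations.
_UUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
_AWS_REF = re.compile(
    r"^secret\|(?P<arn>arn:aws:secretsmanager:[^|]+?)"  # the secret's ARN
    rf"(?::(?P<version>{_UUID}))?"                      # optional version UUID
    r"(?:\|(?P<key>[^|]+))?$"                           # optional JSON key
)


class AwsSecretReference(NamedTuple):
    arn: str
    version: Optional[str]
    key: Optional[str]


def parse_aws_secret_reference(value: str) -> Optional[AwsSecretReference]:
    """Split a config value into ARN, optional version UUID, and optional JSON key."""
    match = _AWS_REF.match(value)
    if match is None:
        return None  # not a secrets reference, so the value would be left as-is
    return AwsSecretReference(match["arn"], match["version"], match["key"])
```

A value without the `secret|arn:aws:secretsmanager` prefix returns `None`, which mirrors the rule that non-matching values are not substituted.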
62 changes: 62 additions & 0 deletions
...ocs/core/configure_project_settings/access_secrets_managers/_azure_key_vault.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
import GxData from '../../_core_components/_data.jsx' | ||
import PreReqFileDataContext from '../../_core_components/prerequisites/_file_data_context.md' | ||
|
||
### Prerequisites | ||
|
||
- An [Azure Key Vault instance with configured secrets](https://docs.microsoft.com/en-us/azure/key-vault/general/overview). | ||
- The ability to install Python packages with `pip`. | ||
- <PreReqFileDataContext/>. | ||
|
||
### Procedure | ||
|
||
1. Set up Azure Key Vault support. | ||
|
||
To use Azure Key Vault with {GxData.product_name} you will first need to install the `great_expectations` Python package with the `azure_secrets` requirement. To do this, run the following command: | ||
|
||
```bash title="Terminal" | ||
pip install 'great_expectations[azure_secrets]' | ||
``` | ||
|
||
2. Reference Azure Key Vault variables in `config_variables.yml`. | ||
|
||
By default, `config_variables.yml` is located at `gx/uncommitted/config_variables.yml` in your File Data Context. | ||
|
||
Values in `config_variables.yml` that match the regex `^secret\|https:\/\/[a-zA-Z0-9\-]{3,24}\.vault\.azure\.net` will be substituted with corresponding values from Azure Key Vault. However, if the keywords in the matching regex do not correspond to keywords in Azure Key Vault, no substitution will occur. | ||
|
||
You can reference other stored credentials within the regex by wrapping their corresponding variable in `${` and `}`. When multiple references are present in a value, the secrets manager substitution takes place after all other substitutions have occurred. | ||
|
||
An entire connection string can be referenced from the secrets manager: | ||
|
||
```yaml title="config_variables.yml" | ||
my_abs_creds: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|connection_string | ||
``` | ||
Or each component of the connection string can be referenced separately: | ||
```yaml title="config_variables.yml" | ||
drivername: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|drivername | ||
host: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|host | ||
port: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|port | ||
username: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|username | ||
password: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|password | ||
database: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|database | ||
``` | ||
3. Optional. Reference versioned secrets. | ||
Unless otherwise specified, the latest version of the secret is returned. To retrieve a specific version of the secret, specify its version id (32 alphanumeric characters). For example: | ||
```yaml title="config_variables.yml" | ||
versioned_secret: secret|https://${VAULT_NAME}.vault.azure.net/secrets/my-secret/a0b00aba001aaab10b111001100a11ab | ||
``` | ||
4. Optional. Retrieve specific secrets from a JSON string. | ||
To retrieve a specific secret from a JSON string, include the JSON key after a pipe character `|` at the end of the secrets regex. For example: | ||
|
||
```yaml title="config_variables.yml" | ||
json_secret: secret|https://${VAULT_NAME}.vault.azure.net/secrets/my-secret|<KEY> | ||
versioned_json_secret: secret|https://${VAULT_NAME}.vault.azure.net/secrets/my-secret/a0b00aba001aaab10b111001100a11ab|<KEY> | ||
``` | ||
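To check whether a value would be treated as an Azure Key Vault reference, you can test it against the regex quoted in step 2. The following is a minimal sketch; the `is_azure_secret_reference` helper is a hypothetical name used for illustration and is not part of the `great_expectations` package.

```python
import re

# Pattern quoted from step 2 above; the helper name is hypothetical.
AZURE_SECRET_PATTERN = re.compile(
    r"^secret\|https:\/\/[a-zA-Z0-9\-]{3,24}\.vault\.azure\.net"
)


def is_azure_secret_reference(value: str) -> bool:
    """Return True when a config value looks like an Azure Key Vault reference."""
    return AZURE_SECRET_PATTERN.match(value) is not None
```

A value that still contains an unsubstituted `${VAULT_NAME}` does not match this pattern, which is consistent with the secrets manager substitution taking place after all other substitutions.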
|
||
|
112 changes: 112 additions & 0 deletions
.../core/configure_project_settings/access_secrets_managers/_gcp_secret_manager.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,112 @@ | ||
import GxData from '../../_core_components/_data.jsx' | ||
import PreReqFileDataContext from '../../_core_components/prerequisites/_file_data_context.md' | ||
|
||
### Prerequisites | ||
|
||
- A [GCP Secret Manager instance with configured secrets](https://cloud.google.com/secret-manager/docs/quickstart). | ||
- The ability to install Python packages with `pip`. | ||
- <PreReqFileDataContext/>. | ||
|
||
### Procedure | ||
|
||
1. Set up GCP Secret Manager support. | ||
|
||
To use GCP Secret Manager with {GxData.product_name} you will first need to install the `great_expectations` Python package with the `gcp` requirement. To do this, run the following command: | ||
|
||
```bash title="Terminal" | ||
pip install 'great_expectations[gcp]' | ||
``` | ||
|
||
2. Reference GCP Secret Manager variables in `config_variables.yml`. | ||
|
||
By default, `config_variables.yml` is located at `gx/uncommitted/config_variables.yml` in your File Data Context. | ||
|
||
Values in `config_variables.yml` that match the regex `^secret\|projects\/[a-z0-9\_\-]{6,30}\/secrets` will be substituted with corresponding values from GCP Secret Manager. However, if the keywords in the matching regex do not correspond to keywords in GCP Secret Manager, no substitution will occur. | ||
|
||
You can reference other stored credentials within the regex by wrapping their corresponding variable in `${` and `}`. When multiple references are present in a value, the secrets manager substitution takes place after all other substitutions have occurred. | ||
|
||
An entire connection string can be referenced from the secrets manager: | ||
|
||
```yaml title="config_variables.yml" | ||
my_gcp_creds: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|connection_string | ||
``` | ||
Or each component of the connection string can be referenced separately: | ||
```yaml title="config_variables.yml" | ||
drivername: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DRIVERNAME | ||
host: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_HOST | ||
port: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PORT | ||
username: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_USERNAME | ||
password: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PASSWORD | ||
database: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DATABASE | ||
``` | ||
3. Optional. Reference versioned secrets. | ||
Unless otherwise specified, the latest version of the secret is returned. To retrieve a specific version of the secret, specify its version id. For example: | ||
```yaml title="config_variables.yml" | ||
versioned_secret: secret|projects/${PROJECT_ID}/secrets/my_secret/versions/1 | ||
``` | ||
4. Optional. Retrieve specific secrets from a JSON string. | ||
To retrieve a specific secret from a JSON string, include the JSON key after a pipe character `|` at the end of the secrets regex. For example: | ||
|
||
```yaml title="config_variables.yml" | ||
json_secret: secret|projects/${PROJECT_ID}/secrets/my_secret|<KEY> | ||
versioned_json_secret: secret|projects/${PROJECT_ID}/secrets/my_secret/versions/1|<KEY> | ||
``` | ||
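The parts of a GCP Secret Manager reference (project, secret name, optional version, optional JSON key) can be pulled apart as sketched below. This is an illustration under stated assumptions rather than the library's own parsing code: `parse_gcp_secret_reference` is a hypothetical helper, the project ID character class is taken from the regex in step 2, and the `"latest"` label for a missing version is an assumption that reflects the default described in step 3.

```python
import re
from typing import Dict, Optional

# Hypothetical parser for illustration; not part of great_expectations.
_GCP_REF = re.compile(
    r"^secret\|projects\/(?P<project>[a-z0-9\_\-]{6,30})\/secrets\/(?P<secret>[^/|]+)"
    r"(?:\/versions\/(?P<version>[^/|]+))?"  # optional version id
    r"(?:\|(?P<key>.+))?$"                   # optional JSON key
)


def parse_gcp_secret_reference(value: str) -> Optional[Dict[str, Optional[str]]]:
    """Split a config value into project, secret, version, and JSON key."""
    match = _GCP_REF.match(value)
    if match is None:
        return None  # not a secrets reference, so the value would be left as-is
    parts = match.groupdict()
    # Assumption: label a missing version as "latest" to reflect the default.
    parts["version"] = parts["version"] or "latest"
    return parts
```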
|
||
|
||
|
||
|
||
|
||
|
||
Configure your Great Expectations project to substitute variables from the Google Cloud Secret Manager. Secrets store substitution uses the configurations from your ``config_variables.yml`` file after all other types of substitution are applied with environment variables. | ||
|
||
Secrets store substitution uses keywords and retrieves secrets from the secrets store for values matching the following regex ``^secret\|projects\/[a-z0-9\_\-]{6,30}\/secrets``. If the values you provide don't match the keywords, the values aren't substituted. | ||
|
||
1. Run the following code to install the ``great_expectations`` package with the ``gcp`` requirement: | ||
|
||
```bash | ||
pip install 'great_expectations[gcp]' | ||
``` | ||
|
||
2. Provide the name of the secret you want to substitute in GCP Secret Manager. For example, ``secret|projects/project_id/secrets/my_secret``. | ||
|
||
The latest version of the secret is returned by default. | ||
|
||
3. Optional. To get a specific version of the secret, specify its version id. For example, ``secret|projects/project_id/secrets/my_secret/versions/1``. | ||
|
||
4. Optional. To retrieve a specific secret value for a JSON string, use ``secret|projects/project_id/secrets/my_secret|key`` or ``secret|projects/project_id/secrets/my_secret/versions/1|key``. | ||
|
||
5. Save your access credentials or the database connection string to ``great_expectations/uncommitted/config_variables.yml``. For example: | ||
|
||
```yaml | ||
# We can configure a single connection string | ||
my_gcp_creds: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|connection_string | ||
# Or each component of the connection string separately | ||
drivername: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DRIVERNAME | ||
host: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_HOST | ||
port: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PORT | ||
username: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_USERNAME | ||
password: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PASSWORD | ||
database: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DATABASE | ||
``` | ||
|
||
6. Run the following code to use the `connection_string` parameter values when you add a `datasource` to a Data Context: | ||
|
||
```python | ||
# We can use a single connection string | ||
pg_datasource = context.data_sources.add_or_update_sql( | ||
name="my_postgres_db", connection_string="${my_gcp_creds}" | ||
) | ||
# Or each component of the connection string separately | ||
pg_datasource = context.data_sources.add_or_update_sql( | ||
name="my_postgres_db", connection_string="${drivername}://${username}:${password}@${host}:${port}/${database}" | ||
) | ||
``` |
27 changes: 27 additions & 0 deletions
...cs/core/configure_project_settings/access_secrets_managers/_secrets_managers.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
import TabItem from '@theme/TabItem'; | ||
import Tabs from '@theme/Tabs'; | ||
import GxData from '../../_core_components/_data.jsx' | ||
|
||
import AwsSecretsManager from './_aws_secrets_manager.md'; | ||
import GcpSecretManager from './_gcp_secret_manager.md'; | ||
import AzureKeyVault from './_azure_key_vault.md'; | ||
|
||
{GxData.product_name} supports the AWS Secrets Manager, Google Cloud Secret Manager, and Azure Key Vault secrets managers. | ||
|
||
Use of a secrets manager is optional. [Credentials can be securely stored as environment variables or entries in a yaml file](#configure-credentials) without referencing content stored in a secrets manager. | ||
|
||
<Tabs queryString="manager_type" groupId="manager_type" defaultValue='aws' values={[{label: 'AWS Secrets Manager', value:'aws'}, {label: 'GCP Secret Manager', value:'gcp'}, {label: 'Azure Key Vault', value:'azure'}]}> | ||
|
||
<TabItem value="aws"> | ||
<AwsSecretsManager/> | ||
</TabItem> | ||
|
||
<TabItem value="gcp"> | ||
<GcpSecretManager/> | ||
</TabItem> | ||
|
||
<TabItem value="azure"> | ||
<AzureKeyVault/> | ||
</TabItem> | ||
|
||
</Tabs> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
--- | ||
title: Access secrets managers | ||
description: Access credentials that are stored in AWS Secrets Manager, GCP Secret Manager, or Azure Key Vault. | ||
hide_feedback_survey: false | ||
hide_title: false | ||
--- | ||
|
||
import TabItem from '@theme/TabItem'; | ||
import Tabs from '@theme/Tabs'; | ||
import GxData from '../../_core_components/_data.jsx' | ||
|
||
import AwsSecretsManager from './_aws_secrets_manager.md'; | ||
import GcpSecretManager from './_gcp_secret_manager.md'; | ||
import AzureKeyVault from './_azure_key_vault.md'; | ||
|
||
{GxData.product_name} supports the AWS Secrets Manager, Google Cloud Secret Manager, and Azure Key Vault secrets managers. | ||
|
||
Use of a secrets manager is optional. [Credentials can be securely stored as environment variables or entries in a yaml file](core/configure_project_settings/configure_credentials/configure_credentials.md) without referencing content stored in a secrets manager. | ||
|
||
<Tabs queryString="manager_type" groupId="manager_type" defaultValue='aws' values={[{label: 'AWS Secrets Manager', value:'aws'}, {label: 'GCP Secret Manager', value:'gcp'}, {label: 'Azure Key Vault', value:'azure'}]}> | ||
|
||
<TabItem value="aws"> | ||
<AwsSecretsManager/> | ||
</TabItem> | ||
|
||
<TabItem value="gcp"> | ||
<GcpSecretManager/> | ||
</TabItem> | ||
|
||
<TabItem value="azure"> | ||
<AzureKeyVault/> | ||
</TabItem> | ||
|
||
</Tabs> |
15 changes: 15 additions & 0 deletions
...cs/core/configure_project_settings/configure_credentials/_access_credentials.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
import GxData from '../../_core_components/_data.jsx'; | ||
|
||
Securely stored credentials are implemented via string substitution. You can reference your credentials in a Python string by wrapping the variable name they are assigned to in `${` and `}`. Using individual credentials for a connection string would look like: | ||
|
||
```python title="Python" | ||
connection_string="postgresql+psycopg2://${MY_POSTGRES_USERNAME}:${MY_POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DATABASE}" | ||
``` | ||
|
||
Or you could reference a configured variable that contains the full connection string by providing a Python string that contains just a reference to that variable: | ||
|
||
```python title="Python" | ||
connection_string="${POSTGRES_CONNECTION_STRING}" | ||
``` | ||
|
||
When you pass a string that references your stored credentials to a {GxData.product_name} method that requires string-formatted credentials as a parameter, each referenced variable in your Python string is replaced with the corresponding stored value.
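The substitution behavior described here can be sketched in plain Python. This is a simplified illustration, not the library's implementation: the `substitute` function is a hypothetical name, the variables are supplied as a dictionary rather than read from environment variables or `config_variables.yml`, and leaving unknown names untouched is an assumption made for this sketch.

```python
import re
from typing import Mapping

# Matches ${NAME} unless the dollar sign is escaped with a backslash.
_REFERENCE = re.compile(r"(?<!\\)\$\{(\w+)\}")


def substitute(template: str, variables: Mapping[str, str]) -> str:
    """Replace each ${NAME} with variables[NAME]; unknown names are left untouched."""

    def _lookup(match: re.Match) -> str:
        # Fall back to the original text when the variable is not defined.
        return variables.get(match.group(1), match.group(0))

    return _REFERENCE.sub(_lookup, template)
```

For example, calling `substitute` on the first connection string above with a dictionary that defines each of the five variables would return the fully resolved connection string.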
32 changes: 32 additions & 0 deletions
...aurus/docs/core/configure_project_settings/configure_credentials/_config_yml.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
YAML files make variables more visible, are easier to edit, and allow for modularization. For example, you can create a YAML file for development and testing and another for production. | ||
|
||
A File Data Context is required before you can configure credentials in a YAML file. By default, the credentials file in a File Data Context is located at `/great_expectations/uncommitted/config_variables.yml`. The `uncommitted/` directory is included in a default `.gitignore` and will be excluded from version control. | ||
|
||
These examples demonstrate how to save credentials in the form of a connection string for a database. However, the same process can be used for things such as web app tokens or any other credential that can be stored in string format. | ||
|
||
Each entry in `config_variables.yml` should consist of two parts. The first is a variable which you will reference in the place of the credential. The second is the value that should be substituted for that variable when it is referenced. For example: | ||
|
||
```yaml title="config_variables.yml" | ||
MY_POSTGRES_USERNAME: <USERNAME> | ||
MY_POSTGRES_PASSWORD: <PASSWORD> | ||
``` | ||
|
||
or: | ||
|
||
```yaml title="config_variables.yml" | ||
POSTGRES_CONNECTION_STRING: postgresql+psycopg2://<USERNAME>:<PASSWORD>@<HOST>:<PORT>/<DATABASE> | ||
``` | ||
|
||
You can also reference your stored credentials within a stored connection string by wrapping their corresponding variable in `${` and `}`. For example: | ||
|
||
```yaml title="config_variables.yml" | ||
MY_POSTGRES_USERNAME: <USERNAME> | ||
MY_POSTGRES_PASSWORD: <PASSWORD> | ||
POSTGRES_CONNECTION_STRING: postgresql+psycopg2://${MY_POSTGRES_USERNAME}:${MY_POSTGRES_PASSWORD}@<HOST>:<PORT>/<DATABASE> | ||
``` | ||
|
||
Because the dollar sign character `$` indicates the start of a string substitution, escape it with a backslash `\` if it is part of your credentials. For example, if your password is `pa$$word`, you would escape it as follows when setting it as an environment variable: | ||
|
||
```bash title="Terminal" | ||
export MY_POSTGRES_PASSWORD=pa\$\$word | ||
``` |
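If you generate credential entries programmatically, the escaping rule above can be applied with a one-line helper. This is a minimal sketch; `escape_dollar_signs` is a hypothetical name and is not part of the `great_expectations` package.

```python
def escape_dollar_signs(credential: str) -> str:
    """Prefix every `$` with a backslash so it is not read as the start of a substitution."""
    return credential.replace("$", "\\$")
```

Applied to `pa$$word`, the helper produces the same escaped form used in the command above.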