Data Converters
This section lists all data converters that can be specified in the YAML configuration file.
These converters replace parts of the input value with random characters.
For example, the randomizeNumber converter replaces all numeric characters with random numbers.
Only non-empty values are processed.
randomizeText
Converts all characters to random alphanumeric characters.
For example, one of the possible conversions for "john_doe" is "vO7s2pJx".
Parameters:
Name | Required | Default | Description |
---|---|---|---|
min_length | N | 3 | The minimum length of the generated value (when not empty). |
replacements | N | Check here | A string that contains the replacement characters. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomizeText'
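The optional parameters listed above go under a parameters key, in the same way as for the other converters in this section. A minimal sketch follows; the values are illustrative only, not recommended settings:

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomizeText'
        parameters:
          # Illustrative values (assumptions), not the tool's defaults.
          min_length: 8
          replacements: 'abcdefghijklmnopqrstuvwxyz'
```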
randomizeEmail
Applies the following transformations on the input value:
- Applies the randomizeText converter on the username part.
- Replaces the domain (if any) with a safe one.
For example, one of the possible conversions for "[email protected]" is "[email protected]".
Parameters:
Name | Required | Default | Description |
---|---|---|---|
domains | N | ['example.com', 'example.net', 'example.org'] | A list of email domains. |
min_length | N | 3 | The minimum length of the generated username (when not empty). |
replacements | N | Check here | A string that contains the replacement characters. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomizeEmail'
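To restrict the generated addresses to a single safe domain, the domains parameter can be overridden. This is a sketch that assumes the list is passed as a regular YAML sequence; the chosen domain is only an example:

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomizeEmail'
        parameters:
          # Illustrative override: use only one reserved example domain.
          domains: ['example.org']
          min_length: 6
```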
randomizeNumber
Converts all numeric characters to random numbers. Other characters are not modified.
For example, one of the possible conversions for "number_123456" is "number_086714".
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomizeNumber'
These converters anonymize an input value. Empty values are not converted.
anonymizeText
Anonymizes string values by replacing all characters with the * character.
The first letter of each word is preserved.
The default word separators are ' ' (space), _ (underscore), - (hyphen) and . (dot).
For example, it converts "John Doe" to "J*** D**".
Parameters:
Name | Required | Default | Description |
---|---|---|---|
replacement | N | '*' | The replacement character. |
delimiters | N | [' ', '_', '-', '.'] | The word separator characters. |
min_word_length | N | 3 | The minimum length per anonymized word. Useful only if at least one word separator is defined. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'anonymizeText'
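The parameters above can be combined to change how words are masked. The following sketch uses illustrative values only: it splits words on spaces alone and masks with a different character.

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'anonymizeText'
        parameters:
          # Illustrative values (assumptions), not the defaults.
          replacement: 'x'
          delimiters: [' ']
          min_word_length: 5
```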
anonymizeEmail
Applies the following transformations on the input value:
- Applies the anonymizeText converter on the username part.
- Replaces the domain (if any) with a safe one.
For example, one of the possible conversions for "[email protected]" is "u****@example.org".
Parameters:
Name | Required | Default | Description |
---|---|---|---|
domains | N | ['example.com', 'example.net', 'example.org'] | A list of email domains. |
replacement | N | '*' | The replacement character. |
delimiters | N | [' ', '_', '-', '.'] | The word separator characters. |
min_word_length | N | 3 | The minimum length per anonymized word. Useful only if at least one word separator is defined. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'anonymizeEmail'
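As a sketch of how the optional parameters might be set for email anonymization, the example below pins the replacement domain and changes the masking character. Both values are illustrative assumptions:

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'anonymizeEmail'
        parameters:
          # Illustrative values: a single safe domain and a custom mask.
          domains: ['example.com']
          replacement: '#'
```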
anonymizeNumber
Anonymizes numeric values by replacing digits with the * character.
The first digit of each number is preserved.
For example, it converts "user123" to "user1**".
Parameters:
Name | Required | Default | Description |
---|---|---|---|
replacement | N | '*' | The replacement character. |
min_number_length | N | 1 | The minimum length per anonymized number (when not empty). |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'anonymizeNumber'
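Both parameters of this converter are optional. A minimal sketch with illustrative values, assuming they are passed under parameters like in the other examples of this section:

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'anonymizeNumber'
        parameters:
          # Illustrative values (assumptions), not the defaults.
          replacement: 'X'
          min_number_length: 4
```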
anonymizeDate
Anonymizes date values. It can be used to anonymize a date of birth.
The day and month are randomized. The year is not changed. For example, one of the possible conversions for "1990-01-01" is "1990-11-25".
The date format of the input value MUST match the format parameter, otherwise an exception is thrown.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
format | N | 'Y-m-d' | The date format. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'anonymizeDate'
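Because the input must match the format parameter, a column that stores dates in another layout needs an explicit format. The sketch below assumes the parameter accepts PHP-style date format characters, as the default values 'Y-m-d' and 'Y-m-d H:i:s' suggest; the day/month/year layout is only an example:

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'anonymizeDate'
        parameters:
          # Assumed PHP-style format for values such as "25/11/1990".
          format: 'd/m/Y'
```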
anonymizeDateTime
Same as anonymizeDate, but the default value of the format parameter is Y-m-d H:i:s instead of Y-m-d.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
format | N | 'Y-m-d H:i:s' | The date format. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'anonymizeDateTime'
These converters generate random values.
randomText
Generates a random text value.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
min_length | N | 3 | The minimum length of the generated value. |
max_length | N | 16 | The maximum length of the generated value. |
characters | N | Check here | A string that contains the characters used to generate the value. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomText'
        parameters:
          min_length: 0
          max_length: 10
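The characters parameter can narrow the alphabet of the generated value. As an illustrative sketch, the following configuration would generate fixed-length strings made of hexadecimal digits only; the values are assumptions chosen for the example:

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomText'
        parameters:
          # Same min and max length gives a fixed-length value.
          min_length: 12
          max_length: 12
          characters: '0123456789abcdef'
```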
randomEmail
Generates a random email address.
The username part of the email is generated with the randomText converter.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
domains | N | ['example.com', 'example.net', 'example.org'] | A list of email domains. |
min_length | N | 3 | The minimum length of the username. |
max_length | N | 16 | The maximum length of the username. |
characters | N | Check here | A string that contains the characters used to generate the username. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomEmail'
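A sketch of the same converter with its optional parameters set, using illustrative values for the domain list and the username length bounds:

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomEmail'
        parameters:
          # Illustrative values (assumptions), not the defaults.
          domains: ['example.net']
          min_length: 8
          max_length: 12
```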
randomDate
Generates a random date (e.g. 2005-08-03).
Parameters:
Name | Required | Default | Description |
---|---|---|---|
format | N | 'Y-m-d' | The date format. |
min_year | N | 1900 | The min year. If set to null, the min year is the current year. |
max_year | N | null | The max year. If set to null, the max year is the current year. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomDate'
        parameters:
          min_year: 2000
          max_year: 2050
randomDateTime
Same as randomDate, but the default value of the format parameter is Y-m-d H:i:s instead of Y-m-d.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
format | N | 'Y-m-d H:i:s' | The date format. |
min_year | N | 1900 | The min year. If set to null, the min year is the current year. |
max_year | N | null | The max year. If set to null, the max year is the current year. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'randomDateTime'
        parameters:
          min_year: 2000
          max_year: 2050
numberBetween
Generates a number between a min and a max value.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
min | Y | | The min value. |
max | Y | | The max value. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'numberBetween'
        parameters:
          min: 0
          max: 100
setNull
Converts all values to null.
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'setNull'
setValue
This converter always returns the same value.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
value | Y | | The value to set. It must be a scalar (string, int, float, boolean) or null. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'setValue'
        parameters:
          value: 0
These converters apply transformations on the input value (e.g. converting to lower case). Empty values are not converted.
toLower
Converts all characters to lower case.
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'toLower'
toUpper
Converts all characters to upper case.
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'toUpper'
prependText
This converter adds a prefix to every value.
For example, the value user1 is converted to test_user1 if the prefix is test_.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
value | Y | | The value to prepend. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'prependText'
        parameters:
          value: 'test_'
appendText
This converter adds a suffix to every value.
For example, the value user1 is converted to user1_test if the suffix is _test.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
value | Y | | The value to append. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'appendText'
        parameters:
          value: '_test'
replace
This converter replaces all occurrences of the search string with the replacement string.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
search | Y | | The text to replace. |
replacement | N | '' | The replacement text. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'replace'
        parameters:
          search: 'bar'
          replacement: 'baz'
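Since replacement defaults to an empty string, leaving it out should remove every occurrence of the search string. A minimal sketch; the search value is a hypothetical marker chosen for illustration:

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'replace'
        parameters:
          # No replacement given: the default '' deletes the matched text.
          search: '[internal] '
```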
regexReplace
This converter performs a regular expression search and replace.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
pattern | Y | | The pattern to find. |
replacement | N | '' | The replacement text. |
limit | N | -1 | The max number of replacements to perform. No limit if set to -1 (default value). |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'regexReplace'
        parameters:
          pattern: '/[0-9]+/'
          replacement: '15'
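The limit parameter caps how many matches are replaced. The sketch below reuses the pattern from the example above and replaces at most one match per value; the replacement text is illustrative:

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'regexReplace'
        parameters:
          pattern: '/[0-9]+/'
          replacement: 'N'
          # At most one replacement per value; later matches are left as-is.
          limit: 1
```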
hash
This converter applies a hash algorithm on the value.
The default algorithm is sha1.
Any algorithm returned by the function hash_algos can be used. Examples: md5, sha1, sha256, sha512, crc32.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
algorithm | N | 'sha1' | The algorithm to use. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'hash'
        parameters:
          algorithm: 'sha256'
faker
Allows the use of any formatter defined in the Faker library.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
formatter | Y | | The formatter name. |
arguments | N | [] | The formatter arguments. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'faker'
        parameters:
          formatter: 'numberBetween'
          arguments: [1, 100]
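Because arguments defaults to an empty list, a formatter that takes no arguments can be configured with the formatter name alone. The sketch below assumes the safeEmail formatter of the PHP Faker library is available in the installed version:

```yaml
tables:
  my_table:
    converters:
      my_column:
        converter: 'faker'
        parameters:
          # Assumption: 'safeEmail' exists in the installed Faker version.
          formatter: 'safeEmail'
```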
To use a formatter that requires the original value as an argument, you can use the {{value}} placeholder:
tables:
  my_table:
    converters:
      my_column:
        converter: 'faker'
        parameters:
          formatter: 'shuffle'
          arguments: ['{{value}}']
The Faker locale can be set in the configuration file and defaults to en_US.
chain
This converter executes a list of converters.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
converters | Y | | A list of converter definitions. |
Example:
tables:
  my_table:
    converters:
      my_column:
        converter: 'chain'
        parameters:
          converters:
            - converter: 'anonymizeText'
              condition: '{{another_column}} == 0'
            - converter: 'randomizeText'
              condition: '{{another_column}} == 1'
If you need to override a chained converter defined in a parent config file, you must specify the key index. For example, to disable the 2nd converter of a chain:
tables:
  my_table:
    converters:
      my_column:
        parameters:
          converters:
            1:
              disabled: true
jsonData
This converter can be used to anonymize data that is stored in a JSON object.
Parameters:
Name | Required | Default | Description |
---|---|---|---|
converters | Y | | A list of converter definitions. The key of each converter definition is the path to the value within the JSON object. |
For example, if the following JSON data is stored in a column:
{"customer":{"email":"[email protected]","username":"john.doe"}}
The following converter can be used:
tables:
  my_table:
    converters:
      my_column:
        converter: 'jsonData'
        parameters:
          converters:
            customer.email:
              converter: 'anonymizeEmail'
            customer.username:
              converter: 'anonymizeText'
serializedData
Same as the jsonData converter, but works with serialized data instead.
The serialized data must be an array.
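No example is given for this converter. The sketch below assumes that it is registered under the name serializedData and that it takes the same converters parameter as jsonData, with each key being the path to a value inside the unserialized array. The paths are hypothetical:

```yaml
tables:
  my_table:
    converters:
      my_column:
        # Assumed converter name and parameter layout, mirroring jsonData.
        converter: 'serializedData'
        parameters:
          converters:
            customer.email:
              converter: 'anonymizeEmail'
            customer.username:
              converter: 'anonymizeText'
```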
fromContext
This converter returns a value from the $context array passed to converters.
The context array contains the following data:
- row_data: an array containing the value of each column of the table row
- processed_data: an array containing the values of the row that were transformed by a converter
Parameters:
Name | Required | Default | Description |
---|---|---|---|
key | Y | | The key associated to the value to retrieve in the context array. |
Example:
tables:
  my_table:
    converters:
      email:
        converter: 'randomizeEmail'
      email_lowercase:
        converter: 'chain'
        parameters:
          converters:
            - converter: 'fromContext'
              parameters:
                key: 'processed_data.email'
            - converter: 'toLower'
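The same converter can read from row_data to copy the original, untransformed value of another column. This is a sketch under the assumption that row_data keys follow the same dotted notation as processed_data in the example above; the column names are hypothetical:

```yaml
tables:
  my_table:
    converters:
      # Hypothetical column that mirrors the original value of "username".
      username_copy:
        converter: 'fromContext'
        parameters:
          key: 'row_data.username'
```

Note that row_data holds the values as read from the database, so this copies data that has not been anonymized.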