Convert Fake data to CDM
Anton Ivanov edited this page Apr 27, 2021 · 3 revisions
This feature creates a fake dataset based on a scan report. The generated fake data can be used as a source dataset for CDM conversion.
- If no values have been scanned (i.e. the column in the scan report doesn’t contain values), Perseus will generate random strings or numbers for that column.
- If values have been scanned, Perseus will generate the data by choosing from the scanned values. Values are sampled either according to their frequencies or uniformly (if the Uniform Sampling option is selected).
- If the column only contains unique values (each value has a frequency of 1, e.g. for primary keys), the generated column will be kept unique.
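The three rules above can be sketched in a few lines of Python. This is only an illustration, not Perseus's actual implementation: the function name `generate_column`, the `scan_values` dictionary (value to frequency), the 8-character random strings, and the choice to cap a unique column at the number of scanned values are all assumptions made for the example.

```python
import random
import string


def generate_column(scan_values, n_rows, uniform=False, rng=None):
    """Generate fake values for one column (hypothetical helper).

    scan_values: dict mapping each scanned value to its frequency;
                 empty if the scan report holds no values for the column.
    """
    rng = rng or random.Random()

    # Rule 1: nothing scanned, so fall back to random strings.
    if not scan_values:
        return [
            "".join(rng.choices(string.ascii_lowercase, k=8))
            for _ in range(n_rows)
        ]

    values = list(scan_values.keys())
    freqs = list(scan_values.values())

    # Rule 3: every frequency is 1 (e.g. a primary key), so keep the
    # output unique by sampling without replacement.
    # Assumption: the output is capped at the number of scanned values.
    if all(f == 1 for f in freqs):
        return rng.sample(values, min(n_rows, len(values)))

    # Rule 2: sample from the scanned values, weighted by frequency
    # unless uniform sampling is requested.
    weights = [1] * len(values) if uniform else freqs
    return rng.choices(values, weights=weights, k=n_rows)
```

For example, `generate_column({"M": 90, "F": 10}, 1000)` would return a list of 1,000 values in which `"M"` is expected to appear about nine times as often as `"F"`.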
Max rows per table sets the number of rows of each output table. By default, it is set to 10,000. Checking the Uniform Sampling box makes Perseus sample the values uniformly: the frequency of each value is treated as 1, but the sampling itself is still random. This increases the chance that each value in the scan report is represented at least once in the output data.
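To see why uniform sampling helps rare values appear, consider the chance that a given value shows up at least once in `n_rows` draws: `1 - (1 - p) ** n_rows`, where `p` is the per-draw probability of that value. The sketch below assumes independent draws with replacement and uses a made-up two-value column; the names `p_at_least_once` and `freqs` are illustrative only.

```python
def p_at_least_once(p_value, n_rows):
    """Probability that a value with per-draw probability p_value
    appears at least once in n_rows independent draws."""
    return 1.0 - (1.0 - p_value) ** n_rows


# Hypothetical scan result: one very common value and one rare value.
freqs = {"common": 9999, "rare": 1}
n_rows = 100

# Frequency-weighted sampling: the rare value has p = 1 / 10000.
total = sum(freqs.values())
p_weighted = p_at_least_once(freqs["rare"] / total, n_rows)

# Uniform sampling: every value is treated as having frequency 1,
# so p = 1 / (number of distinct values).
p_uniform = p_at_least_once(1 / len(freqs), n_rows)

print(f"weighted: {p_weighted:.4f}, uniform: {p_uniform:.4f}")
```

With these numbers the rare value has roughly a 1% chance of appearing under frequency-weighted sampling, and is practically certain to appear under uniform sampling.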