Trying to create a synthetic dataset from the Kaggle adult census dataset (with the fnlwgt column removed) in the correlated attribute mode results in the generator failing to parse the description file.
The cause seems to be L281 of PrivBayes.py, which resolves int types as np.int64(0) instead of just 0 for parents > 1. This in turn causes L99 of the DataGenerator to fail, as it does not import numpy:
parents_instance=list(eval(parents_instance))
I could fix it locally by simply adding import numpy as np to the DataGenerator.py file, but maybe it would be cleaner to correctly print the base int type into the description file in the first place.
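A minimal sketch of the "cleaner" option, converting parent values to built-in ints before they are turned into a string. It assumes the description file stores each parent combination as the `str()` of a list, which is a guess based on the `eval` call above. `FakeInt64` and `parents_key` are hypothetical names used only for illustration, and `FakeInt64` merely mimics how NumPy 2.x prints integer scalars so the example runs without NumPy installed.

```python
import operator


class FakeInt64:
    """Hypothetical stand-in that prints like a NumPy 2.x integer scalar."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # NumPy integer scalars also implement __index__.
        return self._value

    def __repr__(self):
        return f"np.int64({self._value})"


def parents_key(values):
    """Build a key string from parent values using built-in ints only."""
    return str([operator.index(v) for v in values])


raw = [FakeInt64(0), FakeInt64(3)]
print(str(raw))          # the form that later breaks eval() without numpy
print(parents_key(raw))  # the plain form that eval() can read back
```

With real NumPy scalars, `int(v)` or `v.item()` should have the same effect as `operator.index(v)` here.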
Traceback (most recent call last):
  File "D:\...\helpers\generate_main.py", line 27, in <module>
    main()
  File "D:\...\helpers\generate_main.py", line 21, in main
    generator.generate(rows)
  File "D:\...\generators\priv_bayes_generator.py", line 35, in generate
    generator.generate_dataset_in_correlated_attribute_mode(num_tuples_to_generate, description_file)
  File "D:\...\venv3.9\lib\site-packages\DataSynthesizer\DataGenerator.py", line 66, in generate_dataset_in_correlated_attribute_mode
    self.encoded_dataset = DataGenerator.generate_encoded_dataset(self.n, self.description)
  File "D:\...\venv3.9\lib\site-packages\DataSynthesizer\DataGenerator.py", line 100, in generate_encoded_dataset
    parents_instance = list(eval(parents_instance))
  File "<string>", line 1, in <module>
NameError: name 'np' is not defined
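The NameError can be reproduced without DataSynthesizer, and a reader-side workaround could avoid `eval` altogether. The sketch below is only an illustration under assumptions: the key string is a made-up example of what the description file might contain, and `parse_parents_instance` is a hypothetical helper, not part of the library.

```python
import ast
import re

# Matches NumPy-style integer scalar reprs such as np.int64(0) or np.uint8(7).
_NP_SCALAR = re.compile(r"np\.(?:int|uint)\d*\((-?\d+)\)")


def parse_parents_instance(text):
    """Parse a parent-combination string into a list of built-in ints."""
    plain = _NP_SCALAR.sub(r"\1", text)   # np.int64(0) -> 0
    return list(ast.literal_eval(plain))  # safe literal parsing, no eval()


key = "(np.int64(0), np.int64(3))"  # assumed example of a problematic key

try:
    eval(key, {})  # no 'np' in scope, as in DataGenerator.py
except NameError as exc:
    print(exc)

print(parse_parents_instance(key))
print(parse_parents_instance("[1, 2]"))  # plain keys still parse
```

Using `ast.literal_eval` also means a malformed or malicious description file cannot execute arbitrary code, which `eval` would allow.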
Description
Trying to create a synthetic dataset from the Kaggle adult census dataset (with the fnlwgt column removed) in the correlated attribute mode results in the generator failing to parse the description file. The reason for this seems to be in L281 of PrivBayes.py:
This resolves int types as np.int64(0) instead of just 0 for parents > 1. This in turn causes L99 of the DataGenerator to fail, as it does not import numpy:
I could fix it locally by simply adding import numpy as np to the DataGenerator.py file, but maybe it would be cleaner to correctly print the base int type into the description file in the first place. The relevant section of the description file:
What I Did
Python script:
Traceback: