Just to add a little more: I think part of the inconsistent, confusing behavior is the following. If you take a series that holds numeric values but does not have the category dtype, and initialize it with the PostalCode logical type, the numeric values get converted to strings:
>>> import pandas as pd
>>> import woodwork as ww
>>> ser = pd.Series([12345, 67890])
>>> ser = ww.init_series(ser, logical_type='PostalCode')
>>> type(ser[0])
<class 'str'>
But if you start with the same values and set the type as category before WW init, you end up with numeric values instead of strings:
>>> ser = pd.Series([12345, 67890]).astype("category")
>>> ser = ww.init_series(ser, logical_type='PostalCode')
>>> type(ser[0])
<class 'numpy.int64'>
I believe WW should produce consistent output in this case, so that the elements have the same type after WW initialization regardless of the input dtype.
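To make the "consistent output" idea concrete, here is a minimal sketch in plain Python of one possible normalization rule: every element becomes a str, whatever type it started as. The helper name `normalize_postal_code` is hypothetical and is not part of Woodwork; it only illustrates the behavior being asked for and says nothing about how WW should implement it.

```python
import math
from numbers import Integral, Real


def normalize_postal_code(value):
    """Return a postal code as a str, or None for a missing value.

    Hypothetical helper for illustration only; not a Woodwork API.
    """
    if value is None:
        return None
    if isinstance(value, str):
        return value.strip()
    if isinstance(value, bool):
        # bool is a subclass of int, so reject it explicitly.
        raise TypeError("bool is not a valid postal code")
    if isinstance(value, Integral):
        return str(int(value))
    if isinstance(value, Real):
        if math.isnan(value):
            return None
        if float(value).is_integer():
            # 12345.0 -> "12345", dropping the trailing ".0"
            return str(int(value))
        raise ValueError(f"non-integral postal code: {value!r}")
    raise TypeError(f"unsupported postal code type: {type(value).__name__}")


# The same codes arriving as ints, floats, or strings all normalize alike.
from_ints = [normalize_postal_code(v) for v in [12345, 67890]]
from_floats = [normalize_postal_code(v) for v in [12345.0, 67890.0]]
from_strs = [normalize_postal_code(v) for v in ["12345", "67890"]]
```

One caveat with any rule like this: a leading zero that was lost when a code was stored as a number cannot be recovered by converting it back to a string.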
A Series with the PostalCode logical type can have float or str elements. For example,
In the above code block, the elements of the series are floats, but in the following, they are strings:
Both are valid initializations. We should decide whether we want to support both data types for the PostalCode logical type.
This issue was discussed in alteryx/featuretools#2365.