-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SNOW-958008: fix local test division datatypes #1127
SNOW-958008: fix local test division datatypes #1127
Conversation
src/snowflake/snowpark/types.py
Outdated
@@ -80,7 +80,7 @@ def __init__(self, length: Optional[int] = None) -> None: | |||
self.length = length | |||
|
|||
def __repr__(self) -> str: | |||
if self.length: | |||
if self.length and self.length != self._MAX_LENGTH: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sfc-gh-aalam Are you good with this change?
@@ -214,6 +214,8 @@ def __init__(self, value: Any, datatype: Optional[DataType] = None) -> None: | |||
self.datatype = datatype | |||
else: | |||
self.datatype = infer_type(value) | |||
if isinstance(self.datatype, StringType): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sfc-gh-aalam Are you good with this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. approved the other PR which adds this change
@@ -946,15 +946,15 @@ def calculate_expression( | |||
) | |||
return ColumnEmulator( | |||
data=[bool(data is None) for data in child_column], | |||
sf_type=ColumnType(BooleanType(), False), | |||
sf_type=ColumnType(BooleanType(), True), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We observe that boolean type columns are always nullable from live connection.
@@ -989,6 +989,8 @@ def calculate_expression( | |||
if isinstance(exp, BinaryExpression): | |||
left = calculate_expression(exp.left, input_data, analyzer, expr_to_alias) | |||
right = calculate_expression(exp.right, input_data, analyzer, expr_to_alias) | |||
# TODO: Address mixed type calculation here. For instance Snowflake allows to add a date to a number, but |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be enabled by type-coercion.
@@ -182,7 +191,7 @@ def calculate_type(c1: ColumnType, c2: Optional[ColumnType], op: Union[str]): | |||
elif isinstance(t1, (FloatType, DoubleType)) or isinstance( | |||
t2, (FloatType, DoubleType) | |||
): | |||
return ColumnType(DoubleType(), nullable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could switch to using normalize_output_sf_type
if isinstance(sf_type.datatype, DecimalType): | ||
result = result.astype("double").round(sf_type.datatype.scale) | ||
elif isinstance(sf_type.datatype, (FloatType, DoubleType)): | ||
result = result.astype("double").round(16) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We observe that Snowflake returns 16 digits after the floating point. cc: @sfc-gh-yixie
@@ -435,25 +446,28 @@ def __round__(self, n=None): | |||
|
|||
def __rpow__(self, other): | |||
result = super().__rpow__(other) | |||
result.sf_type = ColumnType(DoubleType(), self.sf_type.nullable) | |||
result.sf_type = ColumnType(FloatType(), True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as Sophie's to use normalize_output_sf_type
return result | ||
|
||
def __pow__(self, power): | ||
result = super().__pow__(power) | ||
result.sf_type = ColumnType(DoubleType(), self.sf_type.nullable) | ||
result.sf_type = ColumnType( | ||
FloatType(), self.sf_type.nullable or power.sf_type.nullable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are all the FloatType
in the code usage ultimately be converted to DoubleType
via normalize_output_sf_type
at some point?
and from user perspective will they get a chance to see the FloatType
at all?
it seems the two types are interchangeable and we want DoubleType
to be in the final output. can we just use DoubleType everywhere?
The code changes LGTM, let's address the test failures in Test Local Testing Module macos merge gates. |
src/snowflake/snowpark/mock/plan.py
Outdated
@@ -1661,7 +1670,9 @@ def calculate_expression( | |||
).sf_type | |||
if not calculated_sf_type: | |||
calculated_sf_type = cur_windows_sf_type | |||
elif calculated_sf_type.datatype != cur_windows_sf_type.datatype: | |||
elif not is_datatype_compatible( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This basically allows coercion to happen but doesn't set the result type properly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i removed it
l2 = expr2.sf_type.datatype.length or StringType._MAX_LENGTH | ||
sf_data_type = StringType(max(l1, l2)) | ||
else: | ||
sf_data_type = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we not using calculate_datatype
to compute the return type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. I finally wanted to limit the scope and fixed the problem locally.
Please answer these questions before submitting your pull requests. Thanks!
What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes #NNNN
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Please write a short description of how your code change solves the related issue.