Exclude numeric data type mismatches from casewhen type checking #1733

Closed
wants to merge 1 commit into from

Conversation

chrispattenslalom

@chrispattenslalom chrispattenslalom commented Jun 4, 2024

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

No Jira issue created. This PR was opened per a conversation with @sfc-gh-jrose.

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
  3. Please describe how your code solves the related issue.

Resolves an issue in Local Testing mode where a CaseWhen expression raises an error when its branches use numeric data types with mismatched precision.
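To illustrate the idea in the PR title (this is a sketch, not the code in this PR), a branch type check could treat any two numeric types as compatible and reject only types of genuinely different kinds. The classes and the `branches_compatible` helper below are hypothetical stand-ins and are not Snowpark's internal API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringType:
    """Hypothetical stand-in for a string column type."""


@dataclass(frozen=True)
class DoubleType:
    """Hypothetical stand-in for a floating-point column type."""


@dataclass(frozen=True)
class DecimalType:
    """Hypothetical stand-in for a fixed-point type with precision and scale."""

    precision: int = 38
    scale: int = 0


# Assumption: these are the types the check should regard as "numeric".
NUMERIC_TYPES = (DecimalType, DoubleType)


def branches_compatible(left, right) -> bool:
    """Return True if two CaseWhen branch types may appear together."""
    # Numeric types are accepted regardless of precision or scale.
    if isinstance(left, NUMERIC_TYPES) and isinstance(right, NUMERIC_TYPES):
        return True
    # Any other pair must match exactly.
    return left == right
```

Under this sketch, `branches_compatible(DecimalType(38, 2), DecimalType(38, 6))` is accepted, while a string branch paired with a decimal branch is still rejected.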


github-actions bot commented Jun 4, 2024

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@chrispattenslalom
Author

I have read the CLA Document and I hereby sign the CLA

@chrispattenslalom chrispattenslalom marked this pull request as ready for review June 4, 2024 20:26
@chrispattenslalom chrispattenslalom requested a review from a team as a code owner June 4, 2024 20:26
@sfc-gh-jrose sfc-gh-jrose requested review from sfc-gh-stan, sfc-gh-aling and a team and removed request for sfc-gh-yuwang and sfc-gh-aalam June 4, 2024 20:27
@sfc-gh-jrose
Contributor

Thanks for the PR Chris.

For reviewers, here is a repro for the issue that this PR is addressing:

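from snowflake.snowpark import Session
from snowflake.snowpark.functions import when
from snowflake.snowpark.types import DecimalType, StringType, StructField, StructType

# Create a Local Testing session (no connection to Snowflake is needed).
session = Session.builder.config("local_testing", True).create()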

dataframe = session.create_dataframe(
    data=[
        ["value1", 1.23],
        ["value1", 1.23],
        ["value2", 4.56],
        ["value2", 7.89],
    ],
    schema=StructType(
        [StructField("col1", StringType()), StructField("col2", DecimalType(38, 2))]
    ),
)

dataframe2 = session.create_dataframe(
    data=[["value1", 5.0]],
    schema=StructType(
        [StructField("col3", StringType()), StructField("col4", DecimalType(38, 6))]
    ),
)

dataframe.join(
    right=dataframe2, on=dataframe.col1 == dataframe2.col3, how="left"
).select(
    dataframe.col1,
    dataframe.col2,
    dataframe2.col3,
    dataframe2.col4,
    when(dataframe2.col4.is_null(), dataframe.col2)
    .otherwise(dataframe2.col4)
    .alias("new_col"),
).show()

@sfc-gh-jrose
Contributor

Hey Chris,
I'm going to go ahead and close this PR. Yesterday we merged a PR that adds support for type coercion, and it changed both code locations touched in this PR. I've run your example against the new code paths and it now appears to be handled correctly. Your change was helpful in identifying edge cases in that new code, so thank you for taking the time to open this PR.
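As a rough illustration of what coercing two decimal branch types could look like, here is a minimal sketch using a common SQL-style widening rule: keep the larger scale and the larger number of integer digits, then cap the precision. The `common_decimal` helper and the cap of 38 are assumptions made for illustration; the merged coercion code may work differently.

```python
from typing import Tuple

# Assumed maximum decimal precision, matching the 38 used in the repro above.
MAX_PRECISION = 38


def common_decimal(left: Tuple[int, int], right: Tuple[int, int]) -> Tuple[int, int]:
    """Return a (precision, scale) pair wide enough for both inputs.

    Each argument is a (precision, scale) pair. Hypothetical helper,
    not part of Snowpark.
    """
    (left_precision, left_scale), (right_precision, right_scale) = left, right
    # Keep enough fractional digits for either side.
    scale = max(left_scale, right_scale)
    # Keep enough digits before the decimal point for either side.
    integer_digits = max(left_precision - left_scale, right_precision - right_scale)
    # Cap the total at the assumed maximum precision.
    precision = min(MAX_PRECISION, integer_digits + scale)
    return precision, scale


print(common_decimal((38, 2), (38, 6)))  # (38, 6)
```

For the two columns in the repro, `DecimalType(38, 2)` and `DecimalType(38, 6)`, this rule yields `(38, 6)`.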

@github-actions github-actions bot locked and limited conversation to collaborators Jun 13, 2024