SNOW-1300150:Automatic schema inference for CSV loading option unclear #1521

Merged
merged 11 commits into from
May 13, 2024

sfc-gh-yuwang
Collaborator

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-1300150

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

    Please write a short description of how your code change solves the related issue.

@sfc-gh-yuwang sfc-gh-yuwang requested a review from a team as a code owner May 6, 2024 20:42
@sfc-gh-yuwang
Collaborator Author

This PR sets `infer_schema` to true by default only when reading CSV files. Doing the same for JSON files would be a breaking change.

@@ -421,6 +418,7 @@ def csv(self, path: str) -> DataFrame:
        schema_to_cast = [("$1", "C1")]
        transformations = []
    else:
        self._cur_options["INFER_SCHEMA"] = False
Contributor

this condition will already be true if we reach the else branch, right?

Collaborator Author

No. If we reach this branch, the user must have provided a schema, so we should use the schema the user provided and turn off `infer_schema`.
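
To illustrate the branch logic discussed here, a minimal standalone sketch follows. It is not the Snowpark implementation: `resolve_infer_schema` and its parameters are hypothetical names, and the default of `True` when no schema is given is an assumption taken from the PR description above.

```python
from typing import Any, Dict, Optional


def resolve_infer_schema(
    user_schema: Optional[Any], options: Dict[str, Any]
) -> Dict[str, Any]:
    """Hypothetical helper (not part of Snowpark) mirroring the branch
    logic discussed in this thread."""
    resolved = dict(options)  # copy so the caller's dict is not mutated
    if user_schema is None:
        # Assumption from the PR description: CSV reads default to schema
        # inference when the caller has not set the option explicitly.
        resolved.setdefault("INFER_SCHEMA", True)
    else:
        # A user-provided schema takes precedence, so inference is turned
        # off even if the caller asked for it.
        resolved["INFER_SCHEMA"] = False
    return resolved
```

With a schema supplied, the option is forced to `False` whatever the caller passed, which is why the assignment in the `else` branch is not redundant.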

CHANGELOG.md Outdated
Comment on lines 7 to 8
- Improved error message to remind users set `{"infer_schema":True}` when read csv file without specifying its schema.ß

Contributor

Suggested change
- Improved error message to remind users set `{"infer_schema":True}` when read csv file without specifying its schema.ß
- Improved error message to remind users set `{"infer_schema": True}` when reading csv file without specifying its schema.

nit

@sfc-gh-yuwang sfc-gh-yuwang requested a review from sfc-gh-yixie May 9, 2024 16:32
src/snowflake/snowpark/_internal/error_message.py Outdated
tests/integ/scala/test_dataframe_reader_suite.py Outdated
tests/unit/scala/test_error_message.py Outdated
@sfc-gh-yuwang sfc-gh-yuwang merged commit 75036a4 into main May 13, 2024
25 checks passed
@sfc-gh-yuwang sfc-gh-yuwang deleted the SNOW-1300150 branch May 13, 2024 16:40
@github-actions github-actions bot locked and limited conversation to collaborators May 13, 2024
4 participants