[dagster-snowflake] config to determine how to handle timestamp data #13097

Merged
merged 19 commits into from
Apr 12, 2023


@jamiedemaria jamiedemaria commented Mar 22, 2023

Summary & Motivation

There is an issue with storing pandas Timestamp values in Snowflake: the year gets converted to an invalid value (for example, 48399). In PR #8760 I worked around this by storing pandas timestamps as strings, but I finally found snowflakedb/snowflake-connector-python#319, which indicates that the real fix is to include timezone information in your pandas timestamps (see snowflakedb/snowflake-connector-python#319 (comment)).
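For illustration only (this is not code from the PR), here is a minimal pandas sketch of the timezone fix: it localizes every timezone-naive datetime column before the data is written. The helper name `add_utc_timezone` and the choice of UTC are assumptions made for the example.

```python
import pandas as pd


def add_utc_timezone(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with timezone-naive datetime columns localized to UTC.

    Hypothetical helper: the name and the UTC default are assumptions,
    not taken from the PR.
    """
    out = df.copy()
    for name in out.columns:
        col = out[name]
        # Only touch datetime columns that carry no timezone yet.
        if pd.api.types.is_datetime64_any_dtype(col) and col.dt.tz is None:
            out[name] = col.dt.tz_localize("UTC")
    return out


if __name__ == "__main__":
    frame = pd.DataFrame(
        {"created_at": pd.to_datetime(["2023-03-22 20:07:00"]), "value": [1]}
    )
    print(add_utc_timezone(frame).dtypes)
```

Columns that already have a timezone are left untouched, so the helper is safe to apply more than once.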

This PR adds a new configuration value to the Snowflake I/O manager that lets the user choose whether timestamp data is converted to strings or given a timezone. We also check the column types of the existing table to ensure that the chosen conversion matches the type of the corresponding column.
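The column-type check described above might look roughly like the sketch below. It is an assumption-laden illustration, not the PR's implementation: the function name `check_timestamp_column`, the `store_as_string` flag, and the type-name prefixes are all hypothetical, and the type strings are assumed to look like what Snowflake reports for a column (for example `VARCHAR(16777216)` or `TIMESTAMP_NTZ(9)`).

```python
def check_timestamp_column(
    table: str, column: str, column_type: str, store_as_string: bool
) -> None:
    """Raise ValueError if the existing column type conflicts with the chosen conversion.

    Hypothetical helper: names and type prefixes are assumptions for illustration.
    """
    normalized = column_type.strip().upper()
    is_string = normalized.startswith(("VARCHAR", "TEXT", "STRING"))
    is_timestamp = normalized.startswith("TIMESTAMP")

    if store_as_string and not is_string:
        raise ValueError(
            f"Table {table}: column {column} has type {column_type}, "
            "but timestamp data is configured to be stored as strings."
        )
    if not store_as_string and not is_timestamp:
        raise ValueError(
            f"Table {table}: column {column} has type {column_type}, "
            "but timestamp data is configured to be stored as timestamps."
        )


if __name__ == "__main__":
    # A matching configuration passes silently.
    check_timestamp_column("events", "created_at", "TIMESTAMP_NTZ(9)", False)
    try:
        check_timestamp_column("events", "created_at", "VARCHAR(16777216)", False)
    except ValueError as err:
        print(err)
```

Including the table and column names in the message makes a mismatch easier to trace when several tables are written in one run.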

Right now I have deprecation warnings in place so that we can remove the string conversion at some point. I'm not sure what version to choose as the deprecation version, or if committing to deprecating makes sense right now.

I also added documentation that explains the behavior of the configuration value and provides SQL commands for migrating a column from string to timestamp, for users who want to move from storing timestamp data as strings to using timezones.
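The exact SQL is in the added documentation; as a rough sketch of what such a migration could involve, the hypothetical helper below builds one plausible sequence of statements. The helper name, the temporary column suffix, the `TIMESTAMP_NTZ(9)` target type, and the use of `TO_TIMESTAMP_NTZ` without a format argument are all assumptions and may differ from the documented commands.

```python
from typing import List


def string_to_timestamp_migration(table: str, column: str) -> List[str]:
    """Build SQL statements that replace a string column with a timestamp column.

    Hypothetical sketch: identifiers are interpolated as-is, so only pass
    trusted table and column names.
    """
    tmp = f"{column}_TS"  # temporary column name (assumed not to exist yet)
    return [
        # 1. Add a new timestamp column next to the string column.
        f"ALTER TABLE {table} ADD COLUMN {tmp} TIMESTAMP_NTZ(9)",
        # 2. Parse the stored strings into the new column.
        f"UPDATE {table} SET {tmp} = TO_TIMESTAMP_NTZ({column})",
        # 3. Drop the old string column.
        f"ALTER TABLE {table} DROP COLUMN {column}",
        # 4. Give the new column the original name.
        f"ALTER TABLE {table} RENAME COLUMN {tmp} TO {column}",
    ]


if __name__ == "__main__":
    for statement in string_to_timestamp_migration("EVENTS", "CREATED_AT"):
        print(statement + ";")
```

Depending on how the strings were formatted when they were written, the parsing step may need an explicit format argument, so it is worth trying the statements on a copy of the table first.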

See #12190 for additional context

How I Tested These Changes



@erinkcochran87 erinkcochran87 left a comment


Content-wise, this looks good to me 👍

docs/content/integrations/snowflake/reference.mdx (outdated, resolved)
@jamiedemaria jamiedemaria force-pushed the jamie/snowflake-pandas/convert-timezone branch from c6f515a to 74458b4 Compare March 27, 2023 15:15


jamiedemaria commented Apr 3, 2023

@benpankow @sryza The last thing to decide for this PR is which version to list as the deprecation version for converting to strings. Some options:

dagster-snowflake 0.20.0 (or 0.21.0 etc)
dagster-snowflake 1.0.0
dagster 2.0.0
punt on a deprecation warning for now and add one in at another time

@jamiedemaria jamiedemaria force-pushed the jamie/snowflake-pandas/convert-timezone branch from 4391816 to fd3ec99 Compare April 10, 2023 15:38

@jamiedemaria

@benpankow @sryza I'd like to get this in for the 1.3 code freeze (so ideally merge tomorrow afternoon). I'm getting Buildkite passing after some nasty merge conflicts. Could you do a final review pass and let me know if you have opinions on a deprecation version? I lean toward punting or going with dagster-snowflake 1.0.0, but that's more of a gut feeling, so I'm happy to go another direction.


sryza commented Apr 10, 2023

I think we can punt on specifying a deprecation version


@sryza sryza left a comment


One last thing: I think it would be helpful for the errors to include the names of the tables. Otherwise, this looks great.

@jamiedemaria jamiedemaria force-pushed the jamie/snowflake-pandas/convert-timezone branch from 2378f8a to 1251aea Compare April 11, 2023 18:08

@jamiedemaria jamiedemaria merged commit c1bb754 into master Apr 12, 2023
@jamiedemaria jamiedemaria deleted the jamie/snowflake-pandas/convert-timezone branch April 12, 2023 17:31
3 participants