Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SIG-45543] return microsecond rather than nanosec in arrowToRecord() for timestamp tz/ntz/ltz to avoid future date overflow #180

Merged
merged 4 commits into from
Nov 7, 2023

Conversation

Yifeng-Sigma
Copy link

@Yifeng-Sigma Yifeng-Sigma commented Nov 6, 2023

Description

snowflakedb#910
The last piece that blocks arrow batch.
It's the "Year 2262" problem.
The root cause is the int64, which is the underlying data representation of arrow.Timestamp() is unable to support date beyond 2262 in nanosecond.
timestampNtzType, timestampLtzType, timestampTzType from snowflake can return a future date that can't be represented by int64 in nanosecond.

Since in Sigma system we use millisecond for now and there's microsecond WIP, we don't have any plans to support nanosecond. We can just transport microsecond in arrow batch to bypass this problem.

This only affects arrow batch path and doesn't have impact on production code.
I'll also make exporter/evaluator changes.

Question You Might Ask

Why we don't we use upstream fix?
There is no "real fix" to the problem. What upstream does is to add a validation check and return an error if overflow. However since we don't need nanosecond, we can return microsecond and bypass this year 2262 problem at all.

Checklist

  • Code compiles correctly
  • Run make fmt to fix inconsistent formats
  • Run make lint to get lint errors and fix all of them
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary

@Yifeng-Sigma Yifeng-Sigma force-pushed the yifeng/fix_sf_future_timestamp branch from 308f71d to 96f9c60 Compare November 6, 2023 22:29
@Yifeng-Sigma Yifeng-Sigma requested review from rajsigma and a team November 7, 2023 01:05
@Yifeng-Sigma Yifeng-Sigma changed the title [SIG-45543] return microsecond rather than nanosec in arrowToRecord() to avoid future date overflow [SIG-45543] return microsecond rather than nanosec in arrowToRecord() for timestamp tz/ntz/ltz to avoid future date overflow Nov 7, 2023
Copy link
Collaborator

@madisonchamberlain madisonchamberlain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM I will keep this change in mind for what is needed for to work on to have more parity with the non forked version

@Yifeng-Sigma
Copy link
Author

LGTM I will keep this change in mind for what is needed for to work on to have more parity with the non forked version

Thanks! We can chat more when you start the new connector work. Happy to help on the arrow batch part.

@Yifeng-Sigma Yifeng-Sigma merged commit 7bee10a into master Nov 7, 2023
3 of 4 checks passed
@Yifeng-Sigma Yifeng-Sigma deleted the yifeng/fix_sf_future_timestamp branch November 7, 2023 21:49
@github-actions github-actions bot locked and limited conversation to collaborators Nov 7, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants