Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] pandas_compat: fix conversion of datetime series #5547

Merged

Conversation

PrimozGodec
Copy link
Contributor

@PrimozGodec PrimozGodec commented Aug 10, 2021

Issue
  • Currently, dates and dimes are converted from pd.DataFrame to the Table via string date-time representation. Conversion between frame and table fails since some string formats that Pandas produce are not compatible with datetime.strftime and specified formats (e.g. pandas output time on nanosecond precision while DateTime only support reading of milliseconds precision max, and some other cases). It could be solvable via some string manipulation which is heavily error prune.
  • Current conversion is slow (20 s for 1M dates).
Description of changes
  • New implementation of the conversion that does not use string
  • Speedup (now 0.5 s for 1M dates)
  • Deprecating TimeVariable.utc_offset since it is only used in one place in Orange3-timeseries and can be implemented using timezone attribute. Storing utc_offset in TimeVariable is also not a good practice since two dates in the same timezone can have two different offsets depending on summer, wintertime or time zones changes in history. Remembering the timezone is more suitable.
Includes
  • Code changes
  • Tests
  • Documentation

@PrimozGodec PrimozGodec force-pushed the pandas_compat-fix-time-var-coversion branch 2 times, most recently from 25b7a48 to 84ad9e8 Compare August 10, 2021 13:42
@PrimozGodec PrimozGodec mentioned this pull request Aug 10, 2021
3 tasks
@PrimozGodec PrimozGodec force-pushed the pandas_compat-fix-time-var-coversion branch from 84ad9e8 to b2c7c75 Compare August 10, 2021 14:55
@PrimozGodec PrimozGodec changed the title pandas_compat: fix conversion of datetime series [FIX] pandas_compat: fix conversion of datetime series Aug 10, 2021
@PrimozGodec PrimozGodec force-pushed the pandas_compat-fix-time-var-coversion branch from 54d7414 to 2aa365a Compare August 11, 2021 08:50
@codecov
Copy link

codecov bot commented Aug 11, 2021

Codecov Report

Merging #5547 (672de2f) into master (77f53fe) will decrease coverage by 0.47%.
The diff coverage is 94.44%.

@@            Coverage Diff             @@
##           master    #5547      +/-   ##
==========================================
- Coverage   86.38%   85.90%   -0.48%     
==========================================
  Files         304      313       +9     
  Lines       61801    65353    +3552     
==========================================
+ Hits        53386    56143    +2757     
- Misses       8415     9210     +795     

if (dt_nonnat.dt.floor("d") == dt_nonnat).all():
# all times are 00:00:00.0 - pure date
return 1, 0
elif (dt_nonnat.dt.date == pd.Timestamp("now").date()).all():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does pandas automatically assume today's date if only time is present?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep. When pd.datetime() is called on column of times it will be transformed to datetime64 and all dates will be today's dates

@PrimozGodec PrimozGodec marked this pull request as draft August 13, 2021 08:10
@PrimozGodec PrimozGodec force-pushed the pandas_compat-fix-time-var-coversion branch 2 times, most recently from 6bb5083 to b177f81 Compare August 19, 2021 07:55
@PrimozGodec PrimozGodec marked this pull request as ready for review August 19, 2021 07:56
@PrimozGodec PrimozGodec force-pushed the pandas_compat-fix-time-var-coversion branch from b177f81 to 4ffed33 Compare August 19, 2021 08:01
@PrimozGodec PrimozGodec force-pushed the pandas_compat-fix-time-var-coversion branch from 4ffed33 to 672de2f Compare August 26, 2021 13:54
@lanzagar lanzagar merged commit 79ac616 into biolab:master Aug 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants