Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle datetime selection in cuDF interface #6407

Merged
merged 4 commits into from
Oct 15, 2024
Merged

Handle datetime selection in cuDF interface #6407

merged 4 commits into from
Oct 15, 2024

Conversation

philippjfr
Copy link
Member

cuDF does not properly support datetime comparisons so for now we have to cast the datetimes to integers to safely do the comparisons.

Copy link

codecov bot commented Oct 9, 2024

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 88.50%. Comparing base (f9c7f05) to head (f71ec45).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
holoviews/core/data/cudf.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6407      +/-   ##
==========================================
+ Coverage   88.48%   88.50%   +0.01%     
==========================================
  Files         323      323              
  Lines       68469    68531      +62     
==========================================
+ Hits        60588    60650      +62     
  Misses       7881     7881              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@philippjfr philippjfr added the type: bug Something isn't correct or isn't working label Oct 9, 2024
@hoxbro
Copy link
Member

hoxbro commented Oct 11, 2024

I'm not sure I follow why it does not support datetime selection:

image

import cudf
from datetime import datetime

cdf = cudf.DataFrame(cudf.date_range("2020-01-01", "2021-01-01")) 

cdf < datetime(2020, 2, 1)

Though, I can see this fails: cdf.values with NotImplementedError: DateTime Arrays is not yet implemented in cudf

@philippjfr
Copy link
Member Author

Hmm, internally we do perform the comparisons on the underlying arrays I think. Here we may have to change that.

@philippjfr
Copy link
Member Author

Here's the traceback I'm seeing:

  File "/home/philippjfr/development/holoviews/holoviews/core/data/cudf.py", line 263, in select
    selection_mask = cls.select_mask(dataset, selection)
  File "/home/philippjfr/development/holoviews/holoviews/core/data/cudf.py", line 213, in select_mask
    new_masks.append(start <= arr)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/indexed_frame.py", line 3500, in __array_ufunc__
    ret = super().__array_ufunc__(ufunc, method, *inputs, **kwargs)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/frame.py", line 1772, in __array_ufunc__
    return _array_ufunc(self, ufunc, method, inputs, kwargs)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/utils/utils.py", line 93, in _array_ufunc
    return getattr(obj, op)(other)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/mixins/mixin_factory.py", line 11, in wrapper
    return method(self, *args1, *args2, **kwargs1, **kwargs2)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/indexed_frame.py", line 3475, in _binaryop
    ColumnAccessor(type(self)._colwise_binop(operands, op)),
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/nvtx/nvtx.py", line 116, in inner
    result = func(*args, **kwargs)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/frame.py", line 1761, in _colwise_binop
    else getattr(operator, fn)(left_column, right_column)
TypeError: operand type(s) all returned NotImplemented from __array_ufunc__(<ufunc 'less_equal'>, '__call__', array(1709576447014621020), <cudf.core.column.datetime.DatetimeColumn object at 0x7f90ac5796c0>
[
  2024-03-04 16:59:57.899934720,
  2024-03-04 16:59:57.919931904,
  2024-03-04 16:59:57.939934720,
  2024-03-04 16:59:57.959939840,
  2024-03-04 16:59:57.979941376,
  2024-03-04 16:59:57.999938048,
  2024-03-04 16:59:58.019941376,
  2024-03-04 16:59:58.039938816,
  2024-03-04 16:59:58.059934976,
  2024-03-04 16:59:58.079938816,
  ...
  2024-03-04 22:05:02.024025856,
  2024-03-04 22:05:02.044026368,
  2024-03-04 22:05:02.064026880,
  2024-03-04 22:05:02.084027392,
  2024-03-04 22:05:02.104026880,
  2024-03-04 22:05:02.124027136,
  2024-03-04 22:05:02.144026368,
  2024-03-04 22:05:02.164026368,
  2024-03-04 22:05:02.184026368,
  2024-03-04 16:59:57.899934720
]
dtype: datetime64[ns]): 'ndarray', 'DatetimeColumn'

@philippjfr
Copy link
Member Author

Actually, sorry, the traceback is this:

  File "/home/philippjfr/development/holoviews/holoviews/core/data/cudf.py", line 261, in select
    selection_mask = cls.select_mask(dataset, selection)
  File "/home/philippjfr/development/holoviews/holoviews/core/data/cudf.py", line 212, in select_mask
    new_masks.append(start <= arr)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/indexed_frame.py", line 3500, in __array_ufunc__
    ret = super().__array_ufunc__(ufunc, method, *inputs, **kwargs)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/frame.py", line 1772, in __array_ufunc__
    return _array_ufunc(self, ufunc, method, inputs, kwargs)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/utils/utils.py", line 93, in _array_ufunc
    return getattr(obj, op)(other)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/mixins/mixin_factory.py", line 11, in wrapper
    return method(self, *args1, *args2, **kwargs1, **kwargs2)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/indexed_frame.py", line 3475, in _binaryop
    ColumnAccessor(type(self)._colwise_binop(operands, op)),
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/nvtx/nvtx.py", line 116, in inner
    result = func(*args, **kwargs)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/frame.py", line 1761, in _colwise_binop
    else getattr(operator, fn)(left_column, right_column)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/mixins/mixin_factory.py", line 11, in wrapper
    return method(self, *args1, *args2, **kwargs1, **kwargs2)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/column/datetime.py", line 405, in _binaryop
    other = self._wrap_binop_normalization(other)
  File "/home/philippjfr/miniconda3/envs/cuda_12.3/lib/python3.10/site-packages/cudf/core/column/column.py", line 609, in _wrap_binop_normalization
    other = other.dtype.type(other.item())
ValueError: Converting an integer to a NumPy datetime requires a specified unit

@hoxbro
Copy link
Member

hoxbro commented Oct 14, 2024

What is your cudf version?

@philippjfr
Copy link
Member Author

Also thought it was that, but just tested with 2024.10.1 and still see the first error I posted.

@hoxbro
Copy link
Member

hoxbro commented Oct 15, 2024

I'm having a hard time creating a DatetimeColumn. Would you happen to have a small example?

@philippjfr
Copy link
Member Author

philippjfr commented Oct 15, 2024

It's ordering dependent!

cudf.Series < np.datetime64 works.

np.datetime64 > cudf.Series does not.

I can just flip the conditions.

@philippjfr
Copy link
Member Author

Reproducer

np.datetime64('2024-03-04T18:24:35.670870310') < cudf.Series([np.datetime64('2024-03-04T18:24:35.67')])

holoviews/core/data/cudf.py Outdated Show resolved Hide resolved
@hoxbro hoxbro enabled auto-merge (squash) October 15, 2024 10:47
@hoxbro hoxbro merged commit a3abfcd into main Oct 15, 2024
14 checks passed
@hoxbro hoxbro deleted the cudf_dt_selection branch October 15, 2024 10:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Something isn't correct or isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants