
[MNT] Make aeon numpy compatible with both =>2.0 and <2.0 #2216

Merged
merged 47 commits into from
Oct 21, 2024

TonyBagnall
Contributor

@TonyBagnall TonyBagnall commented Oct 16, 2024

Reference Issues/PRs

closes #2218

test numpy2 compliance over simple fixes

Stray np.Inf -> np.inf (the np.Inf alias was removed in numpy 2.0)
Catch22: change the list to an array prior to the call to nan_to_num

        c22_array = np.array(c22_list)
        if self.replace_nans:
            c22_array = np.nan_to_num(c22_array, False, 0, 0, 0)
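For context, a minimal sketch of the Catch22 change. It assumes the numpy 2 failure comes from passing a plain list together with copy=False, which numpy 2 reads as "never copy" even though a list cannot be turned into an array without one. The function name clean_features and the sample values are hypothetical, and keyword arguments replace the positional ones only for readability.

```python
import numpy as np


def clean_features(values, replace_nans=True):
    """Convert a list of feature values to an array, optionally zeroing NaN/inf.

    Hypothetical helper for illustration; not the aeon implementation.
    """
    # Build the array first so that copy=False below operates on an ndarray.
    arr = np.array(values, dtype=float)
    if replace_nans:
        # In-place replacement: NaN, +inf and -inf all become 0.
        arr = np.nan_to_num(arr, copy=False, nan=0.0, posinf=0.0, neginf=0.0)
    return arr


cleaned = clean_features([1.0, float("nan"), float("inf"), -float("inf")])
print(cleaned)
```

Converting first should behave the same under numpy 1.x and 2.x, because nan_to_num then receives an array it can modify in place.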

performance_metrics.forecasting._functions

There is a difference in the docstring tests: under numpy 2, scalar outputs appear as np.float64(x), not just x.

I have stripped out the expected output. Under numpy 2 the docstring test expects np.float64(0.0003), but that does not work with earlier numpy versions, where just 0.0003 is expected. The numbers are exactly the same.
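As a possible alternative to stripping the expected output, a docstring example can emit text that is identical under both versions. The sketch below only illustrates the idea with a hypothetical value; it is not what this PR does.

```python
import numpy as np

value = np.float64(0.0003)

# repr() is what a docstring test sees for a bare expression, and it is
# version dependent: "0.0003" on numpy 1.x, "np.float64(0.0003)" on numpy 2.x.
version_dependent = repr(value)

# Converting to a builtin float, or using str()/print(), gives the same
# text under both versions.
as_builtin = repr(float(value))
as_printed = str(value)

print(version_dependent, as_builtin, as_printed)
```

In a docstring this would mean writing `>>> float(metric(...))` or `>>> print(metric(...))` instead of a bare `>>> metric(...)`, at the cost of slightly noisier examples.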

@TonyBagnall TonyBagnall added the maintenance Continuous integration, unit testing & package distribution label Oct 16, 2024
@aeon-actions-bot
Contributor

aeon-actions-bot bot commented Oct 16, 2024

Thank you for contributing to aeon

I did not find any labels to add based on the title. Please add the [ENH], [MNT], [BUG], [DOC], [REF], [DEP] and/or [GOV] tags to your pull requests titles. For now you can add the labels manually.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Push an empty commit to re-run CI checks

@TonyBagnall TonyBagnall changed the title Ajb/numpy2 [MNT] Remove numpy bound for 2.0 test Oct 16, 2024
@TonyBagnall TonyBagnall added full pytest actions Run the full pytest suite on a PR codecov actions Run the codecov action on a PR labels Oct 16, 2024
@aeon-actions-bot aeon-actions-bot bot removed full pytest actions Run the full pytest suite on a PR codecov actions Run the codecov action on a PR labels Oct 16, 2024
@TonyBagnall TonyBagnall added full pytest actions Run the full pytest suite on a PR codecov actions Run the codecov action on a PR labels Oct 16, 2024
@TonyBagnall
Contributor Author

OK, so I have isolated another weird difference, arising in the test function test_stomp_squared_matrix_profile.

This code

import numpy as np
expected = np.array([[18.00000,9.00000,6.00000,9.00000,18.00000,33.00000],
[19.00000,6.00000,5.00000,9.00000,6.00000,5.00000]])
id_bests = np.vstack(
    np.unravel_index(np.argsort(expected.ravel()), expected.shape)
).T

print(id_bests)

produces different output under numpy 1.26 and numpy 2.0, I assume due to tie breaking.
1.26 output
[[1 2]
 [1 5]
 [0 2]
 [1 1]
 [1 4]
 [0 1]
 [0 3]
 [1 3]
 [0 0]
 [0 4]
 [1 0]
 [0 5]]

2.0 output
[[1 5]
 [1 2]
 [0 2]
 [1 1]
 [1 4]
 [1 3]
 [0 3]
 [0 1]
 [0 4]
 [0 0]
 [1 0]
 [0 5]]

@TonyBagnall
Contributor Author

TonyBagnall commented Oct 16, 2024

Reminds me of the shapelet argsort issue under numba/no numba

see #622

@TonyBagnall
Contributor Author

TonyBagnall commented Oct 16, 2024

simplified further

import numpy as np

x = [18.0, 9.0, 6.0, 9.0, 18.0, 33.0, 19.0, 6.0, 5.0, 9.0, 6.0, 5.0]
args = np.argsort(x)

gives different results in args under different numpy versions (and previously with and without numba, see #622). This is a very minor issue in one test, but we do use argsort quite a lot, so it might be time to implement our own version ...


@TonyBagnall
Contributor Author

ah they have added a "stable" argument! Problem solved :)
https://numpy.org/doc/2.0/reference/generated/numpy.argsort.html
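A short sketch of the stable sort on the simplified list from the comment above. It uses kind="stable", which is also accepted by numpy 1.x; the separate stable=True keyword is documented as new in numpy 2.0, so it would not be backward compatible.

```python
import numpy as np

x = [18.0, 9.0, 6.0, 9.0, 18.0, 33.0, 19.0, 6.0, 5.0, 9.0, 6.0, 5.0]

# A stable sort keeps tied values in their original index order,
# so the result no longer depends on the default sort implementation.
args = np.argsort(x, kind="stable")

print(args.tolist())
# [8, 11, 2, 7, 10, 1, 3, 9, 0, 4, 6, 5]
```

The tied values 5, 6, 9 and 18 each come out in ascending index order, which matches the 1.26 output quoted earlier once the flat indices are unravelled to shape (2, 6).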

Member

@MatthewMiddlehurst MatthewMiddlehurst left a comment


Can't say I'm a fan of just removing all the expected outputs

@TonyBagnall
Contributor Author

Can't say I'm a fan of just removing all the expected outputs

If you don't, you cannot be backward compatible, as far as I can tell.

@aeon-actions-bot aeon-actions-bot bot added the full examples run Run all examples on a PR label Oct 17, 2024
@SebastianSchmidl
Member

SebastianSchmidl commented Oct 17, 2024

The removed outputs are mostly in the performance metrics module. IMO, it has lower priority, so I don't care as much.

The problem is that the examples are checked using a string-comparison, right? Because np.float64(3.2) == 3.2 should return True:

  • NumPy 1.x: (screenshot not preserved)

  • NumPy 2.x: (screenshot not preserved)
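A small sketch of the point being made here, using the standard library's doctest.OutputChecker to show that the comparison is on text, not on values. The want/got strings are hard-coded stand-ins for what each numpy version would print, so the snippet runs the same under either version.

```python
import doctest

import numpy as np

checker = doctest.OutputChecker()

want = "3.2\n"                    # expected output written in the docstring
got_numpy1 = "3.2\n"              # text a numpy 1.x scalar repr produces
got_numpy2 = "np.float64(3.2)\n"  # text a numpy 2.x scalar repr produces

# With no option flags, check_output succeeds only on an exact text match.
matches_numpy1 = checker.check_output(want, got_numpy1, 0)
matches_numpy2 = checker.check_output(want, got_numpy2, 0)

# The values themselves compare equal under both versions.
values_equal = bool(np.float64(3.2) == 3.2)

print(matches_numpy1, matches_numpy2, values_equal)
```

So the value comparison holds, but a docstring example with a bare scalar result can match only one of the two reprs.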

@TonyBagnall
Contributor Author

The removed outputs are mostly in the performance metrics module. IMO, it has lower priority, so I don't care as much.

The problem is that the examples are checked using a string-comparison, right? Because np.float64(3.2) == 3.2 should return True.

Agree. I think we skip the ones we care about and just don't print the output for the performance metrics module. I don't know how the docstring tests run the comparison, and I'm not really motivated to find out.

@TonyBagnall
Contributor Author

aeon\similarity_search\matrix_profiles\tests\test_stomp.py:45 (test_stomp_squared_matrix_profile[3-int64])
dtype = 'int64', k = 3

    @pytest.mark.parametrize("dtype", DATATYPES)
    @pytest.mark.parametrize("k", K_VALUES)
    def test_stomp_squared_matrix_profile(dtype, k):
        """Test naive series search."""
        X = np.asarray(
            [[[1, 2, 3, 4, 5, 6, 7, 8]], [[1, 2, 4, 4, 5, 6, 5, 4]]], dtype=dtype
        )
    
        S = np.asarray([[3, 4, 5, 4, 3, 4, 5, 3, 2, 4, 5]], dtype=dtype)
        L = 3
        mask = np.ones((X.shape[0], X.shape[2] - L + 1), dtype=bool)
        distance = get_distance_function("squared")
        mp, ip = stomp_squared_matrix_profile(X, S, L, mask, k=k)
        for i in range(S.shape[-1] - L + 1):
            q = S[:, i : i + L]
    
            expected = np.array(
                [
                    [distance(q, X[j, :, _i : _i + L]) for _i in range(X.shape[-1] - L + 1)]
                    for j in range(X.shape[0])
                ]
            )
            id_bests = np.vstack(
                np.unravel_index(
                    np.argsort(expected.ravel(), kind="stable"), expected.shape
                )
            ).T
    
            for j in range(k):
                assert_almost_equal(mp[i][j], expected[id_bests[j, 0], id_bests[j, 1]])
>               assert_equal(ip[i][j], id_bests[j])

L          = 3
S          = array([[3, 4, 5, 4, 3, 4, 5, 3, 2, 4, 5]])
X          = array([[[1, 2, 3, 4, 5, 6, 7, 8]],

       [[1, 2, 4, 4, 5, 6, 5, 4]]])
distance   = CPUDispatcher(<function squared_distance at 0x000002D04F561820>)
dtype      = 'int64'
expected   = array([[20., 11.,  8., 11., 20., 35.],
       [21., 10.,  5., 11.,  8.,  3.]])
i          = 2
id_bests   = array([[1, 5],
       [1, 2],
       [0, 2],
       [1, 4],
       [1, 1],
       [0, 1],
       [0, 3],
       [1, 3],
       [0, 0],
       [0, 4],
       [1, 0],
       [0, 5]])
ip         = array([array([[0, 2],
              [1, 2],
              [1, 1]]), array([[1, 2],
                               [0, 2],
                               [1, 4]]), array([[1, 5],
                                                [1, 2],
                                                [1, 4]]), array([[1, 2],
                                                                 [0, 2],
                                                                 [0, 1]]),
       array([[0, 2],
              [1, 2],
              [1, 1]]), array([[1, 2],
                               [1, 5],
                               [1, 1]]), array([[1, 5],
                                                [1, 2],
                                                [0, 1]]), array([[0, 1],
                                                                 [1, 0],
                                                                 [0, 0]]),
       array([[0, 2],
              [1, 1],
              [0, 1]])], dtype=object)
j          = 2
k          = 3
mask       = array([[ True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True]])
mp         = array([array([0., 1., 2.]), array([2., 3., 3.]), array([3., 5., 8.]),
       array([2., 3., 4.]), array([0., 1., 2.]), array([5., 5., 6.]),
       array([ 9., 11., 13.]), array([2., 4., 5.]), array([1., 1., 2.])],
      dtype=object)
q          = array([[5, 4, 3]])

test_stomp.py:76: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\..\..\..\venv2\lib\site-packages\numpy\_utils\__init__.py:85: in wrapper
    return fun(*args, **kwargs)
        args       = (array([1, 4]), array([0, 2]), '', True)
        dep_version = '2.0.0'
        fun        = <function assert_array_equal at 0x000002D04D36CD30>
        kwargs     = {'strict': False}
        new_name   = 'desired'
        new_names  = ['actual', 'desired']
        old_name   = 'y'
        old_names  = ['x', 'y']
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<built-in function eq>, array([1, 4]), array([0, 2]))
kwds = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError: 
E           Arrays are not equal
E           
E           Mismatched elements: 2 / 2 (100%)
E           Max absolute difference among violations: 2
E           Max relative difference among violations: 1.
E            ACTUAL: array([1, 4])
E            DESIRED: array([0, 2])

args       = (<built-in function eq>, array([1, 4]), array([0, 2]))
func       = <function assert_array_compare at 0x000002D04D36CB80>
kwds       = {'err_msg': '',
 'header': 'Arrays are not equal',
 'strict': False,
 'verbose': True}
self       = <contextlib._GeneratorContextManager object at 0x000002D04D374220>

C:\Users\Tony\AppData\Local\Programs\Python\Python39\lib\contextlib.py:79: AssertionError


PASSED [ 73%]PASSED [ 80%]FAILED       [ 86%]
aeon\similarity_search\matrix_profiles\tests\test_stomp.py:45 (test_stomp_squared_matrix_profile[3-float64])
dtype = 'float64', k = 3

    @pytest.mark.parametrize("dtype", DATATYPES)
    @pytest.mark.parametrize("k", K_VALUES)
    def test_stomp_squared_matrix_profile(dtype, k):
        """Test naive series search."""
        X = np.asarray(
            [[[1, 2, 3, 4, 5, 6, 7, 8]], [[1, 2, 4, 4, 5, 6, 5, 4]]], dtype=dtype
        )
    
        S = np.asarray([[3, 4, 5, 4, 3, 4, 5, 3, 2, 4, 5]], dtype=dtype)
        L = 3
        mask = np.ones((X.shape[0], X.shape[2] - L + 1), dtype=bool)
        distance = get_distance_function("squared")
        mp, ip = stomp_squared_matrix_profile(X, S, L, mask, k=k)
        for i in range(S.shape[-1] - L + 1):
            q = S[:, i : i + L]
    
            expected = np.array(
                [
                    [distance(q, X[j, :, _i : _i + L]) for _i in range(X.shape[-1] - L + 1)]
                    for j in range(X.shape[0])
                ]
            )
            id_bests = np.vstack(
                np.unravel_index(
                    np.argsort(expected.ravel(), kind="stable"), expected.shape
                )
            ).T
    
            for j in range(k):
                assert_almost_equal(mp[i][j], expected[id_bests[j, 0], id_bests[j, 1]])
>               assert_equal(ip[i][j], id_bests[j])

L          = 3
S          = array([[3., 4., 5., 4., 3., 4., 5., 3., 2., 4., 5.]])
X          = array([[[1., 2., 3., 4., 5., 6., 7., 8.]],

       [[1., 2., 4., 4., 5., 6., 5., 4.]]])
distance   = CPUDispatcher(<function squared_distance at 0x000002D04F561820>)
dtype      = 'float64'
expected   = array([[20., 11.,  8., 11., 20., 35.],
       [21., 10.,  5., 11.,  8.,  3.]])
i          = 2
id_bests   = array([[1, 5],
       [1, 2],
       [0, 2],
       [1, 4],
       [1, 1],
       [0, 1],
       [0, 3],
       [1, 3],
       [0, 0],
       [0, 4],
       [1, 0],
       [0, 5]])
ip         = array([array([[0, 2],
              [1, 2],
              [1, 1]]), array([[1, 2],
                               [0, 2],
                               [1, 4]]), array([[1, 5],
                                                [1, 2],
                                                [1, 4]]), array([[1, 2],
                                                                 [0, 2],
                                                                 [0, 1]]),
       array([[0, 2],
              [1, 2],
              [1, 1]]), array([[1, 2],
                               [1, 5],
                               [1, 1]]), array([[1, 5],
                                                [1, 2],
                                                [0, 1]]), array([[0, 1],
                                                                 [1, 0],
                                                                 [0, 0]]),
       array([[0, 2],
              [1, 1],
              [0, 1]])], dtype=object)
j          = 2
k          = 3
mask       = array([[ True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True]])
mp         = array([array([0., 1., 2.]), array([2., 3., 3.]), array([3., 5., 8.]),
       array([2., 3., 4.]), array([0., 1., 2.]), array([5., 5., 6.]),
       array([ 9., 11., 13.]), array([2., 4., 5.]), array([1., 1., 2.])],
      dtype=object)
q          = array([[5., 4., 3.]])

test_stomp.py:76: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\..\..\..\venv2\lib\site-packages\numpy\_utils\__init__.py:85: in wrapper
    return fun(*args, **kwargs)
        args       = (array([1, 4]), array([0, 2]), '', True)
        dep_version = '2.0.0'
        fun        = <function assert_array_equal at 0x000002D04D36CD30>
        kwargs     = {'strict': False}
        new_name   = 'desired'
        new_names  = ['actual', 'desired']
        old_name   = 'y'
        old_names  = ['x', 'y']
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<built-in function eq>, array([1, 4]), array([0, 2]))
kwds = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError: 
E           Arrays are not equal
E           
E           Mismatched elements: 2 / 2 (100%)
E           Max absolute difference among violations: 2
E           Max relative difference among violations: 1.
E            ACTUAL: array([1, 4])
E            DESIRED: array([0, 2])

args       = (<built-in function eq>, array([1, 4]), array([0, 2]))
func       = <function assert_array_compare at 0x000002D04D36CB80>
kwds       = {'err_msg': '',
 'header': 'Arrays are not equal',
 'strict': False,
 'verbose': True}
self       = <contextlib._GeneratorContextManager object at 0x000002D04D374220>

C:\Users\Tony\AppData\Local\Programs\Python\Python39\lib\contextlib.py:79: AssertionError
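Both failures above come down to a tie: at i = 2 the distance 8 occurs at both [0, 2] and [1, 4], so the stable argsort in the test picks [0, 2] while the function under test returns [1, 4]. One hedged option, not necessarily what this PR ended up doing, is to compare distances rather than indices when ties are possible. The sketch below copies the values from the traceback; the variable names are hypothetical.

```python
import numpy as np

# Values copied from the traceback above (query index i = 2, k = 3).
expected = np.array(
    [[20.0, 11.0, 8.0, 11.0, 20.0, 35.0], [21.0, 10.0, 5.0, 11.0, 8.0, 3.0]]
)
returned_idx = np.array([[1, 5], [1, 2], [1, 4]])  # ip[2] in the traceback
returned_dist = np.array([3.0, 5.0, 8.0])  # mp[2] in the traceback
k = 3

# The k smallest distances, regardless of which tied index produced them.
best_dist = np.sort(expected.ravel(), kind="stable")[:k]

# Distances found at the indices the function actually returned.
dist_at_returned = expected[returned_idx[:, 0], returned_idx[:, 1]]

distances_match = bool(
    np.allclose(returned_dist, best_dist)
    and np.allclose(dist_at_returned, best_dist)
)
# Guard against the same position being reported twice.
indices_unique = len({tuple(idx) for idx in returned_idx.tolist()}) == k

print(distances_match, indices_unique)
```

This accepts either tied index as long as the reported distances are the k smallest, which avoids depending on any particular tie-breaking order.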

@TonyBagnall TonyBagnall merged commit 5ad454a into main Oct 21, 2024
20 checks passed
@TonyBagnall TonyBagnall deleted the ajb/numpy2 branch October 21, 2024 19:44
Labels
codecov actions Run the codecov action on a PR full examples run Run all examples on a PR full pytest actions Run the full pytest suite on a PR maintenance Continuous integration, unit testing & package distribution
Development

Successfully merging this pull request may close these issues.

[ENH] Numpy 2 compatibility
3 participants