Set up tox-conda/pytest #1

Merged: 15 commits, Nov 22, 2024
Conversation

@ajjackson (Collaborator) commented Nov 7, 2024

  • Set build-backend so isolated pip install works
  • Add pychop as testing dependency
  • Configure tox-conda
  • Create __init__.py files under tests/ so that all tests are found by pytest

We need to use conda to install Mantid and check results against Abins functions. Tox-conda is stuck on tox v3; there is a branch to support v4, but it's not clear if or when this will be finished. We might also be able to drop the conda component of these tests once we are satisfied with the validation against Mantid: it can then be replaced with tests against reference values.

Perhaps we don't need the test dependencies in pyproject.toml at all, now that this is all defined in tox.ini (see the sketch below). But they could be useful for running the tests without tox if desired.
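
For reference, a minimal sketch of the kind of tox-conda configuration this describes; the environment names, channels and dependency list here are assumptions for illustration, not the repository's actual tox.ini:

    # Hypothetical tox.ini sketch (env names/channels/deps are assumptions)
    [tox]
    envlist = py310
    requires = tox-conda  # ties us to tox v3 until tox-conda gains v4 support

    [testenv]
    # tox-conda extensions: pull Mantid (which provides Abins) from conda
    conda_channels =
        mantid
        conda-forge
    conda_deps =
        mantid
    # ordinary pip-installed test dependencies
    deps =
        pytest
        pychop
    commands =
        pytest {posargs}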

@ajjackson (Collaborator, Author):

This is a decent start; the tests seem to be running correctly. Lots are failing, but we expected that.

- Use reservoir sampling to efficiently grab a subset of Fermi chopper
  frequency combinations (see the sketch below)
- Simplify error matching with the match= argument
- Get ids on the fly with a function rather than precomputing them
- Defer string ids to a function run by the test
- Replace explicit list-extending with itertools.chain

The sweep of incident energies is cut down by some more (seeded)
random sampling.
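
As an illustration of the sampling and id-generation approach (a minimal sketch, not the PR's actual code; reservoir_sample, ENERGIES, FREQUENCIES and ef_id are placeholder names):

    import itertools
    import random

    import pytest

    def reservoir_sample(iterable, k, seed=42):
        """Algorithm R: draw k items uniformly from an iterable of unknown length."""
        rng = random.Random(seed)  # seeded so the test selection is reproducible
        reservoir = []
        for i, item in enumerate(iterable):
            if i < k:
                reservoir.append(item)
            else:
                j = rng.randint(0, i)  # inclusive bounds
                if j < k:
                    reservoir[j] = item
        return reservoir

    ENERGIES = [10, 50, 100, 500]      # placeholder incident energies (meV)
    FREQUENCIES = [50, 150, 300, 450]  # placeholder chopper frequencies (Hz)

    EF_MATRIX = reservoir_sample(itertools.product(ENERGIES, FREQUENCIES), k=6)

    def ef_id(matrix):
        """Build the string id on the fly instead of precomputing a list."""
        energy, frequency = matrix
        return f"ei={energy},f={frequency}"

    @pytest.mark.parametrize("matrix", EF_MATRIX, ids=ef_id)
    def test_sampled_combination(matrix):
        energy, frequency = matrix
        assert energy > 0 and frequency > 0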

At the moment we have some cases with invalid chopper frequencies;
make sure the error is checked robustly when handling these.

Ideally we should have smarter test-case generation and check the
errors in a separate test.

These aren't currently being checked in CI because tox takes care of
it, so they can easily become stale. There must be a better way.
@ajjackson marked this pull request as ready for review, November 8, 2024 15:30
@ajjackson (Collaborator, Author):

There's more work to do on test coverage / legibility / efficiency, but this at least gets some CI up and running without too much complexity or runtime. It would be good to fix the failing tests next!

@RastislavTuranyi (Collaborator):

The MAPS and MARI integration tests against AbINS that are failing in CI are all passing on my local machine.

The tests seem to be failing due to NoTransmissionError, but that case is expected (it should go into the except block). The interesting thing is that AbINS is not raising a LinAlgError for the invalid combinations of energy and frequency, which causes our strategy to fail. Maybe the CI is using a different version of Mantid? Have there been any relevant changes to AbINS that alter the behaviour the tests expect?

    try:
        # Reset cached polynomial fits so calculate_sigma recomputes from scratch
        abins._polyfits = {}
        expected = abins.calculate_sigma(frequencies * MEV_TO_WAVENUMBER) * WAVENUMBER_TO_MEV
    except LinAlgError:
        # AbINS could not fit these settings; our implementation should refuse
        # them too, by raising NoTransmissionError
        with pytest.raises(NoTransmissionError):
            rf_2d.get_resolution_function('PyChop_fit', chopper_package=setting,
                                          e_init=energy, chopper_frequency=chopper_frequency)
    else:
        # Valid settings: compare our resolution function against AbINS
        rf = rf_2d.get_resolution_function('PyChop_fit', chopper_package=setting,
                                           e_init=energy, chopper_frequency=chopper_frequency)
        actual = rf(frequencies)

        assert_allclose(actual, expected, rtol=1e-5)

The failing Lagrange tests are due to the issue in AbINS where the cut-off values have a unit-conversion bug, resulting in incorrect low-energy values.

@ajjackson (Collaborator, Author):

Presumably that means no LinAlgError is raised first. LinAlgError comes from numpy, so perhaps it depends on the numpy/scipy version?

I get the same errors when running with tox on a Rocky 8 VM, so there isn't anything weird going on with the GitHub runner.

I have tried Mantid versions 6.9, 6.10 and 6.11; 6.9 segfaults (probably numpy trouble) and the two more recent versions give the same MAPS/MARI errors.

@ajjackson (Collaborator, Author) commented Nov 11, 2024

In Mantid, if I try to call calculate_sigma() at invalid energy/chopper settings I get NaN back; it doesn't raise an error at this point. How do you get the LinAlgError to appear?

(It probably should; that would give better user feedback.)

[Screenshot from 2024-11-11 10:10 showing calculate_sigma() returning NaN]

@RastislavTuranyi (Collaborator) commented Nov 11, 2024

I get this in tests without the except block (i.e. np.polyfit raising a LinAlgError):

tests\integration_tests\test_pychop.py:57 (test_against_abins[MAPS_A-ei=50,f=450])
matrix = (50, 450)
abins_rf_2d = (<abins.instruments.pychop.PyChopInstrument object at 0x000001E292EC78E0>, Instrument(name='MAPS', version='MAPS', mod...'tjit': 0.0}, 'S': {'pslit': 0.002899, 'radius': 0.049, 'rho': 1.3, 'tjit': 0.0}}}}}, default_model='PyChop_fit'), 'A')

    @pytest.mark.parametrize('matrix', EF_MATRIX, ids=EF_IDS)
    def test_against_abins(matrix, abins_rf_2d,):
        abins, rf_2d, setting = abins_rf_2d
>       _test_against_abins(abins, rf_2d, setting, matrix)

integration_tests\test_pychop.py:61: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
integration_tests\test_pychop.py:73: in _test_against_abins
    expected = abins.calculate_sigma(frequencies * MEV_TO_WAVENUMBER) * WAVENUMBER_TO_MEV
..\..\..\..\AppData\Local\miniconda3\envs\rfd\Library\scripts\abins\instruments\pychop.py:53: in calculate_sigma
    self._polyfit_resolution()
..\..\..\..\AppData\Local\miniconda3\envs\rfd\Library\scripts\abins\instruments\pychop.py:66: in _polyfit_resolution
    fit = np.polyfit(frequencies_invcm, resolution_sigma * MILLI_EV_TO_WAVENUMBER, order)
..\..\..\..\AppData\Local\miniconda3\envs\rfd\lib\site-packages\numpy\lib\polynomial.py:669: in polyfit
    c, resids, rank, s = lstsq(lhs, rhs, rcond)
..\..\..\..\AppData\Local\miniconda3\envs\rfd\lib\site-packages\numpy\linalg\linalg.py:2326: in lstsq
    x, resids, rank, s = gufunc(a, b, rcond, signature=signature, extobj=extobj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

err = 'invalid value', flag = 8

    def _raise_linalgerror_lstsq(err, flag):
>       raise LinAlgError("SVD did not converge in Linear Least Squares")
E       numpy.linalg.LinAlgError: SVD did not converge in Linear Least Squares

..\..\..\..\AppData\Local\miniconda3\envs\rfd\lib\site-packages\numpy\linalg\linalg.py:124: LinAlgError

We are seeing some interesting inconsistencies in test results between
Rastislav's local Windows installation and my Linux/tox setup.
@RastislavTuranyi (Collaborator):

It could indeed be a Windows issue (e.g. this numpy problem caused by a bug in Windows, which should have been resolved, but something similar could be happening again?).

In either case, we could add a check for NaNs to the else branch:

    try:
        abins._polyfits = {}
        expected = abins.calculate_sigma(frequencies * MEV_TO_WAVENUMBER) * WAVENUMBER_TO_MEV
    except LinAlgError:
        with pytest.raises(NoTransmissionError):
            rf_2d.get_resolution_function('PyChop_fit', chopper_package=setting,
                                          e_init=energy, chopper_frequency=chopper_frequency)
    else:
        # AbINS signalled the invalid settings with NaN instead of LinAlgError;
        # we should still raise NoTransmissionError in that case
        if np.any(np.isnan(expected)):
            with pytest.raises(NoTransmissionError):
                rf_2d.get_resolution_function('PyChop_fit', chopper_package=setting,
                                              e_init=energy, chopper_frequency=chopper_frequency)
            return

        rf = rf_2d.get_resolution_function('PyChop_fit', chopper_package=setting,
                                           e_init=energy, chopper_frequency=chopper_frequency)
        actual = rf(frequencies)

        assert_allclose(actual, expected, rtol=1e-5)

We know some tests are failing regardless, but want to see what
happens on Windows!
@ajjackson (Collaborator, Author):

Whether it is a "bug" or not, on our side it is probably safest not to assume that NaN values will always be treated the same way across math libraries and operating systems.

@ajjackson (Collaborator, Author):

Yup, on Windows there are only the Lagrange failures, so it is taking different paths depending on platform/libraries 😭

When testing on Linux, Abins returns NaN rather than raising LinAlgError
when the chopper settings are out of range. Our test should detect
both cases and ensure the appropriate behaviour in each situation
(i.e. raising NoTransmissionError).
@RastislavTuranyi (Collaborator):

PyChop issues a warning (warnings.warn) when it returns a NaN due to no transmission; we could take advantage of that:

    abins._polyfits = {}

    with warnings.catch_warnings():
        # Promote PyChop's no-transmission warning to an error so it can be
        # caught like the LinAlgError case (note the raw string for the regex)
        warnings.filterwarnings('error', message=r'PyChop: tchop\(\): No transmission.*')
        try:
            expected = abins.calculate_sigma(frequencies * MEV_TO_WAVENUMBER) * WAVENUMBER_TO_MEV
        except Warning:
            with pytest.raises(NoTransmissionError):
                rf_2d.get_resolution_function('PyChop_fit', chopper_package=setting,
                                              e_init=energy, chopper_frequency=chopper_frequency)
        else:
            rf = rf_2d.get_resolution_function('PyChop_fit', chopper_package=setting,
                                               e_init=energy, chopper_frequency=chopper_frequency)
            actual = rf(frequencies)

            assert_allclose(actual, expected, rtol=1e-5)

@ajjackson (Collaborator, Author):

The NaN check is working; I will give the warning approach a try and see which is nicer.

This is a bit cleaner than having separate checks for different
OS/library situations.
There is a unit-conversion bug in the low-frequency limit, so this
doesn't kick in at the right place in Abins. When that is fixed we can
de-skip the test and verify it here.
@ajjackson (Collaborator, Author) commented Nov 11, 2024

Warning-based logic looks a bit cleaner to me and seems to work across platforms 👍

I have set the low-frequency Abins-Lagrange comparison to skip, as we know that one is suspect. Higher frequencies are OK, but I needed to slightly loosen the assert_allclose tolerance.

I'd still like to see some of these parameter sweeps collapsed into single test cases, but this seems useful enough to merge now?
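
(For reference, the skip-and-loosen pattern is roughly as follows; the test names, reason string and tolerance values are placeholders, not the actual code:)

    import pytest
    from numpy.testing import assert_allclose

    # Placeholder test: skip the known-suspect comparison until the Abins
    # unit-conversion bug is fixed upstream.
    @pytest.mark.skip(reason="Abins low-frequency cut-off has a unit-conversion bug")
    def test_lagrange_low_frequency():
        ...

    def test_lagrange_high_frequency():
        # Slightly loosened tolerance for the higher-frequency comparison
        assert_allclose([1.00001, 2.00002], [1.0, 2.0], rtol=1e-4)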

@RastislavTuranyi merged commit 4bf9324 into main, Nov 22, 2024
2 checks passed