Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drop python 3.7 and testing improvements #249

Merged
merged 8 commits into from
Jun 30, 2023
Merged

drop python 3.7 and testing improvements #249

merged 8 commits into from
Jun 30, 2023

Conversation

orbeckst
Copy link
Member

@orbeckst orbeckst commented Jun 27, 2023

Generic housekeeping PR

drop Python 3.7

make tests/code more robust

cleanup & updates

@codecov
Copy link

codecov bot commented Jun 27, 2023

Codecov Report

Merging #249 (ab102ed) into develop (edc63b7) will increase coverage by 1.16%.
The diff coverage is 90.00%.

@@             Coverage Diff             @@
##           develop     #249      +/-   ##
===========================================
+ Coverage    79.53%   80.69%   +1.16%     
===========================================
  Files           15       15              
  Lines         1881     1906      +25     
  Branches       277      294      +17     
===========================================
+ Hits          1496     1538      +42     
+ Misses         293      276      -17     
  Partials        92       92              
Impacted Files Coverage Δ
mdpow/run.py 71.58% <84.61%> (+7.01%) ⬆️
mdpow/fep.py 68.77% <100.00%> (ø)
mdpow/workflows/base.py 81.63% <100.00%> (ø)

... and 7 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@orbeckst
Copy link
Member Author

@cadeduckworth please review, in particular the fixes to the automated dihedral tests where I am using sets to make them insensitive to the order. I also ensured ordering of project_dirs.

@orbeckst orbeckst requested a review from cadeduckworth June 27, 2023 23:52
@orbeckst
Copy link
Member Author

The dihedral tests now all pass. I am not sure why the other tests

FAILED mdpow/tests/test_analysis.py::TestAnalyze::test_convert_edr - AssertionError: 
Arrays are not almost equal to 5 decimals

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 0.00028187
Max relative difference: 0.00012179
 x: array([-3.72175,  2.31415])
 y: array([-3.72175,  2.31443])
FAILED mdpow/tests/test_analysis.py::TestAnalyze::test_TI - AssertionError: 
Arrays are not almost equal to 5 decimals

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 0.00028187
Max relative difference: 0.00012179
 x: array([-3.72175,  2.31415])
 y: array([-3.72175,  2.31443])

fail.

The differences are small-ish... not sure what changed?!

@cadeduckworth
Copy link
Contributor

cadeduckworth commented Jun 28, 2023 via email

@@ -22,7 +19,7 @@

import py.path

from ..workflows import dihedrals
from mdpow.workflows import dihedrals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orbeckst

Did the CI give you any trouble using this import method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but the tests should use an installed version of the package instead of the files that sit somewhere next to the tests.

@cadeduckworth
Copy link
Contributor

@orbeckst in response to

How can I tell where the starting code for this PR is branched from before changes?

After checking #243, it seems the same failed tests are showing up with commit bf6884abf1ec81ed705fb578e601d1501519eb57, which merged Develop into that branch.

If that is the case, then those two failing tests are not related to changes you made in this PR.

@cadeduckworth
Copy link
Contributor

Continuation of previous comment:

Develop was last merged into #243 on April 3 with commit 5d8f9601671c4ff50305025d1091801df75d949d and the CI testing report does not indicate that these two new failed tests existed with that merge.

Recent PR History (changes merged into Develop between then and now): #242 was closed on March 27 and #229 was closed on April 3, along with their respectively linked issues.

This PR, #249 was started (3 hours ago) after the aforementioned merge and failed tests showed up in #243 (4 hours ago), which supports the idea that any changes made in this PR are not likely responsible for the new failed tests.

My best guess is that this is a new version issue that we will need to sort out. I am going to try to use diff to sort this out and I will post what I find here.

@cadeduckworth
Copy link
Contributor

diff.txt

possible notable differences:

  • sqlalchemy
  • scipy
  • python
  • pytest

@cadeduckworth
Copy link
Contributor

A smarter check with the environments from this PR:

PR_249_ubuntu_latest_py3.8_fail_py3.9_pass_env_diffs.txt

@cadeduckworth
Copy link
Contributor

The set() method might not work well when testing if the atom indices, bond indices, and dihedral groups are correctly paired for highlighting and labeling plots because we cannot test all possible cases, and some changes in ordering might result in incorrect pairing. The current results might still be correct, for now, but having a valid test will be important for any future RDKit changes.

See failed tests in CI report for test_ab_pairs in #243

@orbeckst
Copy link
Member Author

A smarter check with the environments from this PR:

PR_249_ubuntu_latest_py3.8_fail_py3.9_pass_env_diffs.txt

What do you mean by "smarter check"?

@orbeckst
Copy link
Member Author

I think the main problem is that we are assuming that RDKIT always returns indices in one specific order. If we want this then we should sort the indices ourselves.

If we don't want to do sorting then you could rewrite the tests so that they simply check all combinations that we have seen so far and if one matches, it's ok.

@orbeckst
Copy link
Member Author

I see, https://github.com/Becksteinlab/MDPOW/actions/runs/5395344961/jobs/9797708880?pr=243#step:9:4924 looks more different than just reordering.

Which one is correct?

FAILED mdpow/tests/test_automated_dihedral_analysis.py::TestAutomatedDihedralAnalysis::test_ab_pairs - AssertionError: assert {'C10-C5-S4-O...22, 26)), ...} == {'C10-C5-S4-O...21, 26)), ...}
  Differing items:
  {'C2-N3-S4-O12': ((1, 2, 3, 11), (1, 3, 7))} != {'C2-N3-S4-O12': ((1, 2, 3, 11), (4, 2, 0))}
  {'C13-C14-C15-C16': ((12, 13, 14, 15), (19, 22, 25))} != {'C13-C14-C15-C16': ((12, 13, 14, 15), (14, 21, 25))}
  {'C2-C13-C14-C15': ((1, 12, 13, 14), (2, 19, 22))} != {'C2-C13-C14-C15': ((1, 12, 13, 14), (9, 14, 21))}
  {'O1-C2-N3-S4': ((0, 1, 2, 3), (0, 1, 3))} != {'O1-C2-N3-S4': ((0, 1, 2, 3), (8, 4, 2))}
  {'N3-S4-C5-C10': ((2, 3, 4, 9), (3, 5, 9))} != {'N3-S4-C5-C10': ((2, 3, 4, 9), (2, 3, 7))}
  {'N3-C2-C13-C14': ((2, 1, 12, 13), (1, 2, 19))} != {'N3-C2-C13-C14': ((2, 1, 12, 13), (4, 9, 14))}
  {'C6-C5-S4-O11': ((5, 4, 3, 10), (8, 5, 6))} != {'C6-C5-S4-O11': ((5, 4, 3, 10), (6, 3, 1))}
  {'C10-C5-S4-O11': ((9, 4, 3, 10), (9, 5, 6))} != {'C10-C5-S4-O11': ((9, 4, 3, 10), (7, 3, 1))}
  {'N3-S4-C5-C6': ((2, 3, 4, 5), (3, 5, 8))} != {'N3-S4-C5-C6': ((2, 3, 4, 5), (2, 3, 6))}
  {'S4-N3-C2-C13': ((3, 2, 1, 12), (3, 1, 2))} != {'S4-N3-C2-C13': ((3, 2, 1, 12), (2, 4, 9))}
  {'C2-N3-S4-C5': ((1, 2, 3, 4), (1, 3, 5))} != {'C2-N3-S4-C5': ((1, 2, 3, 4), (4, 2, 3))}
  {'C6-C5-S4-O12': ((5, 4, 3, 11), (8, 5, 7))} != {'C6-C5-S4-O12': ((5, 4, 3, 11), (6, 3, 0))}
  {'O1-C2-C13-C14': ((0, 1, 12, 13), (0, 2, 19))} != {'O1-C2-C13-C14': ((0, 1, 12, 13), (8, 9, 14))}
  {'C10-C5-S4-O12': ((9, 4, 3, 11), (9, 5, 7))} != {'C10-C5-S4-O12': ((9, 4, 3, 11), (7, 3, 0))}
  {'C2-N3-S4-O11': ((1, 2, 3, 10), (1, 3, 6))} != {'C2-N3-S4-O11': ((1, 2, 3, 10), (4, 2, 1))}
  {'C13-C14-C15-C20': ((12, 13, 14, 19), (19, 22, 26))} != {'C13-C14-C15-C20': ((12, 13, 14, 19), (14, 21, 26))}
  Full diff:
    {
  -  'C10-C5-S4-O11': ((9, 4, 3, 10), (7, 3, 1)),
  ?                                    ^  ^  ^
  +  'C10-C5-S4-O11': ((9, 4, 3, 10), (9, 5, 6)),
  ?                                    ^  ^  ^
  -  'C10-C5-S4-O12': ((9, 4, 3, 11), (7, 3, 0)),
  ?                                     ------
  +  'C10-C5-S4-O12': ((9, 4, 3, 11), (9, 5, 7)),
  ?                                    ++++++
  -  'C13-C14-C15-C16': ((12, 13, 14, 15), (14, 21, 25)),
  ?                                          ^   ^
  +  'C13-C14-C15-C16': ((12, 13, 14, 15), (19, 22, 25)),
  ?                                          ^   ^
  -  'C13-C14-C15-C20': ((12, 13, 14, 19), (14, 21, 26)),
  ?                                          ^   ^
  +  'C13-C14-C15-C20': ((12, 13, 14, 19), (19, 22, 26)),
  ?                                          ^   ^
  -  'C2-C13-C14-C15': ((1, 12, 13, 14), (9, 14, 21)),
  ?                                          ---- ^
  +  'C2-C13-C14-C15': ((1, 12, 13, 14), (2, 19, 22)),
  ?                                       ++++    ^
  -  'C2-N3-S4-C5': ((1, 2, 3, 4), (4, 2, 3)),
  ?                                 ^  ^  ^
  +  'C2-N3-S4-C5': ((1, 2, 3, 4), (1, 3, 5)),
  ?                                 ^  ^  ^
  -  'C2-N3-S4-O11': ((1, 2, 3, 10), (4, 2, 1)),
  ?                                   ^  ^  ^
  +  'C2-N3-S4-O11': ((1, 2, 3, 10), (1, 3, 6)),
  ?                                   ^  ^  ^
  -  'C2-N3-S4-O12': ((1, 2, 3, 11), (4, 2, 0)),
  ?                                   ^  ^  ^
  +  'C2-N3-S4-O12': ((1, 2, 3, 11), (1, 3, 7)),
  ?                                   ^  ^  ^
  -  'C6-C5-S4-O11': ((5, 4, 3, 10), (6, 3, 1)),
  ?                                    ------
  +  'C6-C5-S4-O11': ((5, 4, 3, 10), (8, 5, 6)),
  ?                                   ++++++
  -  'C6-C5-S4-O12': ((5, 4, 3, 11), (6, 3, 0)),
  ?                                   ^  ^  ^
  +  'C6-C5-S4-O12': ((5, 4, 3, 11), (8, 5, 7)),
  ?                                   ^  ^  ^
  -  'N3-C2-C13-C14': ((2, 1, 12, 13), (4, 9, 14)),
  ?                                     ^   ----
  +  'N3-C2-C13-C14': ((2, 1, 12, 13), (1, 2, 19)),
  ?                                     ^  ++++
  -  'N3-S4-C5-C10': ((2, 3, 4, 9), (2, 3, 7)),
  ?                                  ---   ^
  +  'N3-S4-C5-C10': ((2, 3, 4, 9), (3, 5, 9)),
  ?                                     ^^^^
  -  'N3-S4-C5-C6': ((2, 3, 4, 5), (2, 3, 6)),
  ?                                 ---   ^
  +  'N3-S4-C5-C6': ((2, 3, 4, 5), (3, 5, 8)),
  ?                                    ^^^^
  -  'O1-C2-C13-C14': ((0, 1, 12, 13), (8, 9, 14)),
  ?                                     ^   ----
  +  'O1-C2-C13-C14': ((0, 1, 12, 13), (0, 2, 19)),
  ?                                     ^  ++++
  -  'O1-C2-N3-S4': ((0, 1, 2, 3), (8, 4, 2)),
  ?                                 ^  ^  ^
  +  'O1-C2-N3-S4': ((0, 1, 2, 3), (0, 1, 3)),
  ?                                 ^  ^  ^
  -  'S4-N3-C2-C13': ((3, 2, 1, 12), (2, 4, 9)),
  ?                                    ------
  +  'S4-N3-C2-C13': ((3, 2, 1, 12), (3, 1, 2)),
  ?                                   ++++++
    }

@cadeduckworth
Copy link
Contributor

A smarter check with the environments from this PR:
PR_249_ubuntu_latest_py3.8_fail_py3.9_pass_env_diffs.txt

What do you mean by "smarter check"?

For the first check of differences, I forgot that PR #243 still has a commit that specifies the version of RDKit, which might affect other dependencies. Then I decided it made more sense to check for differences between the most similar cases where the failed tests first occur, in this PR.

@cadeduckworth
Copy link
Contributor

I see, https://github.com/Becksteinlab/MDPOW/actions/runs/5395344961/jobs/9797708880?pr=243#step:9:4924 looks more different than just reordering.

Which one is correct?

FAILED mdpow/tests/test_automated_dihedral_analysis.py::TestAutomatedDihedralAnalysis::test_ab_pairs - AssertionError: assert {'C10-C5-S4-O...22, 26)), ...} == {'C10-C5-S4-O...21, 26)), ...}
  Differing items:
  {'C2-N3-S4-O12': ((1, 2, 3, 11), (1, 3, 7))} != {'C2-N3-S4-O12': ((1, 2, 3, 11), (4, 2, 0))}
  {'C13-C14-C15-C16': ((12, 13, 14, 15), (19, 22, 25))} != {'C13-C14-C15-C16': ((12, 13, 14, 15), (14, 21, 25))}
  {'C2-C13-C14-C15': ((1, 12, 13, 14), (2, 19, 22))} != {'C2-C13-C14-C15': ((1, 12, 13, 14), (9, 14, 21))}
  {'O1-C2-N3-S4': ((0, 1, 2, 3), (0, 1, 3))} != {'O1-C2-N3-S4': ((0, 1, 2, 3), (8, 4, 2))}
  {'N3-S4-C5-C10': ((2, 3, 4, 9), (3, 5, 9))} != {'N3-S4-C5-C10': ((2, 3, 4, 9), (2, 3, 7))}
  {'N3-C2-C13-C14': ((2, 1, 12, 13), (1, 2, 19))} != {'N3-C2-C13-C14': ((2, 1, 12, 13), (4, 9, 14))}
  {'C6-C5-S4-O11': ((5, 4, 3, 10), (8, 5, 6))} != {'C6-C5-S4-O11': ((5, 4, 3, 10), (6, 3, 1))}
  {'C10-C5-S4-O11': ((9, 4, 3, 10), (9, 5, 6))} != {'C10-C5-S4-O11': ((9, 4, 3, 10), (7, 3, 1))}
  {'N3-S4-C5-C6': ((2, 3, 4, 5), (3, 5, 8))} != {'N3-S4-C5-C6': ((2, 3, 4, 5), (2, 3, 6))}
  {'S4-N3-C2-C13': ((3, 2, 1, 12), (3, 1, 2))} != {'S4-N3-C2-C13': ((3, 2, 1, 12), (2, 4, 9))}
  {'C2-N3-S4-C5': ((1, 2, 3, 4), (1, 3, 5))} != {'C2-N3-S4-C5': ((1, 2, 3, 4), (4, 2, 3))}
  {'C6-C5-S4-O12': ((5, 4, 3, 11), (8, 5, 7))} != {'C6-C5-S4-O12': ((5, 4, 3, 11), (6, 3, 0))}
  {'O1-C2-C13-C14': ((0, 1, 12, 13), (0, 2, 19))} != {'O1-C2-C13-C14': ((0, 1, 12, 13), (8, 9, 14))}
  {'C10-C5-S4-O12': ((9, 4, 3, 11), (9, 5, 7))} != {'C10-C5-S4-O12': ((9, 4, 3, 11), (7, 3, 0))}
  {'C2-N3-S4-O11': ((1, 2, 3, 10), (1, 3, 6))} != {'C2-N3-S4-O11': ((1, 2, 3, 10), (4, 2, 1))}
  {'C13-C14-C15-C20': ((12, 13, 14, 19), (19, 22, 26))} != {'C13-C14-C15-C20': ((12, 13, 14, 19), (14, 21, 26))}
  Full diff:
    {
  -  'C10-C5-S4-O11': ((9, 4, 3, 10), (7, 3, 1)),
  ?                                    ^  ^  ^
  +  'C10-C5-S4-O11': ((9, 4, 3, 10), (9, 5, 6)),
  ?                                    ^  ^  ^
  -  'C10-C5-S4-O12': ((9, 4, 3, 11), (7, 3, 0)),
  ?                                     ------
  +  'C10-C5-S4-O12': ((9, 4, 3, 11), (9, 5, 7)),
  ?                                    ++++++
  -  'C13-C14-C15-C16': ((12, 13, 14, 15), (14, 21, 25)),
  ?                                          ^   ^
  +  'C13-C14-C15-C16': ((12, 13, 14, 15), (19, 22, 25)),
  ?                                          ^   ^
  -  'C13-C14-C15-C20': ((12, 13, 14, 19), (14, 21, 26)),
  ?                                          ^   ^
  +  'C13-C14-C15-C20': ((12, 13, 14, 19), (19, 22, 26)),
  ?                                          ^   ^
  -  'C2-C13-C14-C15': ((1, 12, 13, 14), (9, 14, 21)),
  ?                                          ---- ^
  +  'C2-C13-C14-C15': ((1, 12, 13, 14), (2, 19, 22)),
  ?                                       ++++    ^
  -  'C2-N3-S4-C5': ((1, 2, 3, 4), (4, 2, 3)),
  ?                                 ^  ^  ^
  +  'C2-N3-S4-C5': ((1, 2, 3, 4), (1, 3, 5)),
  ?                                 ^  ^  ^
  -  'C2-N3-S4-O11': ((1, 2, 3, 10), (4, 2, 1)),
  ?                                   ^  ^  ^
  +  'C2-N3-S4-O11': ((1, 2, 3, 10), (1, 3, 6)),
  ?                                   ^  ^  ^
  -  'C2-N3-S4-O12': ((1, 2, 3, 11), (4, 2, 0)),
  ?                                   ^  ^  ^
  +  'C2-N3-S4-O12': ((1, 2, 3, 11), (1, 3, 7)),
  ?                                   ^  ^  ^
  -  'C6-C5-S4-O11': ((5, 4, 3, 10), (6, 3, 1)),
  ?                                    ------
  +  'C6-C5-S4-O11': ((5, 4, 3, 10), (8, 5, 6)),
  ?                                   ++++++
  -  'C6-C5-S4-O12': ((5, 4, 3, 11), (6, 3, 0)),
  ?                                   ^  ^  ^
  +  'C6-C5-S4-O12': ((5, 4, 3, 11), (8, 5, 7)),
  ?                                   ^  ^  ^
  -  'N3-C2-C13-C14': ((2, 1, 12, 13), (4, 9, 14)),
  ?                                     ^   ----
  +  'N3-C2-C13-C14': ((2, 1, 12, 13), (1, 2, 19)),
  ?                                     ^  ++++
  -  'N3-S4-C5-C10': ((2, 3, 4, 9), (2, 3, 7)),
  ?                                  ---   ^
  +  'N3-S4-C5-C10': ((2, 3, 4, 9), (3, 5, 9)),
  ?                                     ^^^^
  -  'N3-S4-C5-C6': ((2, 3, 4, 5), (2, 3, 6)),
  ?                                 ---   ^
  +  'N3-S4-C5-C6': ((2, 3, 4, 5), (3, 5, 8)),
  ?                                    ^^^^
  -  'O1-C2-C13-C14': ((0, 1, 12, 13), (8, 9, 14)),
  ?                                     ^   ----
  +  'O1-C2-C13-C14': ((0, 1, 12, 13), (0, 2, 19)),
  ?                                     ^  ++++
  -  'O1-C2-N3-S4': ((0, 1, 2, 3), (8, 4, 2)),
  ?                                 ^  ^  ^
  +  'O1-C2-N3-S4': ((0, 1, 2, 3), (0, 1, 3)),
  ?                                 ^  ^  ^
  -  'S4-N3-C2-C13': ((3, 2, 1, 12), (2, 4, 9)),
  ?                                    ------
  +  'S4-N3-C2-C13': ((3, 2, 1, 12), (3, 1, 2)),
  ?                                   ++++++
    }

Hey Oliver,

Whenever I generate the comparison values for tests I usually do it in a notebook to have a record of how those values were obtained. if I have time I will try to review this one before our meeting tomorrow.

It is my understanding that the order the Mol object is 'constructed' has changed. For some of the more symmetrical molecules this might appear to be a reordering if there are similar dihedral atom groups, when it is actually a new indexing scheme. For example (hypothetical), C1-C2-C3-O4 and O12-C1-C2-C3 could give incorrect or misleading results if O4 and O12 are indexed in the opposite direction. I think I remember reading it is based on the types of bonds, which determine the direction in which the Mol object is constructed and indexed.

@orbeckst
Copy link
Member Author

I created two local envs on osX for 3.8 and 3.10 and

pytest -v mdpow/tests/test_analysis.py -k test_TI

passes in both envs.

@orbeckst
Copy link
Member Author

Hi @cadeduckworth, from #249 (comment)

It is my understanding that the order the Mol object is 'constructed' has changed.

Are you saying that RDKIT mol object may contain atoms in a different order from the MDAnalysis Universe? That would be bad because we are relying on being able to do

mol = u.atoms.convert_to("RDKIT")
atom_indices = mol.GetSubstructMatches(pattern)
ag = u.atoms[atom_indices[0]]

@cadeduckworth
Copy link
Contributor

cadeduckworth commented Jun 29, 2023 via email

@orbeckst orbeckst force-pushed the drop-37 branch 2 times, most recently from d6ed490 to 62d7fa5 Compare June 30, 2023 01:21
orbeckst added 6 commits June 29, 2023 18:24
- use set comparisons
- removed skipif for python < 3.8 and replaced with comment referencing
  issue #239
- On CI, TestAnalyze.test_TI failed for Python > 3.8 and the computed values for the
  free energy error estimate differed from reference values. The free energies themselves
  remained the same. These error estimates are computed in a complicated manner from
  error propagation through simpsons rule via numkit/scipy. It is possible that there
  are subtle changes. However, this is not very important functionality anymore because
  we use alchemlyb.
- Reduced comparison precision to decimals=3 to make the tests pass robustly.
- Locally on macOS, @orbeckst could not reproduce the different values (they always
  came out exactly as the original reference, regardless of working in a 3.8 or 3.10
  environment)
- add exit_on_error=True kwarg to make it possible to just raise exceptions or
  return values instead of sys.exit; default True is old behavior
- no real testing
- add new kwarg exit_on_error=True|False to run runMD_or_exit() to only raise
  exceptions instead of calling sys.exit(); default is old behavior (True)
- added tests for runMD_or_exit() failure modes
- use better error handling in runMD_or_exit to check if we triggered the
  non-bonded interactions beyond cutoff issue and if so, pass with xfail
- close #175
@orbeckst orbeckst changed the title drop python 3.7 drop python 3.7 and testing improvements Jun 30, 2023
- close #254
- renamed setup -> setup_method and teardown -> teardown_method but ultimately these tests should
  be rewritten with fixtures
- fixed relative imports from package in tests: mdpow imports should be from the installed package
@orbeckst orbeckst merged commit 172b354 into develop Jun 30, 2023
@orbeckst orbeckst deleted the drop-37 branch June 30, 2023 02:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants