Rework Combination class #26

kaminow · 2023-08-28T16:01:47Z

The previous iteration of the Combination classes required the computation graph for every pose to be held in GPU memory at once, which quickly exhausts the memory of a typical GPU when using all-atom poses. The new version splits the gradient calculation so that the gradient for each pose is computed separately and the per-pose gradients are combined appropriately at the end, meaning that each computation graph can be freed from memory after use. The derivation of the math used in the different Combination subclasses can be found in the README_COMBINATION.md file.
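As a rough illustration of the idea (not the code in this PR), the sketch below treats the simplest case, a mean over poses, using a toy scalar model in plain Python. The names `pose_prediction`, `pose_gradient`, `mean_split`, and `mean_joint` are hypothetical, and the `tanh` model is only a stand-in for the real network. Because the gradient of a mean is the mean of the per-pose gradients, each pose can be processed on its own and only running sums need to be kept.

```python
import math

# Hypothetical toy "model": one scalar parameter theta, one scalar feature per pose.
def pose_prediction(theta, x):
    return math.tanh(theta * x)

def pose_gradient(theta, x):
    # Analytic derivative of tanh(theta * x) with respect to theta.
    return x * (1.0 - math.tanh(theta * x) ** 2)

def mean_split(theta, poses):
    """Process one pose at a time, keeping only running sums."""
    pred_sum = 0.0
    grad_sum = 0.0
    for x in poses:
        pred_sum += pose_prediction(theta, x)
        grad_sum += pose_gradient(theta, x)
        # Nothing specific to this pose has to be kept past this point.
    n = len(poses)
    return pred_sum / n, grad_sum / n

def mean_joint(theta, poses):
    """Reference: mean prediction computed over all poses at once."""
    return sum(pose_prediction(theta, x) for x in poses) / len(poses)

theta = 0.7
poses = [0.3, -1.2, 2.5]
pred, grad = mean_split(theta, poses)

# Central finite difference of the joint mean, used only as a sanity check.
eps = 1e-6
fd_grad = (mean_joint(theta + eps, poses) - mean_joint(theta - eps, poses)) / (2 * eps)
print(pred, grad, fd_grad)
```

The split gradient should agree with the finite-difference estimate to within numerical error, which is the same kind of check the Combination tests discussed later in this thread rely on.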

General list of changes for each file:

README_COMBINATION.md
Math for separating out the gradients in the Combination classes

mtenn/combination.py
Each method for combining predictions has a torch.autograd.Function, which takes care of combining and assigning the gradients in the backward pass, and a Combination subclass that is essentially a wrapper around the Function

mtenn/conversion_utils/e3nn.py

Update import statements
Add ComplexOnlyStrategy

mtenn/conversion_utils/schnet.py

Update import statements
Add ComplexOnlyStrategy

mtenn/model.py

Move all non-Model classes to their own files
Update GroupedModel forward pass to work with new Combination setup
GroupedModel now returns list of predictions for each pose in addition to the final prediction

mtenn/readout.py
Move all Readout-related code

mtenn/representation.py
Move all Representation-related code

mtenn/Strategy.py

Move all Strategy-related code
Add ComplexOnlyStrategy class that only predicts on the full input
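
The Function/wrapper split described for mtenn/combination.py above can be pictured with a plain-Python stand-in. This is only a sketch under stated assumptions: the class name `LogSumExpCombinationSketch` is hypothetical, it does not use torch, and the log-sum-exp combination is chosen purely for illustration rather than taken from this PR. The forward step combines the per-pose predictions and stores one weight per pose (the derivative of the combined value with respect to that pose's prediction); the backward step uses those weights to combine per-pose parameter gradients that were computed one pose at a time.

```python
import math

class LogSumExpCombinationSketch:
    """Plain-Python stand-in for a custom forward/backward pair (hypothetical)."""

    def forward(self, preds, pose_grads):
        # preds: per-pose scalar predictions p_i
        # pose_grads: per-pose gradients dp_i/dtheta, one list of floats per pose
        m = max(preds)  # subtract the max for numerical stability
        exps = [math.exp(p - m) for p in preds]
        total = sum(exps)
        # d(logsumexp)/dp_i is softmax(p)_i; store the weights for backward.
        self.weights = [e / total for e in exps]
        self.pose_grads = pose_grads
        return m + math.log(total)

    def backward(self, grad_output):
        # Chain rule: upstream gradient times the weighted sum of per-pose gradients.
        n_params = len(self.pose_grads[0])
        return [
            grad_output * sum(w * g[k] for w, g in zip(self.weights, self.pose_grads))
            for k in range(n_params)
        ]

preds = [1.0, 2.0, 0.5]
pose_grads = [[0.1, -0.2], [0.3, 0.0], [-0.5, 0.4]]

comb = LogSumExpCombinationSketch()
combined = comb.forward(preds, pose_grads)
param_grad = comb.backward(grad_output=1.0)
print(combined, param_grad)
```

Only the per-pose predictions, the already-computed per-pose gradients, and the weights are needed at combination time, so no per-pose computation graph has to stay alive.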


hmacdope

A few small things, but looking really good. Addressing the failing test(s) is the main issue, but it's good that we are putting them in and making progress towards checking correctness.

mtenn/combination.py

kaminow · 2023-10-18T15:00:24Z

after playing around with the tests a bit, it seems like it's just a stochastic failure based on how the random data is initialized. two workarounds that I can think of are:

find a random seed that lets all the tests through as is, and trust that if the math gets messed up at some point then the tests will fail
adjust the parameters to the np.allclose call to be more lenient
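
To make the trade-off between these two options concrete, here is a stdlib-only sketch. The `allclose` helper is a hypothetical re-implementation of the element-wise rule documented for np.allclose, |a - b| <= atol + rtol * |b| with defaults rtol=1e-5 and atol=1e-8, and `noisy_pair` is a made-up stand-in for the reference and test values produced from random data. Nothing here is taken from the actual test suite.

```python
import random

def allclose(a, b, rtol=1e-5, atol=1e-8):
    """Element-wise |a - b| <= atol + rtol * |b| (the rule documented for np.allclose)."""
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))

def noisy_pair(seed, n=5, noise=1e-6):
    """Hypothetical stand-in for reference vs. test values built from random data."""
    rng = random.Random(seed)  # option 1: a fixed seed makes the data reproducible
    ref = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    test = [r + rng.uniform(-noise, noise) for r in ref]
    return ref, test

ref, test = noisy_pair(seed=0)

# Option 2: loosen the absolute tolerance so small numerical noise is accepted.
strict = allclose(test, ref)
lenient = allclose(test, ref, atol=1e-5)
print(strict, lenient)
```

Fixing the seed keeps the check strict but ties the test to one particular draw of the data; loosening the tolerance accepts any draw but can hide small errors in the math.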

@hmacdope do you have thoughts as to which would be better/preferable?

codecov-commenter · 2023-10-18T15:09:42Z

Codecov Report

Merging #26 (58c43f2) into main (54c94b0) will increase coverage by 31.92%.
The diff coverage is 84.01%.

Additional details and impacted files

hmacdope

Looks good @kaminow, merge when ready.

Let's hold off on a release until we sort out CI and env build issues.

hmacdope · 2023-10-20T00:16:45Z

@kaminow I have fixed a missing ase (?) dep in tests and updated some env files.

kaminow · 2023-10-23T13:07:49Z

@hmacdope thanks! any thoughts on why things are still failing for Ubuntu 3.11?

hmacdope · 2023-10-23T20:40:42Z

I will investigate, seems odd.

kaminow · 2023-10-31T20:22:25Z

@hmacdope after some investigation, it seems that some requirements are broken for the Python 3.11 build of pytorch_geometric. it seems to require cuda for some reason, while the builds for older Python versions don't, so for Python 3.11 an older version of pytorch_geometric is being installed, from before the interaction_graph was added to the model

kaminow · 2023-10-31T20:23:27Z

for posterity, this is the error I get when I try to run mamba install pytorch_geometric=2.3.1 in a Python 3.11 env:

warning  libmamba Added empty dependency for problem type SOLVER_RULE_UPDATE
Could not solve for environment specs
The following package could not be installed
└─ pytorch_geometric 2.3.1**  is installable and it requires
   └─ pyg-lib 0.2.0  with the potential options
      ├─ pyg-lib 0.2.0 would require
      │  └─ triton with the potential options
      │     ├─ triton [1.1.2|2.0.0] would require
      │     │  └─ python >=3.10,<3.11.0a0 , which can be installed;
      │     ├─ triton 1.1.2 would require
      │     │  └─ python >=3.7,<3.8.0a0 , which can be installed;
      │     ├─ triton [1.1.2|2.0.0] would require
      │     │  └─ python >=3.8,<3.9.0a0 , which can be installed;
      │     ├─ triton [1.1.2|2.0.0] would require
      │     │  └─ python >=3.9,<3.10.0a0 , which can be installed;
      │     └─ triton 2.0.0 would require
      │        └─ pytorch * cuda* with the potential options
      │           ├─ pytorch [1.0.1|1.1.0|...|1.9.1] would require
      │           │  └─ python >=3.7,<3.8.0a0 , which can be installed;
      │           ├─ pytorch [1.10.0|1.10.1|...|1.9.1] would require
      │           │  └─ python >=3.8,<3.9.0a0 , which can be installed;
      │           ├─ pytorch [1.10.0|1.10.1|...|1.9.1] would require
      │           │  └─ python >=3.9,<3.10.0a0 , which can be installed;
      │           ├─ pytorch 1.11.0 would require
      │           │  └─ python >=3.10,<3.11.0a0 , which can be installed;
      │           ├─ pytorch [1.11.0|1.12.0|...|2.0.0] would require
      │           │  └─ __cuda, which is missing on the system;
      │           ├─ pytorch [1.0.1|1.1.0|...|1.9.1] would require
      │           │  └─ python >=3.6,<3.7.0a0 , which can be installed;
      │           ├─ pytorch [1.0.1|1.1.0|1.2.0|1.3.1] would require
      │           │  └─ python >=2.7,<2.8.0a0 , which can be installed;
      │           └─ pytorch 1.0.1 would require
      │              └─ cudatoolkit >=8.0,<8.1.0a0 , which does not exist (perhaps a missing channel);
      ├─ pyg-lib 0.2.0 would require
      │  └─ python >=3.10,<3.11.0a0 , which can be installed;
      ├─ pyg-lib 0.2.0 would require
      │  └─ python >=3.8,<3.9.0a0 , which can be installed;
      └─ pyg-lib 0.2.0 would require
         └─ python >=3.9,<3.10.0a0 , which can be installed.

hmacdope · 2023-10-31T23:32:46Z

@kaminow let me take a quick look on their feedstock.

hmacdope · 2023-10-31T23:36:28Z

Pinging @mikemhenry as well, as I see he is a maintainer on the PyG feedstock.

hmacdope · 2023-10-31T23:43:28Z

We can try a pin also in the meantime.

hmacdope · 2023-11-01T00:04:49Z

I am fairly sure this is due to the exact pin of pyg-lib==0.2.0 in the pyg feedstock which is pulling down old pytorch versions. Tagging @hadim and @rusty1s? Perhaps they have some insight also. I will also confirm when on my linux box. Regardless, I think we are OK to push forward here and leave CI as indicating a failure.

hmacdope

LGTM

kaminow and others added 25 commits August 21, 2023 16:58

Start README file for Combination class.

76d3a65

Finish derivation for MeanCombination.

c878125

Figuring out GitHub markdown

7467b2e

Math for MaxCombination.

06b8983

Add math for BoltzmannCombination.

b88c378

Typo.

cc58766

Begin code rework for new Combination format for MeanCombination.

c121d06

Reset pred and grad trackers after calling predict.

a4cc9ac

Move some common stuff to Combination base class.

e1da2d6

Start rework for MaxCombination.

589d95e

Update some equations for MaxCombination.

84d2de1

Finish code for MaxCombination.

794ba26

Storing weights instead of gradients.

da39e31

Detach grads for safety.

bfc0e45

Fix shapes.

82c327f

Adjust math for numeric stability.

33f0c9a

Update MaxCombination equations to reflect new code.

21230d0

Add handling for model in eval mode.

57cebc5

Rearrange math to avoid taking log of zero/negative gradients.

25d00b4

Update README_COMBINATION math to match MaxCombination.

b993837

Add math for BoltzmannCombination.

77ddd4b

Adjust some math.

fef9d25

Update BoltzmannCombination math.

06354f3

Merge branch 'main' into split-comb-calcs

fe06032

Add str and repr for MaxCombination.

3953ccf

kaminow mentioned this pull request Sep 5, 2023

Update multi-pose training for new version of mtenn asapdiscovery/asapdiscovery#462

Merged

kaminow added 4 commits September 28, 2023 16:38

Rename GroupedModel readout variable to avoid clobbering the readout for individual predictions.

e5ff23c

Return all individual pose predictions in addition to combined prediction.

76b985d

Split up the different classes into different files.

dd72949

Clean up imports.

f79ffd4

kaminow added 2 commits October 13, 2023 14:59

Add ComplexOnlyStrategy to e3nn model.

d4432b6

Fix docstring.

195d6d5

kaminow marked this pull request as ready for review October 13, 2023 19:56

hmacdope self-requested a review October 16, 2023 21:32

kaminow added 3 commits October 17, 2023 10:11

Update Combination math README.

c3d643d

Add tests for Combination modules.

c482cab

Homogenize Model forward pass return signature with GroupedModel.

08903e3

hmacdope requested changes Oct 18, 2023

Add abc inheritance/decorators for proper polymorphism.

2288017

Make test model smaller and adjust np.allclose tolerances.

ea86511

kaminow requested a review from hmacdope October 18, 2023 21:07

hmacdope approved these changes Oct 18, 2023

kaminow and others added 2 commits October 19, 2023 14:39

Merge branch 'main' into split-comb-calcs

1241b91

update testing deps and envfiles

f228b34

Update test_env.yaml

7807cbb

Revert pin

58c43f2

hmacdope approved these changes Nov 1, 2023

kaminow merged commit 6f6d8e8 into main Nov 1, 2023
3 of 4 checks passed

kaminow deleted the split-comb-calcs branch November 1, 2023 15:29
