rewrite the mace equivariant head to avoid a model explosion #52

sblackburn86 · 2024-06-03T19:26:33Z

original model:
time embedding + mace output ->
o3.Linear -> o3.BatchNorm -> o3.TensorSquare -> non-linearity
The TensorSquare had a lot of weights and was probably overkill for our purpose. Its goal was to mix the time information across the different MACE channels, but it also computed many products that we do not need.

I replaced it with the following:
FullyConnectedTensorProduct(time embedding, mace output) ->
o3.Linear -> o3.BatchNorm -> non-linearity
The FCTP mixes the time embedding with all the components of the MACE output, and we can control the dimensionality of its output, so the TensorSquare is no longer needed.
We could probably simplify even further by allowing only the 0e channel as the output of the FCTP, since the next layers should not mix different ells.
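
To make the size argument concrete, here is a rough pure-Python sketch that counts the weights of a fully connected tensor product. It assumes the usual "uvw" layout, where every allowed path (in1, in2) -> out carries mul1 * mul2 * mul_out weights, which is how I understand e3nn's FullyConnectedTensorProduct to behave. The irreps sizes below are hypothetical and are not the ones used in this PR, and the helper `fctp_weight_count` is only for illustration.

```python
# Irreps are written as (multiplicity, l, parity), with parity +1 for "e" and -1 for "o".
# All sizes below are hypothetical; they are not taken from the PR.


def fctp_weight_count(irreps_in1, irreps_in2, irreps_out):
    """Count weights, assuming one weight per (u, v, w) triple on every allowed path."""
    total = 0
    for mul1, l1, p1 in irreps_in1:
        for mul2, l2, p2 in irreps_in2:
            for mul_out, l_out, p_out in irreps_out:
                # Selection rules: triangle inequality on l, parities multiply.
                allowed = abs(l1 - l2) <= l_out <= l1 + l2 and p_out == p1 * p2
                if allowed:
                    total += mul1 * mul2 * mul_out
    return total


time_embedding = [(8, 0, 1)]               # 8x0e, scalars only
mace_output = [(16, 0, 1), (16, 1, -1)]    # 16x0e + 16x1o
hidden = [(8, 0, 1), (8, 1, -1)]           # 8x0e + 8x1o, chosen freely
scalars_only = [(8, 0, 1)]                 # 8x0e, the "0e only" variant

full_count = fctp_weight_count(time_embedding, mace_output, hidden)
scalar_count = fctp_weight_count(time_embedding, mace_output, scalars_only)
print(full_count, scalar_count)
```

With a scalar time embedding, each output irrep is only fed by the MACE channels of the same l and parity, so the weight count scales with the output multiplicities we pick. Restricting the output to 0e drops every path that starts from the non-scalar MACE channels.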

rousseab

LGTM!

sblackburn-mila added 2 commits June 3, 2024 15:19

rewrite the mace equivariant head to avoid a model explosion

c560292

whiteline and useless variable deleted

3f55d17

sblackburn86 requested a review from rousseab June 3, 2024 19:39

rousseab approved these changes Jun 4, 2024

sblackburn86 merged commit eab8b54 into main Jun 5, 2024
1 check passed

sblackburn86 deleted the mace_equivariant_head_with_tensorproduct branch June 5, 2024 12:41
