
adding support for multimers via residue_index hack #25

Open · wants to merge 3 commits into base: main
Conversation

sokrypton

No description provided.

@sokrypton (Author) commented Aug 14, 2022

I'm not sure if I implemented the residue-index offset for RoPE() correctly.

Setting the residue-index offset for the relative positional embedding works, but if you touch RoPE, it crashes!

Here is what I tried for RoPE():

# default: contiguous positions over the full (concatenated) sequence
position = torch.arange(total_length).cuda()

# custom residue index (per-chain numbering with a 200 offset between chains)
position = residue_index

# repeating residue index (each subunit reuses positions 0..subunit_length-1)
position = torch.arange(subunit_length).repeat(subunits).cuda()

[attached screenshot]
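
For reference, the three variants above could be built like this for a hypothetical two-chain homo-dimer. The sizes are illustrative, and the 200-residue gap construction is just one common way to do it, not necessarily what this branch does:

import torch

# illustrative sizes for a homo-dimer (two identical chains)
subunits = 2
subunit_length = 59            # length of one chain (illustrative)
total_length = subunits * subunit_length

# default: contiguous positions over the concatenated sequence
position_default = torch.arange(total_length)

# custom residue index: per-chain numbering with a 200-residue gap between chains
residue_index = torch.cat([
    torch.arange(subunit_length) + i * (subunit_length + 200)
    for i in range(subunits)
])

# repeating residue index: every chain reuses positions 0..subunit_length-1
position_repeat = torch.arange(subunit_length).repeat(subunits)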

@sokrypton (Author)

# constant offset applied to every position
position = 500 + torch.arange(total_length)

Also works!
So it's not an issue of absolute position... but it seems you need to preserve the relative encoding for OmegaFold to work.
[attached screenshot]
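
This is consistent with rotary embeddings in general: the attention logit between positions i and j depends only on i - j, so adding the same constant to every position leaves the scores unchanged, while gaps or repeats do change the relative offsets the model sees. A minimal generic RoPE sketch (not OmegaFold's implementation) that checks the constant-offset case:

import torch

def rope(x, position, base=10000.0):
    # rotate each 2D feature pair of x by a position-dependent angle
    dim = x.shape[-1]
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    angles = position[:, None].float() * inv_freq[None, :]   # (length, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

torch.manual_seed(0)
L, D = 8, 16
q, k = torch.randn(L, D), torch.randn(L, D)
pos = torch.arange(L)

logits_a = rope(q, pos) @ rope(k, pos).T
logits_b = rope(q, pos + 500) @ rope(k, pos + 500).T  # constant offset
print(torch.allclose(logits_a, logits_b, atol=1e-3))  # True: scores depend only on i - j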

I'm beginning to suspect that the reason why homo-oligomers work is that there are "fused" examples in the language model training set.

@RuiWang1998 (Contributor)

Hi @sokrypton,

Thanks for this info, we'll look into it.

Best

@RuiWang1998 (Contributor)

Hi Dr. @sokrypton,

Would you care to give us an example input for this PR?

@sokrypton (Author) commented Aug 15, 2022

For a hetero-dimer:
https://www.rcsb.org/structure/7M5F

>H1065
AKNSLTTKSLFKEMTIQGIKFTPENVVGAAKDNSGKIIFLEKGNSKSGLQHIVEEHGDQFAQIGVSEARIPDVVMKAVTDGKIVGYQGAGAGRPIYETMIDGKKYNIAVTVGSNGYVVGANLRGSVK:MKEIKLMADYHCYPLWGTTPDDFGDISPDELPISLGLKNSLEAWAKRYDAILNTDDPALSGFKSVEEEKLFIDDGYKLAELLQEELGSAYKVIYHADY

For a homo-oligomer:

>tmp
PIAQIHILEGRSDEQKETLIREVSEAISRSLDAPLTSVRVIITEMAKGHFGIGGELASK:PIAQIHILEGRSDEQKETLIREVSEAISRSLDAPLTSVRVIITEMAKGHFGIGGELASK
>tmp
PIAQIHILEGRSDEQKETLIREVSEAISRSLDAPLTSVRVIITEMAKGHFGIGGELASK:PIAQIHILEGRSDEQKETLIREVSEAISRSLDAPLTSVRVIITEMAKGHFGIGGELASK:PIAQIHILEGRSDEQKETLIREVSEAISRSLDAPLTSVRVIITEMAKGHFGIGGELASK:PIAQIHILEGRSDEQKETLIREVSEAISRSLDAPLTSVRVIITEMAKGHFGIGGELASK

@RuiWang1998 (Contributor)

Hi Dr. @sokrypton,

We have reproduced your results and agree that it is really peculiar. We are going to take a deeper look into the reason for this.

As for the training set of the language model: we take the sequences directly from UniRef50.

@sokrypton (Author) commented Aug 17, 2022

I guess, since the LM was trained on single chains, there is no reason to expect it to generalize to proteins it hasn't seen before, especially protein multimers. I suspect it sometimes works for multimers when the multi-chain protein looks like a multi-domain protein in the LM training set.

@RuiWang1998 (Contributor)

Hi,

We have been looking into this issue and found one problem with RoPE that is not necessarily related to it: the encoding is symmetric, which may not be what we want, so we are phasing it out soon.

Still, this problem persists, and we do not really have an idea yet.

@JackMaguire

@sokrypton @RuiWang1998 As a hungry user, I have to ask: are the problems with the current branch technical or scientific?
