How can we get attention weights for an example sequence and structure?
There are no arguments for returning attention weights in the transformer blocks, unlike ESM2.
Unfortunately, PyTorch's flash attention doesn't expose the attention weights. For now you'll have to hack it in; we'll look into supporting it officially. Here's where the attention is computed: you'll need to swap in a plain PyTorch implementation of attention so that the attention matrix is materialized and can be returned.