Hello, Author
Thanks for the high-performance pyramid attention you have proposed. While reviewing the paper, I ran into a few difficulties:
From the appendix of the paper, I understand that the constant A represents the number of adjacent nodes at the same scale that a node can attend to. When A takes the value 3, does it describe only the middle nodes at each scale, excluding the leftmost and rightmost nodes? I ask because I think the leftmost and rightmost nodes at each scale have only 2 neighboring nodes, right?
When the constant A takes the value of 3, the model diagram of PAM looks like this.
When the constant A takes the value 5, each node has 5 neighboring nodes at the same scale. Does the model diagram of PAM then look like the following?
If so, the leftmost two and rightmost two nodes of the sequence at S=1 can only have 3 and 4 neighboring nodes, the leftmost two and rightmost two nodes at S=2 can likewise only have 3 and 4, and the leftmost and rightmost nodes at S=3 can only have 3. But I feel my understanding of the constant A is wrong. I hope the author can give me some pointers.
Thank you for your interest in our work. Good questions.
Yes, we use A to represent the number of same-scale neighbor nodes that a middle node in the sequence (that is, most nodes in the sequence) can attend to. Nodes near the leftmost and rightmost ends of the sequence can attend to fewer than A same-scale neighbor nodes. Equations (8) and (12) take this into account by using the upper bound A when computing the complexity.
Yes, the diagram of A=5 is right. Your understanding is right.
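For readers who want to check the boundary counts discussed above, here is a minimal sketch rather than the repository's actual mask-building code. It assumes that A is odd, that the same-scale window is symmetric with half-width (A - 1) / 2, and that the window includes the node itself; these assumptions reproduce the counts quoted in the question (2 at the edges for A=3, and 3 and 4 near the edges for A=5). The function name `same_scale_neighbor_counts` and the sequence length of 8 are hypothetical, chosen only for illustration.

```python
from typing import List


def same_scale_neighbor_counts(length: int, A: int) -> List[int]:
    """Count the same-scale nodes each position can attend to.

    Assumption (not taken from the repository): the window is symmetric,
    has half-width (A - 1) // 2, includes the node itself, and is clipped
    at the two ends of the sequence.
    """
    if A < 1 or A % 2 == 0:
        raise ValueError("this sketch assumes an odd window size A >= 1")
    half = (A - 1) // 2
    counts = []
    for i in range(length):
        lo = max(0, i - half)           # clip the window at the left end
        hi = min(length - 1, i + half)  # clip the window at the right end
        counts.append(hi - lo + 1)
    return counts


print(same_scale_neighbor_counts(8, 3))  # [2, 3, 3, 3, 3, 3, 3, 2]
print(same_scale_neighbor_counts(8, 5))  # [3, 4, 5, 5, 5, 5, 4, 3]
```

Under these assumptions every count is at most A, which is why using A as the upper bound in the complexity analysis is safe even though edge nodes attend to fewer neighbors.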