Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update RIM.py #3

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Update RIM.py #3

wants to merge 1 commit into from

Conversation

eugenelet
Copy link

This is based on the description given in the paper:
"Based on the softmax values in (4), we select the top k_A RIMs (out of the total K RIMs) to be activated for each step, which have the least attention on the null input,....."

Hence we have to perform softmax prior to the construction of the mask.

This is based on the description given in the paper:
"Based on the softmax values in (4), we select the top k_A RIMs (out of the total K RIMs) to be activated for each step, which have the least attention on the null input,....."

Hence we have to perform softmax prior to the construction of the mask.
@dido1998
Copy link
Owner

Thanks for the PR and for reading the code in detail. Since we only need the indices of the top k_A RIMs for constructing the mask, even if we don't take softmax before computing the mask it will be fine as the indices would be the same as after softmax. Softmax converts a given vector into a probability distribution and the highest value in the vector before softmax will have the highest probability after softmax. Let me know if you still find anything wrong with this or any other discrepancies.

@eugenelet
Copy link
Author

Thanks for going through my PR. Here's my understanding:
The softmax scales to final dimension (input and null) into a probability (local). The search of the top k_A is based on the scaled "inputs". Without going through the softmax first, there might be a difference in magnitude across RIMs (global). Let me know if my understanding is correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants