Positional encoding #16

Open
sverdoot opened this issue Dec 21, 2023 · 1 comment

In the original paper, the authors suggest adding positional encodings to the speech and text representations before the transformer block. I noticed that in your code the positional encodings are commented out. Have you tried training the model with positional encodings and, if so, is there any difference in performance?

p0p4k (Owner) commented Dec 21, 2023

I changed the implementation slightly there. The authors use an encoder-only transformer, so they needed to add different positional embeddings for the text and the speech, whereas I use a full encoder-decoder model (which internally uses a different set of positional embeddings for each side). Adding them anyway is fine; the results depend on your training data and other factors. It's really just a preference, so I left the encodings commented out for anyone who wants to enable them.
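
For anyone curious what the paper-style variant looks like, here is a minimal sketch of giving text and speech separate positional embeddings before concatenating them for an encoder-only transformer. Everything below, including the class name, the learned-embedding choice, and the dummy shapes, is an illustrative assumption, not this repo's actual code:

```python
import torch
import torch.nn as nn

class JointEncoderInput(nn.Module):
    """Toy sketch: separate learned positional embeddings for the text and
    speech streams, added before an encoder-only transformer sees them."""

    def __init__(self, dim: int, max_text_len: int = 512, max_speech_len: int = 2048):
        super().__init__()
        self.text_pos = nn.Embedding(max_text_len, dim)
        self.speech_pos = nn.Embedding(max_speech_len, dim)

    def forward(self, text: torch.Tensor, speech: torch.Tensor) -> torch.Tensor:
        # text: (batch, text_len, dim); speech: (batch, speech_len, dim)
        t_idx = torch.arange(text.size(1), device=text.device)
        s_idx = torch.arange(speech.size(1), device=speech.device)
        text = text + self.text_pos(t_idx)        # broadcasts over the batch dim
        speech = speech + self.speech_pos(s_idx)
        # With separate positional tables, the transformer can distinguish
        # text positions from speech positions in the joint sequence.
        return torch.cat([text, speech], dim=1)   # (batch, text_len + speech_len, dim)

# Example usage with dummy tensors:
m = JointEncoderInput(dim=192)
joint = m(torch.randn(2, 50, 192), torch.randn(2, 120, 192))
```

In an encoder-decoder model this bookkeeping is unnecessary, since the two streams already live on different sides of the cross-attention and each side carries its own positional information.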
