`model(X, labels=Y, return_dict=True).loss` is wrong #133
Comments
We need some performance tests to catch issues like this in the future.
Don't we still want to pass it seq_len+1? If it's converting it internally, we still get 512 positions on inputs of length 513, right?
I agree with jai, we don't need to change the tokenizer. @jaidhyani could you make a PR to fix this loss?
Already merged it a few days ago. Still need to add performance tests, though.
I'm not sure what you're linking to; the line number probably shifted with some new commits on main. I believe it'll compute logits for all input ids, and then discard the logits for the last position when computing the loss. It's not a big deal, but considering this I think inputs should be seq_len. LMK if you have strong takes.
It should be `model(X, labels=X)`. Ideally we would force it to do what it was supposed to, instead of shifting tokens on its own. But if we can't, we need to adjust the design of the tokenization script to produce sequences of seq_len (512) instead of seq_len+1 (513).
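To illustrate why `labels=X` is correct here: HF-style causal LMs shift the labels internally, so the logits at position t are scored against the token at position t+1, and the last position's logits are dropped from the loss. A minimal stdlib-only sketch of that shift (the function name and list-based shapes are illustrative, not the library's actual API):

```python
import math

def causal_lm_loss(logits, labels):
    """Mean cross-entropy with the one-token shift that HF-style causal
    LMs apply internally: logits at position t predict token t+1, so
    `labels` can simply be the same ids as the input.

    logits: list (one per position) of per-vocab score lists
    labels: list of token ids, same length as logits
    """
    total, count = 0.0, 0
    # drop the last position's logits (nothing left to predict)
    # and the first label (nothing predicts it)
    for scores, target in zip(logits[:-1], labels[1:]):
        log_z = math.log(sum(math.exp(s) for s in scores))
        total += log_z - scores[target]  # -log softmax(scores)[target]
        count += 1
    return total / count
```

With uniform logits over a 2-token vocab, every shifted position contributes -log(1/2), so the mean loss is ln 2 ≈ 0.693. This is also why a seq_len+1 input yields exactly seq_len loss terms: one position is always consumed by the shift.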