Hello,
Thank you for your fantastic work. When I ran the code for RoBERTa on SST-2, I could not reproduce the reported results.
For instance, FT (full-parameter fine-tuning) with Adam, run with the provided example script, gets val_acc = 0.90625 and test_acc = 0.84518.
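For reference, by FT I mean standard full-parameter fine-tuning along the lines of the sketch below (using Hugging Face's Trainer, whose default optimizer is AdamW); the hyperparameter values are illustrative placeholders of mine, not the exact script:

```python
# Minimal sketch of full-parameter fine-tuning (FT) on SST-2.
# Hyperparameter values are placeholders, not the repo's example script.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "roberta-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="ft-sst2",
    learning_rate=1e-5,              # placeholder; the paper sweeps this in a grid
    per_device_train_batch_size=8,   # placeholder
    num_train_epochs=3,              # placeholder
    evaluation_strategy="epoch",
)

trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["validation"])
trainer.train()
```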
When I try smaller learning rates, the results are worse.
But Table 16 in the paper reports an accuracy of 91.9. Is that measured on the test set or the validation set? If it is the test accuracy, it seems difficult to reproduce.
The results for MeZO are worse: I get eval_acc < 0.8 when I run the provided MeZO example script.
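For context, my understanding of a single MeZO step (the zeroth-order SPSA update from Algorithm 1 of the paper) is the minimal PyTorch sketch below; loss_fn, batch, and the lr/eps values are placeholders of mine, not the repository's code:

```python
# One MeZO step: estimate the directional derivative along a random direction z
# from two forward passes, then update in-place. z is regenerated from a saved
# seed instead of being stored, so memory usage stays at inference level.
import torch

def mezo_step(model, loss_fn, batch, lr=1e-6, eps=1e-3):  # placeholder lr/eps
    seed = torch.randint(0, 2**31 - 1, (1,)).item()

    def perturb(scale):
        # Re-seed so every call draws the identical z for each parameter.
        torch.manual_seed(seed)
        for p in model.parameters():
            z = torch.randn_like(p)
            p.data.add_(scale * eps * z)

    with torch.no_grad():
        perturb(+1)                              # theta + eps * z
        loss_plus = loss_fn(model, batch)
        perturb(-2)                              # theta - eps * z
        loss_minus = loss_fn(model, batch)
        perturb(+1)                              # restore theta
        grad_proj = (loss_plus - loss_minus) / (2 * eps)

        # theta <- theta - lr * grad_proj * z, regenerating the same z.
        torch.manual_seed(seed)
        for p in model.parameters():
            z = torch.randn_like(p)
            p.data.add_(-lr * grad_proj * z)
```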
Could the authors provide more details on the best hyperparameters? I would be very grateful.
Best regards,
The example script we provided is not necessarily the best hyperparameter setting, just one that usually gets good results, so it is a reasonable starting point. To reproduce the paper's results, please follow appendix D.3 for the complete grid search.
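To make the protocol concrete, a rough sketch of the grid search is below: sweep the grid, select the configuration by validation accuracy, and report that run's test accuracy. The grid values and the run_experiment helper are placeholders, not the actual grids from appendix D.3:

```python
# Sketch of hyperparameter grid search with model selection on validation.
import itertools
import random

def run_experiment(lr, bs, seed):
    """Hypothetical stand-in: train with these hyperparameters and return
    (val_acc, test_acc). Returns deterministic dummy numbers so the sketch
    runs end-to-end; replace with a call into the real training script."""
    rng = random.Random(hash((lr, bs, seed)))
    return rng.uniform(0.85, 0.93), rng.uniform(0.83, 0.92)

learning_rates = [1e-6, 1e-5, 1e-4]   # placeholder grid
batch_sizes = [16, 64]                # placeholder grid
seeds = [13, 21, 42]                  # placeholder seeds

best = None
for lr, bs, seed in itertools.product(learning_rates, batch_sizes, seeds):
    val_acc, test_acc = run_experiment(lr, bs, seed)
    # Selection uses validation accuracy only; the test accuracy of the
    # selected run is what gets reported.
    if best is None or val_acc > best["val_acc"]:
        best = dict(lr=lr, bs=bs, seed=seed, val_acc=val_acc, test_acc=test_acc)

print(best)
```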