Questions about discriminative_fine_tuning #5

Open

wlhgtc opened this issue Jun 12, 2020 · 2 comments

wlhgtc commented Jun 12, 2020

In Section 5.4.3: "We find that assign a lower learning rate to the lower layer is effective to fine-tuning BERT, and an appropriate setting is ξ=0.95 and lr=2.0e-5."
Compared to the code at https://github.com/xuyige/BERT4doc-Classification/blob/master/codes/fine-tuning/run_classifier.py#L812, it seems that you divide the BERT layers into 3 groups (4 layers per group) and set a different learning rate for each group (a rough sketch of my reading follows the questions below).
Some questions about it:

  1. How does the decay factor 0.95 match the number 2.6 in the code?
  2. The last classification layer does not seem to be included; is there no need to set a learning rate for it?
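
For reference, here is a rough sketch of the 3-group scheme as I understand it. This is not the repo's actual code; it assumes a HuggingFace-style `BertForSequenceClassification` with 12 encoder layers, and the function name `grouped_lr_parameters` is made up for illustration:

```python
# Hypothetical illustration of splitting 12 BERT layers into 3 groups of 4,
# dividing each lower group's learning rate by 2.6 -- NOT the code at
# run_classifier.py#L812, just the idea as I read it.
import torch
from transformers import BertForSequenceClassification

def grouped_lr_parameters(model, base_lr=2e-5, factor=2.6):
    layer_groups = [range(8, 12), range(4, 8), range(0, 4)]  # top to bottom
    param_groups = []
    for depth, indices in enumerate(layer_groups):
        lr = base_lr / (factor ** depth)  # the top group keeps base_lr
        params = [p for i in indices
                  for p in model.bert.encoder.layer[i].parameters()]
        param_groups.append({"params": params, "lr": lr})
    return param_groups

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
optimizer = torch.optim.AdamW(grouped_lr_parameters(model), lr=2e-5)
```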

xuyige (Owner) commented Jun 25, 2020

Thank you for your issue!

  1. The number 2.6 was used in the initial experiments; after that, we used run_classifier_discriminative.py for discriminative fine-tuning.
  2. The link to run_classifier_discriminative.py is https://github.com/xuyige/BERT4doc-Classification/blob/master/codes/fine-tuning/run_classifier_discriminative.py
  3. The classifier layer is included in run_classifier_discriminative.py (see the sketch after this list).
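
A minimal sketch of this per-layer scheme with ξ=0.95 and lr=2.0e-5, where each encoder layer below the top is decayed by a further factor of ξ and the classifier head keeps the full learning rate. It assumes a HuggingFace-style `BertForSequenceClassification` and illustrates the idea only; it is not the exact contents of run_classifier_discriminative.py:

```python
# Illustrative per-layer discriminative fine-tuning, xi = 0.95, base lr = 2e-5.
# NOT the exact code of run_classifier_discriminative.py; parameter names
# assume a HuggingFace-style BertForSequenceClassification.
import torch
from transformers import BertForSequenceClassification

def discriminative_param_groups(model, base_lr=2e-5, xi=0.95):
    # the classifier head (and pooler) sit on top and keep the full base_lr
    groups = [{"params": model.classifier.parameters(), "lr": base_lr},
              {"params": model.bert.pooler.parameters(), "lr": base_lr}]
    # each encoder layer below the top is decayed by another factor of xi
    layers = list(model.bert.encoder.layer)
    for depth, layer in enumerate(reversed(layers), start=1):
        groups.append({"params": layer.parameters(),
                       "lr": base_lr * xi ** depth})
    # the embeddings sit below every encoder layer and get the smallest rate
    groups.append({"params": model.bert.embeddings.parameters(),
                   "lr": base_lr * xi ** (len(layers) + 1)})
    return groups

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
optimizer = torch.optim.AdamW(discriminative_param_groups(model), lr=2e-5)
```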

wlhgtc (Author) commented Jun 28, 2020

Thanks for your reply; I will try it!
