
Question about optimizer config. #38

Open
EricKani opened this issue Aug 5, 2021 · 2 comments


EricKani commented Aug 5, 2021

"paramwise_cfg=dict(custom_keys={'head': dict(lr_mult=10.)}"

Hi, thank you for open-sourcing your code. I have a question about the optimizer configuration.
I found that the model defines "decode_head", not the "head" used in 'custom_keys'. Will 'lr_mult=10' take effect while training the model?
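
For context, a minimal sketch of where that fragment sits in an mmcv-style optimizer config (the optimizer type and base lr below are placeholders, not the repo's actual values):

    optimizer = dict(
        type='AdamW',          # placeholder optimizer
        lr=6e-5,               # placeholder base learning rate
        weight_decay=0.01,
        paramwise_cfg=dict(
            custom_keys={
                # intended effect: parameters whose names contain 'head'
                # (e.g. 'decode_head.*', 'auxiliary_head.*') get lr = base_lr * 10
                'head': dict(lr_mult=10.)
            }))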

Thanks~


EricKani commented Aug 5, 2021

Got it. I found the corresponding note in the DefaultOptimizerConstructor class. Because 'head' is a substring of 'decode_head' and 'auxiliary_head', the parameters of those modules will be set according to 'lr_mult=10'. (A small sketch of this matching rule follows the quoted note below.)

  • custom_keys (dict): Specified parameters-wise settings by keys. If
    one of the keys in custom_keys is a substring of the name of one
    parameter, then the setting of the parameter will be specified by
    custom_keys[key] and other setting like bias_lr_mult etc. will
    be ignored. It should be noted that the aforementioned key is the
    longest key that is a substring of the name of the parameter. If there
    are multiple matched keys with the same length, then the key with lower
    alphabet order will be chosen.
    custom_keys[key] should be a dict and may contain fields lr_mult
    and decay_mult. See Example 2 below.
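
A small standalone sketch (not the actual mmcv code) of the matching rule described in that note, assuming parameter names like 'decode_head.conv_seg.weight':

    custom_keys = {'head': dict(lr_mult=10.)}

    def lookup(param_name):
        # choose the longest key that is a substring of the parameter name;
        # among keys of equal length, the alphabetically smaller one wins
        matches = sorted((k for k in custom_keys if k in param_name),
                         key=lambda k: (-len(k), k))
        return custom_keys[matches[0]] if matches else None

    print(lookup('decode_head.conv_seg.weight'))     # {'lr_mult': 10.0}
    print(lookup('auxiliary_head.conv_seg.bias'))    # {'lr_mult': 10.0}
    print(lookup('backbone.layer1.0.conv1.weight'))  # None

So 'lr_mult=10' is applied to the parameters of both decode_head and auxiliary_head.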

Thanks~


EricKani commented Aug 5, 2021

But I found that nothing is printed when 'recurse=False' (code in DefaultOptimizerConstructor.add_params):

for name, param in module.named_parameters(recurse=False):
    print(name)

I also printed the contents of the built optimizer. There are 363 param_groups in it, and the lr is indeed modified by 'lr_mult=10'. But I don't understand why 'lr_mult=10' takes effect when nothing comes out of "for name, param in module.named_parameters(recurse=False):".
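
A minimal sketch of the recurse=False behavior, assuming standard PyTorch semantics (the toy model below is made up): named_parameters(recurse=False) only yields parameters registered directly on that module, not on its children, so it returns nothing for a top-level container. As far as I understand, add_params is then called recursively on each child module, so the parameters are collected at the leaf modules, which is also consistent with the optimizer ending up with one param_group per parameter.

    import torch.nn as nn

    model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))

    # no parameters are registered directly on the Sequential container
    print(list(model.named_parameters(recurse=False)))  # []

    # the children's parameters show up when recursing (the default)
    print(len(list(model.named_parameters())))          # 4

    # per-child iteration, roughly how a recursive add_params would see them
    for child_name, child in model.named_children():
        for name, param in child.named_parameters(recurse=False):
            print(f'{child_name}.{name}', tuple(param.shape))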
