
decouple the lr scheduler and optimizer? #36

Open
hiyyg opened this issue Nov 1, 2021 · 5 comments

@hiyyg

hiyyg commented Nov 1, 2021

Hi @lessw2020, thanks for the very nice work!
I noticed that in Ranger21 the optimizer is tightly coupled with the lr scheduler. Could you guide me on how to decouple them?

@neuronflow

I would like to second this. A split into a Ranger optimizer and a Ranger scheduler would be really cool.

@lessw2020
Owner

Hi @hiyyg and @neuronflow,
Right now you can turn off the built-in lr scheduling by turning off both warmup and warmdown:
use_warmup=False, warmdown_active=False
That should simply pass the input lr through without touching it.
Is that what you mean by decouple? Or do you mean having the scheduler separately programmable (e.g., cosine decay vs. the linear decay we use, etc.)?
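
A minimal sketch of that pass-through setup, pairing Ranger21 with a stock PyTorch scheduler. Only use_warmup and warmdown_active are quoted from the comment above; the import path and the remaining constructor arguments are assumptions, so check ranger21.py for the exact signature:

```python
import torch
from ranger21 import Ranger21  # import path assumed; check the repo layout

model = torch.nn.Linear(16, 2)

# Disable the built-in warmup/warmdown so Ranger21 leaves the incoming lr alone.
# use_warmup / warmdown_active are taken from the comment above; the other
# constructor arguments (and their names) are assumptions.
optimizer = Ranger21(
    model.parameters(),
    lr=1e-3,
    num_epochs=10,
    num_batches_per_epoch=100,
    use_warmup=False,
    warmdown_active=False,
)

# With the internal schedule disabled, an external scheduler can own the lr.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
```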

@neuronflow

neuronflow commented Nov 2, 2021

> Or do you mean having the scheduler separately programmable (e.g., cosine decay vs. the linear decay we use, etc.)?

This is what I initially had in mind. Maybe, just maybe, the Ranger optimizer should go hand in hand with a Ranger scheduler, following the standard PyTorch conventions?
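
For reference, the standard PyTorch convention being suggested looks roughly like this, illustrated with stock torch.optim classes; a hypothetical "Ranger scheduler" would simply take the scheduler's place:

```python
import torch

model = torch.nn.Linear(16, 2)
data = [(torch.randn(8, 16), torch.randint(0, 2, (8,))) for _ in range(4)]
loss_fn = torch.nn.CrossEntropyLoss()

# Optimizer and scheduler are constructed and stepped independently.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

for epoch in range(6):
    for x, y in data:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()   # the optimizer only updates the weights
    scheduler.step()       # the scheduler only adjusts the lr, once per epoch here
```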

@felipemello1

Hi @lessw2020, it seems that in the current implementation there is no way to train different parameters with different learning rates. Did I get that right?

If this were available, I would love to use it. Two use cases are the following:

  1. Fine-tuning a network where layers closer to the head have a higher lr;
  2. My case: I train a graph neural network, and I need the embeddings to have 100x the learning rate of the rest of the model, but with the current script I can't use the standard PyTorch way of doing it:
```python
model_params = [p for name, p in self.model.named_parameters() if not name.startswith('emb.')]
emb_params = [p for name, p in self.model.named_parameters() if name.startswith('emb.')]
optimizer_model = madgrad_wd(
    [{'params': emb_params, 'lr': self.model_config['emb_max_lr']},
     {'params': model_params, 'lr': self.model_config['model_max_lr']}],
    weight_decay=self.model_config['wd'],
)
```
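
For context, the snippet above works with stock optimizers because torch.optim.Optimizer turns each dict into a parameter group and step() reads the lr per group. A toy sketch of that convention (generic, not Ranger21's actual code):

```python
import torch

class ToySGD(torch.optim.Optimizer):
    """Toy illustration of the per-parameter-group convention (not Ranger21's code)."""

    def __init__(self, params, lr=1e-3):
        super().__init__(params, defaults={"lr": lr})

    @torch.no_grad()
    def step(self):
        # Each dict passed at construction becomes one group carrying its own 'lr',
        # so emb_params and model_params above would each get their own rate.
        for group in self.param_groups:
            lr = group["lr"]
            for p in group["params"]:
                if p.grad is not None:
                    p.add_(p.grad, alpha=-lr)
```

An optimizer written this way accepts the [{'params': ..., 'lr': ...}, ...] construction out of the box.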

@lessw2020
Owner

Hi @fmellomascarenhas, @neuronflow and @hiyyg - I fully agree with all the points above (decoupled scheduler and parameter groups).
This split between scheduler and optimizer will happen for Ranger22 (the 2022 edition lol).
I should have more info and updates shortly, as we just agreed last night to go ahead with the Ranger22 version.
