
Training CLI script prompts for unused options which MultiStepLR scheduler does not accept #90

Closed
raehik opened this issue Aug 25, 2023 · 6 comments · Fixed by #97

Comments


raehik commented Aug 25, 2023

The neural net's learning rate is configurable in the training step via the CLI parameter --learning_rate <string>, where <string> is a list of epoch/rate pairs in the format epoch_1/rate_1/epoch_2/rate_2/.... However, after some investigation, only the first rate appears to be used; I find it referenced only during optimizer initialization:

optimizer = optim.Adam(params, lr=learning_rates[0], weight_decay=weight_decay)
lr_scheduler = MultiStepLR(optimizer, list(learning_rates.keys())[1:], gamma=0.1)

Note that, because of how MultiStepLR is configured, all of the training-step run configurations I have will still behave as expected, since the chosen rates happen to decay by a factor of 10 at each milestone. E.g. for 0/5e-4/15/5e-5/30/5e-6, the rates after 5e-4 are ignored, but the scheduler's own decay (gamma=0.1) produces those same values anyway.
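
For illustration, a minimal standalone PyTorch sketch (using the example values above, not code from this repository) confirming that gamma=0.1 with milestones [15, 30] reproduces the later rates by itself:

import torch
from torch import optim
from torch.optim.lr_scheduler import MultiStepLR

# Stand-in parameters; only the learning-rate schedule matters here.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = optim.Adam(params, lr=5e-4)
lr_scheduler = MultiStepLR(optimizer, milestones=[15, 30], gamma=0.1)

lrs = []
for epoch in range(31):
    lrs.append(lr_scheduler.get_last_lr()[0])
    optimizer.step()
    lr_scheduler.step()

print(lrs[0], lrs[15], lrs[30])  # roughly 5e-4, 5e-5, 5e-6 (up to float rounding)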

Is this erroneous? Should we be able to specify the learning rates exactly? Or can we simplify and pass an initial rate, a decay factor, and the epochs to decay at? (This is the pattern MultiStepLR itself uses.)


raehik commented Aug 25, 2023

ping @arthurBarthe

arthurBarthe (Collaborator) commented

Discussed yesterday, but adding notes to sum up: to reproduce the paper, we should not be using a scheduler, but instead the fixed learning rate updates. We can achieve this using MultiStepLR from PyTorch. I am happy to take care of that in a new branch off main if that sounds reasonable.


raehik commented Aug 31, 2023

to reproduce the paper, we should not be using a scheduler, but instead the fixed learning rate updates

MultiStepLR is a scheduler, we are currently using it, and I believe it was used to decay the learning rate for the paper's model. This issue is about an apparent disconnect between the CLI and how the options get used (as it turns out, some options aren't used at all). We should change the CLI options so that you don't have to provide values that never get used. (Or we could simply update the help strings as a stopgap.)

I'd be happy to review any work you do related to this; feel free to assign/ping/message me. I have some related work too, but it's larger in scope and I'm in the middle of migrating it from another PR.

arthurBarthe (Collaborator) commented

Oh, I missed that we are already using MultiStepLR! Then we only need to provide the initial learning rate, the decay factor (0.1), and the milestones, and we can indeed get rid of the unnecessary extra values. Does that make sense?
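
As a rough sketch of what that could look like (the option names here are hypothetical, not the project's actual flags), wiring three separate options straight into the existing optimizer/scheduler setup:

import argparse
import torch
from torch import optim
from torch.optim.lr_scheduler import MultiStepLR

# Hypothetical option names, for illustration only.
parser = argparse.ArgumentParser()
parser.add_argument("--initial_learning_rate", type=float, default=5e-4)
parser.add_argument("--decay_factor", type=float, default=0.1)
parser.add_argument("--decay_at_epochs", type=int, nargs="*", default=[15, 30])
args = parser.parse_args()

params = [torch.nn.Parameter(torch.zeros(1))]  # stand-in for the model parameters
optimizer = optim.Adam(params, lr=args.initial_learning_rate)
lr_scheduler = MultiStepLR(optimizer, args.decay_at_epochs, gamma=args.decay_factor)

Passing --initial_learning_rate 5e-4 --decay_factor 0.1 --decay_at_epochs 15 30 would then reproduce what 0/5e-4/15/5e-5/30/5e-6 does today.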


raehik commented Aug 31, 2023

Right. The two-line snippet above shows everything relevant: the decay factor is fixed (not configurable via the CLI), while the initial learning rate and the milestones are recovered from the "learning rate" string. Those should be their own options instead.
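
For reference, recovering those two pieces from the current epoch_1/rate_1/... string looks roughly like this (a sketch of the behaviour described above, not the repository's actual parsing code):

def parse_learning_rates(spec):
    # "0/5e-4/15/5e-5/30/5e-6" -> {0: 5e-4, 15: 5e-5, 30: 5e-6}
    parts = spec.split("/")
    return {int(epoch): float(rate) for epoch, rate in zip(parts[::2], parts[1::2])}

learning_rates = parse_learning_rates("0/5e-4/15/5e-5/30/5e-6")
initial_rate = learning_rates[0]              # the only rate actually used
milestones = list(learning_rates.keys())[1:]  # [15, 30]; the later rates are discarded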

raehik mentioned this issue on Oct 12, 2023

raehik commented Oct 26, 2023

handled in #95

raehik added this to the 2023-10 wrap-up milestone on Oct 26, 2023
raehik changed the title from "Are CLI neural net learning rates properly configured?" to "Training CLI script accepts unused options which the scheduler MultiStepLR does not accept" on Dec 4, 2023
raehik changed the title from "Training CLI script accepts unused options which the scheduler MultiStepLR does not accept" to "Training CLI script prompts for unused options which MultiStepLR scheduler does not accept" on Dec 4, 2023
raehik linked a pull request on Dec 4, 2023 that will close this issue