
[Major] Support Custom Learning Rate Scheduler #1637

Merged — 44 commits merged into main on Aug 30, 2024
Conversation

ourownstory (Owner) commented Aug 28, 2024

Changes:

  • Support any learning rate scheduler
  • store quantiles in model_config instead of train_config
  • improve training progress tracking in the model
  • pass config_model to TimeNet
  • introduce finding_lr flag in TimeNet
  • call self.config_train.set_optimizer and self.config_train.set_scheduler in TimeNet configure_optimizers
  • remove train_loader from self in TimeNet
  • create utils_lightning file with find_learning_rate and configure_tuner
  • update test to only invoke lr_finder explicitly
  • create test_save and test_train_config files

Future TODOs:

  • improve scheduler defaults
  • improve logging and progress printing when finding_lr
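The config-driven scheduler pattern the PR describes (store a scheduler class plus its kwargs via `set_scheduler`, instantiate later once the optimizer exists) can be sketched as follows. This is a minimal hypothetical illustration of the pattern, not NeuralProphet's actual implementation; `TrainConfig` and `FakeScheduler` are stand-ins:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class TrainConfig:
    """Hypothetical training config that accepts any LR scheduler class."""

    scheduler: Optional[Callable] = None
    scheduler_args: dict = field(default_factory=dict)

    def set_scheduler(self, scheduler: Callable, **scheduler_args: Any) -> None:
        # Store the scheduler class and its kwargs; instantiation is
        # deferred until the optimizer exists (in configure_optimizers).
        self.scheduler = scheduler
        self.scheduler_args = scheduler_args


class FakeScheduler:
    """Stand-in for e.g. torch.optim.lr_scheduler.StepLR."""

    def __init__(self, optimizer, step_size=1, gamma=0.5):
        self.optimizer = optimizer
        self.step_size = step_size
        self.gamma = gamma


config = TrainConfig()
config.set_scheduler(FakeScheduler, step_size=10, gamma=0.1)

# Later, once the optimizer has been built:
lr_scheduler = config.scheduler("optimizer-placeholder", **config.scheduler_args)
print(lr_scheduler.gamma)  # → 0.1
```

Because only the class and kwargs are stored, any scheduler with the `(optimizer, **kwargs)` constructor signature slots in without changes to the training loop.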


github-actions bot commented Aug 28, 2024

Model Benchmark

| Benchmark | Metric | main | current | diff |
| --- | --- | --- | --- | --- |
| YosemiteTemps | MAE_val | 0.58159 | 0.58957 | 1.37% |
| YosemiteTemps | RMSE_val | 0.86935 | 0.86595 | -0.39% |
| YosemiteTemps | Loss_val | 0.00044 | 0.00044 | -0.77% |
| YosemiteTemps | MAE | 0.94811 | 0.9641 | 1.69% |
| YosemiteTemps | RMSE | 1.66847 | 1.69444 | 1.56% |
| YosemiteTemps | Loss | 0.0012 | 0.00123 | 2.81% |
| YosemiteTemps | LR | nan | 8e-05 | 0.0% |
| YosemiteTemps | time | 142.942 | 140.81 | -1.49% |
| EnergyPriceDaily | MAE_val | 5.42935 | 5.39142 | -0.7% |
| EnergyPriceDaily | RMSE_val | 6.88991 | 6.65739 | -3.37% |
| EnergyPriceDaily | Loss_val | 0.02655 | 0.02483 | -6.48% 🎉 |
| EnergyPriceDaily | MAE | 6.15242 | 5.93446 | -3.54% |
| EnergyPriceDaily | RMSE | 8.26192 | 7.96693 | -3.57% |
| EnergyPriceDaily | Loss | 0.02739 | 0.02572 | -6.07% 🎉 |
| EnergyPriceDaily | LR | nan | 0.00019 | 0.0% |
| EnergyPriceDaily | time | 40.5681 | 39.55 | -2.51% |
| AirPassengers | MAE_val | 30.081 | 30.8849 | 2.67% |
| AirPassengers | RMSE_val | 30.9826 | 32.0342 | 3.39% ⚠️ |
| AirPassengers | Loss_val | 0.01234 | 0.01319 | 6.9% ⚠️ |
| AirPassengers | MAE | 6.86203 | 6.49056 | -5.41% 🎉 |
| AirPassengers | RMSE | 8.81789 | 8.45795 | -4.08% |
| AirPassengers | Loss | 0.00074 | 0.00071 | -3.82% |
| AirPassengers | LR | nan | 0.00022 | 0.0% |
| AirPassengers | time | 10.1452 | 9.43 | -7.05% 🎉 |
| PeytonManning | MAE_val | 0.35447 | 0.35144 | -0.85% |
| PeytonManning | RMSE_val | 0.50324 | 0.50166 | -0.32% |
| PeytonManning | Loss_val | 0.01796 | 0.01781 | -0.86% |
| PeytonManning | MAE | 0.34738 | 0.34701 | -0.11% |
| PeytonManning | RMSE | 0.49347 | 0.4942 | 0.15% |
| PeytonManning | Loss | 0.01461 | 0.01469 | 0.53% |
| PeytonManning | LR | nan | 0.00028 | 0.0% |
| PeytonManning | time | 25.69 | 24.99 | -2.72% |
Model training plots (images omitted): PeytonManning, YosemiteTemps, AirPassengers, EnergyPriceDaily.

```python
if self.finding_lr:
    # Manually track the loss for the lr finder
    self.log("train_loss", loss, on_step=False, on_epoch=True, prog_bar=True, logger=True)
    self.log("reg_loss", reg_loss, on_step=False, on_epoch=True, prog_bar=True, logger=True)
```
ourownstory (Owner, Author) commented: skip reg_loss

```python
self.log("reg_loss", reg_loss, on_step=False, on_epoch=True, prog_bar=True, logger=True)
if self.finding_lr:
    # Manually track the loss for the lr finder
    self.log("train_loss", loss, on_step=False, on_epoch=True, prog_bar=True, logger=True)
```
ourownstory (Owner, Author) commented: pass log_args

ourownstory (Owner, Author) commented: check for lr-finder requirements

ourownstory (Owner, Author) commented: implement better progress/metrics logging/printing; e.g. also need to touch time_net.__init__

```python
self.log_args = {
    "on_step": False,
    "on_epoch": True,
    "prog_bar": True,
    "batch_size": self.config_train.batch_size,
}
```
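Collecting the shared logging kwargs in one dict and unpacking it at each call site avoids repeating the same arguments for every metric. A minimal sketch of the pattern (the `log` function here is a hypothetical stub standing in for Lightning's `self.log`, and the literal batch size replaces `self.config_train.batch_size`):

```python
def log(name, value, **kwargs):
    # Stub mimicking a logger call: just echo what was passed.
    return {"name": name, "value": value, **kwargs}


log_args = {
    "on_step": False,
    "on_epoch": True,
    "prog_bar": True,
    "batch_size": 32,  # stand-in for self.config_train.batch_size
}

# One dict, reused for every metric:
entry = log("train_loss", 0.123, **log_args)
print(entry["on_epoch"], entry["batch_size"])  # → True 32
```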

```python
if self.config_train.newer_samples_weight > 1.0:
    # Weigh newer samples more.
    loss = loss * self._get_time_based_sample_weight(t=inputs["time"][:, self.n_lags :])
```
ourownstory (Owner, Author) commented: simplify to only pass first forecast target
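One plausible form of such a time-based weight is a linear ramp from 1.0 (oldest sample) up to `newer_samples_weight` (newest sample), assuming `t` is normalized to [0, 1]. This is an illustrative sketch of the idea, not necessarily NeuralProphet's exact formula:

```python
def time_based_sample_weight(t, newer_samples_weight=2.0):
    # t: normalized sample time in [0, 1]; newest samples (t near 1)
    # get the largest weight, oldest (t near 0) get weight 1.0.
    return 1.0 + (newer_samples_weight - 1.0) * t


print(time_based_sample_weight(0.0))  # → 1.0
print(time_based_sample_weight(1.0))  # → 2.0
```

With `newer_samples_weight == 1.0` the weight is identically 1, which matches the guard `if self.config_train.newer_samples_weight > 1.0` skipping the multiplication entirely.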

ourownstory (Owner, Author) commented: will do another time

```python
lr_scheduler = self.config_train.scheduler(
    optimizer,
    **self.config_train.scheduler_args,
)

return {"optimizer": optimizer, "lr_scheduler": lr_scheduler}

def _get_time_based_sample_weight(self, t):
```
ourownstory (Owner, Author) commented: simplify to compute based only on first forecast target

ourownstory (Owner, Author) commented: postponed

@ourownstory merged commit 4459338 into main on Aug 30, 2024 (7 of 11 checks passed).
@ourownstory deleted the custom-lr-scheduler branch on Aug 30, 2024, then restored it.