[Minor] Enable continuation of training #1605
base: main
Conversation
@Constantin343 "The last learning rate is used as a starting point for the new scheduler."
Looks like you got a first version running - good work!
I left a few comments aiming to robustify the user settings and to encapsulate code in a similar manner to existing practices.
Furthermore, I think we should avoid hard-coding the optimizer and scheduler for this functionality. I suggest making them optionally user-settable (with reasonable defaults). If we do that, it may also make sense to handle the regular optimizer and scheduler the same way.
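For illustration, here is a minimal sketch of what optionally user-settable optimizer and scheduler names with reasonable defaults could look like; the helper name, defaults, and keyword arguments below are hypothetical, not part of the existing NeuralProphet API.

```python
# Hypothetical sketch: resolve user-settable optimizer/scheduler names with defaults.
# Nothing here is existing NeuralProphet API.
import torch


def resolve_optimizer_and_scheduler(
    model_params, optimizer_name="AdamW", scheduler_name="ExponentialLR", lr=1e-3, scheduler_kwargs=None
):
    """Look up torch classes by name so users can override the hard-coded defaults."""
    optimizer_cls = getattr(torch.optim, optimizer_name)                # e.g. AdamW, SGD
    scheduler_cls = getattr(torch.optim.lr_scheduler, scheduler_name)   # e.g. ExponentialLR, OneCycleLR
    optimizer = optimizer_cls(model_params, lr=lr)
    scheduler = scheduler_cls(optimizer, **(scheduler_kwargs or {"gamma": 0.95}))
    return optimizer, scheduler


# Example: use the defaults, or pass an explicit user choice.
model = torch.nn.Linear(1, 1)
opt, sched = resolve_optimizer_and_scheduler(model.parameters())
opt, sched = resolve_optimizer_and_scheduler(
    model.parameters(), scheduler_name="StepLR", scheduler_kwargs={"step_size": 10}
)
```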
neuralprophet/forecaster.py
Outdated
```
@@ -1067,6 +1067,10 @@ def fit(

        if self.fitted is True and not continue_training:
            log.error("Model has already been fitted. Re-fitting may break or produce different results.")

        if continue_training and self.metrics_logger.checkpoint_path is None:
            log.error("Continued training requires checkpointing in model.")
```
Can you please explain what necessitates this (for my understanding)?
My thinking was that it makes sense to continue from the checkpoint, but it's probably not necessary. All the necessary parameters should still be available in the model itself. I will adapt it.
I did some testing, and it seems a checkpoint is indeed necessary to correctly continue training with the PyTorch Lightning trainer. I can get it to run without checkpointing, but fitting again always leads to a complete restart of the training.
Maybe there is some workaround, but I would suggest keeping it like this, as continued training always goes hand in hand with checkpointing in PyTorch Lightning.
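For context, a minimal PyTorch Lightning sketch of why the checkpoint is needed: resuming only happens when `Trainer.fit` is pointed at a saved checkpoint via `ckpt_path`; calling `fit` again without it starts from epoch 0. The tiny module and checkpoint paths below are illustrative, not NeuralProphet internals.

```python
# Minimal PyTorch Lightning resume sketch; TinyModel and the checkpoint paths
# are illustrative, not NeuralProphet internals.
import torch
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint
from torch.utils.data import DataLoader, TensorDataset


class TinyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(1, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)


data = DataLoader(TensorDataset(torch.randn(64, 1), torch.randn(64, 1)), batch_size=16)

# First run: write checkpoints (save_last=True keeps "ckpts/last.ckpt" up to date).
trainer = pl.Trainer(max_epochs=5, callbacks=[ModelCheckpoint(dirpath="ckpts", save_last=True)])
trainer.fit(TinyModel(), data)

# Continued run: without ckpt_path this would restart training from scratch.
trainer2 = pl.Trainer(max_epochs=10, callbacks=[ModelCheckpoint(dirpath="ckpts", save_last=True)])
trainer2.fit(TinyModel(), data, ckpt_path="ckpts/last.ckpt")
```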
Yeah, the learning rate at the end of the cycle (actually from the last checkpoint, to be precise). The idea was to ensure a smooth transition.
Totally agree. I tried again to make it work with the OneCycleLR scheduler, but it just doesn't seem to be compatible. I will try to make some adaptations over the weekend to accept a user-specified scheduler.
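As a rough sketch of that direction, assuming a Lightning-style module where a `continue_training` flag and the restored learning rate are available (the attribute names here are placeholders, not the PR's exact code): when continuing, `configure_optimizers` can seed an ExponentialLR with the last learning rate instead of restarting a OneCycleLR curve.

```python
# Illustrative configure_optimizers supporting continued training; attribute
# names (continue_training, start_lr) are placeholders, not the PR's code.
import torch
import pytorch_lightning as pl


class SketchModule(pl.LightningModule):
    def __init__(self, continue_training=False, start_lr=1e-3, total_steps=1000):
        super().__init__()
        self.continue_training = continue_training
        self.start_lr = start_lr          # e.g. the LR restored from the last checkpoint
        self.total_steps = total_steps
        self.layer = torch.nn.Linear(1, 1)

    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters(), lr=self.start_lr)
        if self.continue_training:
            # Decay smoothly from the restored LR instead of re-running the OneCycle shape.
            scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)
        else:
            scheduler = torch.optim.lr_scheduler.OneCycleLR(
                optimizer, max_lr=self.start_lr * 10, total_steps=self.total_steps
            )
        return [optimizer], [scheduler]
```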
Codecov Report
Attention: Patch coverage is below the overall project coverage.
Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #1605      +/-   ##
==========================================
- Coverage   88.27%   88.09%   -0.18%
==========================================
  Files          41       41
  Lines        5364     5428      +64
==========================================
+ Hits         4735     4782      +47
- Misses        629      646      +17
```

☔ View full report in Codecov by Sentry.
🔬 Background
Enable continuing the training of a model, as described in #828.
Also needed to address #1542.
🔮 Key changes
Load model state from the last checkpoint and continue training with a new scheduler. The last learning rate is used as a starting point for the new scheduler.
ExponentialLR is used as the new scheduler, since continuing with OneCycleLR leads to issues. @ourownstory please let me know if you have an idea for a specific scheduler that makes sense here.
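Illustrative end-to-end usage, assuming the `checkpointing` and `continue_training` fit arguments discussed in this PR (a sketch of the expected behavior, not documented API):

```python
# Sketch only: argument names follow the PR discussion and may differ from the final API.
import pandas as pd
from neuralprophet import NeuralProphet

df = pd.DataFrame({"ds": pd.date_range("2024-01-01", periods=100, freq="D"), "y": range(100)})

m = NeuralProphet(epochs=5)
m.fit(df, freq="D", checkpointing=True)       # first run writes checkpoints
m.fit(df, freq="D", continue_training=True)   # resumes from the last checkpoint with the new scheduler

m2 = NeuralProphet(epochs=5)
m2.fit(df, freq="D")                          # no checkpointing enabled
m2.fit(df, freq="D", continue_training=True)  # logs: "Continued training requires checkpointing in model."
```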
📋 Review Checklist
Please make sure to follow our best practices in the Contributing guidelines.