What is meant by implicit parameter updates? #2
Hi, by "implicit parameter updates" I meant layer parameters (buffers) that are not trained with backpropagation but are updated during a forward pass or in some other way. E.g., Batch Normalization layers update their running batch statistics at each forward pass. Gradient checkpointing performs two forward passes and hence updates the batchnorm statistics twice per step, which might be undesirable behavior. Therefore, one should be aware of such side effects and handle them. As far as I know, LSTM doesn't have any such behavior by default.
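For illustration, here is a minimal PyTorch sketch (not taken from this repository) of such an implicit update: a BatchNorm layer's running-statistics buffers change as a side effect of every forward pass in training mode, so running the forward twice, as gradient checkpointing does, shifts them twice.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)      # freshly constructed modules are in training mode
x = torch.randn(8, 4)

print(bn.running_mean)      # starts at zeros
bn(x)                       # first forward pass updates running_mean / running_var
print(bn.running_mean)
bn(x)                       # a second forward pass (e.g. the recomputation done by
                            # gradient checkpointing) updates the statistics again
print(bn.running_mean)
```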
Thanks a lot for the explanation and reply. I ran the example notebook and it did work! I froze most of my model layers with param.requires_grad = False.
When I ran this line,
I received this error. I wonder why? Many thanks in advance for your feedback.
Sorry, I think it requires more involvement to solve the problem. However, my guess is that you should specify a proper callable for the "get_parameters" argument of maml that accounts for the frozen parameters. By default, it uses nn.Module.parameters, which returns all model parameters including the frozen ones, and hence maml fails when it operates on them. Also, to the best of my knowledge, PyTorch MAML (not only this implementation) doesn't work with nn.LSTM because it utilizes cuDNN operations that don't support a second backward. This is unlikely to be the cause of the problem above, but you may face it at later steps if you use nn.LSTM in your model. FYI, when I applied MAML to language models with LSTM layers, I had to use nn.LSTMCell in a loop.
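For example, a get_parameters callable that skips frozen parameters might look like the following minimal sketch (only the get_parameters argument name is taken from the comment above; the wrapper class in the usage line is hypothetical):

```python
import torch.nn as nn

def get_trainable_parameters(module: nn.Module):
    # Return only the parameters that still require gradients,
    # i.e. exclude everything frozen via param.requires_grad = False.
    return [p for p in module.parameters() if p.requires_grad]

# Hypothetical usage; the actual class/constructor in this repo may differ:
# maml = MAML(model, get_parameters=get_trainable_parameters)
```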
Many thanks for the feedback. Actually, I am learning MAML and I am confused about the "second backward". According to here:
we pass the training set to model0 and get loss1; applying loss1, model0 becomes model1; then we pass the validation set to model1 and obtain loss2.
I thought we just use loss2 to update model0. But I am confused about this statement: what are the exact steps in PyTorch to perform this operation? Many thanks if you could provide some hints.
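For reference, a generic minimal sketch of that procedure in plain PyTorch (not this repository's implementation) looks like this; the key point is that the inner update is computed with create_graph=True, so loss2.backward() differentiates through it, which is the "second backward" mentioned above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(10, 1)                         # stands in for "model0"
meta_opt = torch.optim.SGD(model.parameters(), lr=1e-2)
inner_lr = 0.1

x_train, y_train = torch.randn(16, 10), torch.randn(16, 1)
x_val, y_val = torch.randn(16, 10), torch.randn(16, 1)

# Inner step: loss1 on the training set; keep the gradient graph around
params = list(model.parameters())
loss1 = F.mse_loss(model(x_train), y_train)
grads = torch.autograd.grad(loss1, params, create_graph=True)

# "model1": parameters after the inner update, still differentiable w.r.t. model0
w, b = [p - inner_lr * g for p, g in zip(params, grads)]

# Outer step: loss2 on the validation set, computed with the updated parameters
loss2 = F.mse_loss(F.linear(x_val, w, b), y_val)

# Backpropagating loss2 goes through the inner gradients (the "second backward")
# and produces gradients for the original model0 parameters.
meta_opt.zero_grad()
loss2.backward()
meta_opt.step()
```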
Hi,
I am interested in trying your solution,
but I am not familiar with the concept of "implicit parameter updates".
Could you explain more about what this means and how to handle it?
I am working on a model with an LSTM layer, and I wonder if LSTM has such implicit parameter updates?
Thanks for the advice.