This discussion refers to what was discussed at the biweekly ML Coordination call, the recording of which can be found here: https://www.youtube.com/watch?v=1KbAjt0v0rQ.
While the discussion is fresh in mind, I'll write down some thoughts on a possible user-facing API to make use of parameter groups.
The basic idea is to have a ParamGroups that is a mapping from layer to group: it divides parameters into groups of type T (e.g. numbers (1, 2, 3) or strings). It should be possible to create arbitrary mappings, but simple constructors for simple mappings should exist. For example, if model is a Chain, then it should be possible to group the parameters by index ranges:
model = Chain(layer1, layer2, layer3)
pg = ParamGroups(model, [1:2, 3])
Then if param is a parameter of layer1 or layer2, getgroup(pg, param) == 1.
This ParamGroups type could then be used for different use cases. For discriminative learning rates (where the learning rate is discounted by a factor depending on which group the parameter belongs to), we could then have a composable optimizer that takes ParamGroups (a mapping param -> group) as well as a mapping group -> lrfactor. This could be used with the current sequential Optimiser.
Doing so shows how the parameter group mappings can be separated from the optimizer implementation; used this way, they work with the sequencing Optimiser API, but they could also be used with a wrapper optimiser or wrapper model.
I've created a basic implementation of parameter groups and a discriminative learning rates optimizer here: paramgroups.ipynb.
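To make the param -> group mapping concrete, here is a minimal sketch in plain Julia. It is not the notebook's implementation and does not use Flux: ToyLayer and layerparams are hypothetical stand-ins for Flux layers and their parameters, and the ParamGroups struct, its range-based constructor, and getgroup are just one assumed way to realise the API described above. The sketch also assumes that the group id is the position of the entry in the ranges list.

```julia
# Hypothetical sketch: none of these definitions come from Flux.

# Stand-in for a layer: anything that owns some parameter arrays.
struct ToyLayer
    W::Matrix{Float64}
    b::Vector{Float64}
end
layerparams(l::ToyLayer) = (l.W, l.b)

# Mapping from parameter (by object identity) to a group of type T.
struct ParamGroups{T}
    lookup::IdDict{Any,T}
end

# Simple constructor: the i-th entry of `ranges` lists the layer indices
# that belong to group i (an assumption about how groups are numbered).
function ParamGroups(layers, ranges)
    lookup = IdDict{Any,Int}()
    for (group, idxs) in enumerate(ranges), i in idxs, p in layerparams(layers[i])
        lookup[p] = group
    end
    return ParamGroups{Int}(lookup)
end

# Returns `nothing` for parameters that were never assigned to a group.
getgroup(pg::ParamGroups, param) = get(pg.lookup, param, nothing)

layer1 = ToyLayer(rand(2, 2), rand(2))
layer2 = ToyLayer(rand(2, 2), rand(2))
layer3 = ToyLayer(rand(2, 2), rand(2))

# A tuple of layers stands in for Chain(layer1, layer2, layer3).
pg = ParamGroups((layer1, layer2, layer3), [1:2, 3])

println(getgroup(pg, layer1.W))  # 1
println(getgroup(pg, layer3.b))  # 2
```

Keying on identity through IdDict means two parameter arrays with equal values still resolve to their own groups, which seems like the behaviour one would want for weights.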
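The discriminative learning rate idea can be sketched in the same spirit. The version below is a standalone assumption, not Flux code: apply!, PlainDescent, and Sequence are local stand-ins that only imitate the shape of a sequential optimiser (each stage receives the parameter and its gradient and returns a transformed gradient), DiscriminativeLR is a hypothetical name, and a plain IdDict plays the role of the param -> group mapping.

```julia
# Hypothetical sketch: defined locally so that it runs without Flux.

# Scales gradients by a per-group factor.
struct DiscriminativeLR{T}
    groupof::IdDict{Any,T}     # param -> group
    factors::Dict{T,Float64}   # group -> lrfactor
end

# Multiply the gradient in place by the factor of the parameter's group.
# Parameters without a group are passed through unchanged.
function apply!(o::DiscriminativeLR, x, Δ)
    haskey(o.groupof, x) || return Δ
    Δ .*= o.factors[o.groupof[x]]
    return Δ
end

# Stand-in for a plain gradient descent stage with a global learning rate.
struct PlainDescent
    eta::Float64
end
apply!(o::PlainDescent, x, Δ) = (Δ .*= o.eta)

# Stand-in for a sequential optimiser: pipes the gradient through each stage.
struct Sequence
    os::Vector{Any}
end
function apply!(o::Sequence, x, Δ)
    for opt in o.os
        Δ = apply!(opt, x, Δ)
    end
    return Δ
end

w_early = [1.0, 1.0]   # pretend this belongs to an early layer (group 1)
w_late = [1.0, 1.0]    # pretend this belongs to the final layer (group 2)

groupof = IdDict{Any,Int}(w_early => 1, w_late => 2)
factors = Dict(1 => 0.1, 2 => 1.0)

opt = Sequence(Any[DiscriminativeLR(groupof, factors), PlainDescent(0.5)])

# Same raw gradient, different effective step size per group.
step_early = apply!(opt, w_early, [2.0, 2.0])
step_late = apply!(opt, w_late, [2.0, 2.0])
println(step_early)
println(step_late)
```

The point of the design is that DiscriminativeLR knows nothing about how groups were built; it only consumes the two mappings, so the same grouping could be reused by a wrapper optimiser or wrapper model.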