-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More flexible parameter initialization. #786
Conversation
Looks good. Should we document that, to use randomness in |
Yeah, probably. Done. One further thought I had was that we could run these init. functions in something similar to the forward sampling coroutine, so that we can write On the other hand, perhaps we're better sticking with the current approach, where we're effectively reserving the |
I like that even better. My hunch is that, whatever meaning we use for sampling during initialization in the context of meta-inference/optimization, it's likely to add up to plain forward sampling in basic contexts. (But if you want to keep using |
OK, great, I'll make the necessary changes to this PR. |
it bothers me to have dist.sample used anywhere in user code. this seems to break the abstraction barrier that we've tried to keep in place. maybe add it but leave undocumented until we get the abstractions right? or do only the sample(dist) version? |
@ngoodman Couldn't agree more. That's exactly what motivated the idea of doing the work to make |
Allowing the coroutine to be used without collecting return values in a distribution.
This allows `sample(dist)` to be used for random initialization, rather than `dist.sample()`.
This reverts commit 7c4aec0.
I've completed this. |
This extends
param
to take a function specifying how a parameter should be initialized. For example:The motivation for this change is that it makes it much easier to add alternate weight initialization strategies to webppl-nn.
It was difficult to implement this new version of
param
in terms of the oldregister
method as it took a JS function specifying initialization, but here we want to use a webppl function for that. By splittingregister
out intofetch
andcreate
, neither need take a function as an argument, so it's straight forward to work with them from both webppl and JS.I was nervous about the fact that it's possible to call
sample
from this initialization function. (This seems like a pretty odd thing to do, since the structure of the model would depend on the state of the parameter table.) And this could conceivably happen through a mistaken attempt to specify a random initialization for a parameter. To guard against this, we apply the function in a coroutine where callingsample
orfactor
generates an error.This PR also includes a commit that adds some extra run time checks to
param
. Specifically, we now generate an error when the shape of the parameter value we're about to return does not match thedims
argument given toparam
.This can happen when a parameter name is reused and the
dims
arguments don't match up. It could also happen across executions if the program code changes when using a persistent parameter store.The intention is that by checking for this explicitly, we'll get more informative errors than we'd get if we waited until some later part of the program ran into trouble.
It's possible that this will generate an error in a program that would otherwise work, but I think this is a win overall, since it will be useful to know that the dims in the code reflect the actual shape of the parameter.