Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More flexible parameter initialization. #786

Merged
merged 9 commits into from
Mar 3, 2017

Conversation

null-a
Copy link
Member

@null-a null-a commented Feb 24, 2017

This extends param to take a function specifying how a parameter should be initialized. For example:

param({dims: [2, 1], init: function(dims) { return ones(dims); }})
// => Vector([1,1])

The motivation for this change is that it makes it much easier to add alternate weight initialization strategies to webppl-nn.

It was difficult to implement this new version of param in terms of the old register method as it took a JS function specifying initialization, but here we want to use a webppl function for that. By splitting register out into fetch and create, neither need take a function as an argument, so it's straight forward to work with them from both webppl and JS.

I was nervous about the fact that it's possible to call sample from this initialization function. (This seems like a pretty odd thing to do, since the structure of the model would depend on the state of the parameter table.) And this could conceivably happen through a mistaken attempt to specify a random initialization for a parameter. To guard against this, we apply the function in a coroutine where calling sample or factor generates an error.


This PR also includes a commit that adds some extra run time checks to param. Specifically, we now generate an error when the shape of the parameter value we're about to return does not match the dims argument given to param.

This can happen when a parameter name is reused and the dims arguments don't match up. It could also happen across executions if the program code changes when using a persistent parameter store.

The intention is that by checking for this explicitly, we'll get more informative errors than we'd get if we waited until some later part of the program ran into trouble.

It's possible that this will generate an error in a program that would otherwise work, but I think this is a win overall, since it will be useful to know that the dims in the code reflect the actual shape of the parameter.

@null-a null-a requested a review from stuhlmueller February 24, 2017 14:29
@stuhlmueller
Copy link
Member

Looks good.

Should we document that, to use randomness in init, people need to use dist.sample(), not sample(dist)? I imagine that many custom parameter initialization strategies will need randomness.

@null-a
Copy link
Member Author

null-a commented Feb 27, 2017

Should we document that, to use randomness in init, people need to use dist.sample(), not sample(dist)?

Yeah, probably. Done.

One further thought I had was that we could run these init. functions in something similar to the forward sampling coroutine, so that we can write sample(dist) and still not have a random choice added to the model. Not having to mention dist.sample() seems like a win. (Could cause confusion.)

On the other hand, perhaps we're better sticking with the current approach, where we're effectively reserving the sample(dist) notation until we decide what we want it to mean (#788), which avoids switching its meaning on people in the future?

@stuhlmueller
Copy link
Member

stuhlmueller commented Feb 27, 2017

One further thought I had was that we could run these init. functions in something similar to the forward sampling coroutine, so that we can write sample(dist) and still not have a random choice added to the model.

I like that even better. My hunch is that, whatever meaning we use for sampling during initialization in the context of meta-inference/optimization, it's likely to add up to plain forward sampling in basic contexts. (But if you want to keep using dist.sample(), that's fine with me, too.)

@null-a
Copy link
Member Author

null-a commented Feb 27, 2017

I like that even better.

OK, great, I'll make the necessary changes to this PR.

@ngoodman
Copy link
Contributor

ngoodman commented Feb 28, 2017

it bothers me to have dist.sample used anywhere in user code. this seems to break the abstraction barrier that we've tried to keep in place. maybe add it but leave undocumented until we get the abstractions right? or do only the sample(dist) version?

@null-a
Copy link
Member Author

null-a commented Feb 28, 2017

this seems to break the abstraction barrier that we've tried to keep in place

@ngoodman Couldn't agree more. That's exactly what motivated the idea of doing the work to make sample(dist) work here. That change will be made before this is merged.

null-a added 4 commits March 3, 2017 13:37
Allowing the coroutine to be used without collecting return values in a
distribution.
This allows `sample(dist)` to be used for random initialization, rather
than `dist.sample()`.
@null-a
Copy link
Member Author

null-a commented Mar 3, 2017

I like that even better.

OK, great, I'll make the necessary changes to this PR.

I've completed this.

@stuhlmueller stuhlmueller merged commit 3bd0d8e into probmods:dev Mar 3, 2017
@null-a null-a deleted the param-init branch March 3, 2017 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants