
Add function for cross-validation of bias parameters #2

Open
connormayer opened this issue Jul 10, 2021 · 5 comments

@connormayer (Owner)

This will involve splitting the data into training and validation sets and comparing parameter values.
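
For concreteness, a minimal sketch of the splitting step (type-based, for simplicity), assuming the data sit in a data frame with one row per input-output pair; the column handling and `train_prop` default are assumptions, not the package's actual implementation:

```r
# Sketch only: a random type-based train/validation split.
# `tableaux` is assumed to have one row per input-output pair.
split_types <- function(tableaux, train_prop = 0.8, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- nrow(tableaux)
  train_idx <- sample(n, size = floor(train_prop * n))
  list(
    train = tableaux[train_idx, , drop = FALSE],
    validation = tableaux[-train_idx, , drop = FALSE]
  )
}
```

Candidate bias values would then be fit on `train` and compared by their fit to `validation`.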

@connormayer (Owner, Author)

Do we sample tokens or types? Tokens may give a better approximation of actual acquisition, though type sampling is more straightforward to implement. Maybe offer an option for both.

@adelrtan (Contributor)

Just a thought: Maybe make this function available to compare different temperature values too!

Or perhaps even different combinations of bias & temperature values!

@adelrtan (Contributor)

More thoughts: Perhaps apply the softmax function (similar to what you did for the AIC/BIC/AIC-C weights -- I thought that was really cool!) to quantify the conditional probability of each candidate hyperparameter value being the best one.
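
A hedged sketch of that idea, analogous to how Akaike weights exponentiate and renormalize relative scores; the inputs here are assumed to be validation log-likelihoods, one per candidate bias value (names and numbers hypothetical):

```r
# Softmax over candidate hyperparameter scores. Subtracting the max
# before exponentiating keeps the computation numerically stable.
softmax_weights <- function(log_scores) {
  shifted <- log_scores - max(log_scores)
  exp(shifted) / sum(exp(shifted))
}

# Hypothetical example: validation log-likelihoods for three bias values
val_ll <- c(mu0 = -152.3, mu1 = -149.8, mu5 = -161.0)
round(softmax_weights(val_ll), 3)
#>   mu0   mu1   mu5
#> 0.076 0.924 0.000
```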

@adelrtan (Contributor)

> Do we sample tokens or types? Tokens may give a better approximation of actual acquisition, though type sampling is more straightforward to implement. Maybe offer an option for both.

Yeah, I think it'll be great to have both options.

Re token sampling: I discovered the utility of the `sample()` function while writing the code for `monte_carlo.R`. It makes random draws according to a probability distribution.
We just need to create a probability distribution over input-output pairs and make random draws based on it.
Then update the frequencies for the train and validation sets, and delete any resulting "empty" tableaux (i.e., tableaux with 0 tokens) from each set. A sketch of this is below.
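
A sketch of that procedure, assuming a data frame with `Input` and `Freq` columns (hypothetical names, not the package's actual implementation). Rather than passing a distribution to `sample()`'s `prob` argument, this version expands the data to one entry per token so the split is exact:

```r
# Token-based train/validation split (sketch only).
token_split <- function(tableaux, train_prop = 0.8, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # One entry per token: row i appears Freq[i] times
  token_rows <- rep(seq_len(nrow(tableaux)), times = tableaux$Freq)
  n_train <- round(train_prop * length(token_rows))
  train_idx <- sample(length(token_rows), size = n_train)

  # Recount frequencies for each set
  recount <- function(rows) tabulate(rows, nbins = nrow(tableaux))
  train <- tableaux
  train$Freq <- recount(token_rows[train_idx])
  validation <- tableaux
  validation$Freq <- recount(token_rows[-train_idx])

  # Drop "empty" tableaux: inputs whose candidates received 0 tokens.
  # Zero-frequency candidates under a surviving input are kept, since
  # they are still (losing) competitors.
  drop_empty <- function(d) d[ave(d$Freq, d$Input, FUN = sum) > 0, , drop = FALSE]
  list(train = drop_empty(train), validation = drop_empty(validation))
}
```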

@connormayer (Owner, Author)

A few thoughts about your comments @adelrtan:

  • Does it make sense to do cross-validation on the temperature parameter? I think the idea with this parameter is that wug tests tend to be less categorical in a way that's (perhaps) independent of the grammar. Fitting the temperature value to non-wug data seems to contradict this. If the user wants to find the temperature value that works best for a wug data set, it's easy enough for them to do that by looping over possible values (see the sketch after this list). We could add a function that does this, but it doesn't seem high priority.
  • The cross-validation I added is for tokens rather than types. I'm not sure whether type-based cross-validation really makes sense, but we should talk about it.
  • Adding the softmax function for cross-validation is a cool idea, but I'm unsure whether it can be interpreted in the same way as AIC/BIC weights.
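
The kind of loop the first bullet has in mind might look like this, where `wug_loglik()` is a hypothetical helper that evaluates the fitted grammar on the wug data at a given temperature and returns its log-likelihood:

```r
# Grid search over candidate temperature values (sketch only).
temps <- seq(0.5, 5, by = 0.5)
lls <- vapply(temps, wug_loglik, numeric(1))
best_temp <- temps[which.max(lls)]
```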

@connormayer self-assigned this Nov 9, 2022