-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add statistical test for regex-guided generation #77
Conversation
… simple regex. It has a frozen seed to avoid flakiness, but there is information in there to make it a proper probabilistic test. It's also possible to reduce the run length by taking fewer samples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually use subdirectories in a way that mirrors package sub-modules, so, to match that, let's go with something like tests/test_[statistics|sampling|...].py
.
Other than that an similarly small things, this tests looks good to add!
… the computation of the exact distributions easier. Tests of the variance and a Kolmogorov-Smirnov test for the distribution has been added. Asserts have been computed for n=250. If we want the test to run faster, we can reduce this number. All tests have been run with n=1000000 to verify the monte carlo error goes to zero. All tests have been run with multiple seeds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we could generalize some of this code and move it outside of the test function. That would make it easier to separate the Markov and non-Markov tests so that they could be run separately/in parallel (e.g. locally you can use pytest-xdist
and fly through the tests without changes to the code).
This isn't necessary, though, so no worries.
Other than that, we can squash and merge after the above two edits.
I agree this can be generalized. Will save that work for when the next version is added though. |
@brandonwillard Foolishly made the changes myself rather than accepting yours, so I can't merge. Sorry. |
I added a simple test that checks that average length of a regex-guided generation on a simple grammar with a simple language model and a fairly simple regex described in #73 is correct. For simplicity, this has a fixed seed, however there is enough information in the issue to turn it into a stochastic test is someone has the stomach for that.
It would be possible to make this test run faster with fewer Monte Carlo samples. I have verified it on much larger sample sizes and it's consistently within MC-standard error of the correct solution. I have also varied parameters in the language model.
I had to kludge together a
generate
function so as to not depend onoutlines
, but I think I'm using the guides properly.