Random number weirdness #12

Open
jmschrei opened this issue May 7, 2014 · 5 comments

Comments

@jmschrei
Owner

jmschrei commented May 7, 2014

I've been having this "bug" for a little bit, wanted to see if anyone else knew about it.

When I write test code, I call random.seed(0) and then randomly generate a sequence to test, assuming the sequence will be the same on every run because the seed is fixed.

Occasionally, the first time I run a program I get sequence A, and on every later run I get sequence B, just by rerunning the code. All yahmm operations function appropriately; it's just that the random sequence is different. If I modify yahmm.pyx in any way (even to add comments), I get sequence A again on the next run, then sequence B on every run after that.

Any thoughts?
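
A minimal sketch of the test pattern described above, assuming a hypothetical helper `make_test_sequence` and an illustrative ACGT alphabet (neither comes from yahmm). In a single pure-Python process, reseeding the global generator reproduces the same draw; the report here is that this expectation failed across runs once the compiled module was involved.

```python
import random


def make_test_sequence(length, alphabet="ACGT", seed=0):
    """Seed the module-level RNG, then draw `length` symbols from it.

    Hypothetical helper for illustration; not part of yahmm.
    """
    random.seed(seed)
    return "".join(random.choice(alphabet) for _ in range(length))


# Two calls with the same seed should yield the same sequence.
first = make_test_sequence(20)
second = make_test_sequence(20)
assert first == second
```

Writing the generated sequence to the test log on each run would make it easy to tell whether a given run produced sequence A or sequence B.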

@nipunbatra
Contributor

Would you want to try setting the seed using numpy and see if you get the same behavior?

@jmschrei
Owner Author

jmschrei commented May 8, 2014

I do set the seed using random. If there were an issue where the seed was changed every iteration, I wouldn't get B constantly after the first trial.

@adamnovak
Collaborator

OK, I've been looking at this issue today. I was trying to add support for running the proposed nose tests with python setup.py test as well as through nose directly with nosetests. Depending on which way I ran the tests, I would get different results for the things that depend on random numbers. The two approaches were building the Cython module slightly differently, and producing slightly different .so files, but the differences in the C code were all in obscure macro arguments and didn't look to have much to do with randomness.

My conclusion is that the global state of the random module is the problem, and that it somehow manages to not be properly shared between Python and Cython, maybe in a way that depends on import order. I put a seed call in the actual Cython model sample function, and that alleviated the first-run-after-deleting-the-built-library-vs-other-runs problem for at least one of the test execution methods. But the different methods still gave different results.

I think if we want this to work properly, we need to move away from the Python random module. It might be best to use something that doesn't use global state for the RNG, for that matter.
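
One way to avoid global RNG state while staying in the standard library is to pass a private `random.Random` instance into the sampling code. The sketch below is only an illustration of that idea; `sample_path` and the state names are hypothetical and are not yahmm's API.

```python
import random


def sample_path(states, length, rng):
    """Draw `length` states using only the generator passed in."""
    return [rng.choice(states) for _ in range(length)]


states = ["start", "match", "insert", "delete"]

rng_a = random.Random(0)  # private generator with its own state
path_a = sample_path(states, 10, rng_a)

# Disturb the module-level generator; the private one is unaffected.
random.seed(12345)
random.random()

rng_b = random.Random(0)
path_b = sample_path(states, 10, rng_b)

assert path_a == path_b
```

Because the generator is an explicit argument, the result no longer depends on which module seeded the global state or in what order modules were imported.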

We could also try making sure that all the functions called in the course of sampling are pure Python, for which we'd probably have to move them outside the .pyx file. This would probably make sampling super slow.

@tlnagy
Contributor

tlnagy commented Jul 30, 2014

Would it be possible to stick to rand from stdlib for all of yahmm's random number usage and just add a convenience function to seed this from python (using srand)?

@adamnovak
Collaborator

It would be possible, but we'd have to re-work some of the distribution
implementations. We rely on the Python random library's implementations for
sampling from standard things like normal distributions.
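
For comparison, a `random.Random` instance exposes the same `normalvariate` method as the module-level function, so a distribution could keep Python's sampling code while owning its generator. This is a sketch under that assumption; `NormalSampler` and `draw` are hypothetical names, not yahmm classes.

```python
import random


class NormalSampler:
    """Hypothetical stand-in for a distribution object; not yahmm's class."""

    def __init__(self, mean, std, seed=None):
        self.mean = mean
        self.std = std
        # Each sampler owns its generator instead of using the global one.
        self.rng = random.Random(seed)

    def sample(self):
        return self.rng.normalvariate(self.mean, self.std)


def draw(n, seed):
    """Draw n values from a freshly seeded sampler."""
    sampler = NormalSampler(5.0, 2.0, seed=seed)
    return [sampler.sample() for _ in range(n)]


# Same seed, same draws, regardless of the module-level RNG state.
assert draw(5, 0) == draw(5, 0)
```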


