Considerations about "random state" & controlling randomness #277
smoothdeveloper
started this conversation in
General
Replies: 0 comments
I spent some time with scikit-learn, and the random_state concept in it piqued my interest.
https://scikit-learn.org/stable/common_pitfalls.html#controlling-randomness
In this library, I see that functions needing random number generation rely on a BCL `System.Random` instance. I was wondering whether this design allows the same abilities as scikit-learn (and others in the Python ecosystem). I think the answer is yes and no, but in truth it is no. It is yes for all the components that work as a single, top-level function call. It won't work for components that are instantiated with a "random state" instance and then called repeatedly, because the `System.Random` object is 100% opaque and has no reset function that reinitialises it to its initial seed value.
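To make the limitation concrete, here is a minimal F# sketch. The `drawThree` helper and the `Sampler` type are hypothetical names made up for illustration, not part of this library. It assumes only the standard `System.Random` seeded constructor and `NextDouble`.

```fsharp
open System

// Helper for illustration: draw three values from whatever generator is passed in.
let drawThree (rng: Random) = Array.init 3 (fun _ -> rng.NextDouble())

// Hypothetical component that captures the generator once and is called repeatedly.
type Sampler(rng: Random) =
    member _.Draw() = drawThree rng

// Single top-level calls: a fresh generator with the same seed reproduces the same values.
let a = drawThree (Random(42))
let b = drawThree (Random(42))
printfn "top-level calls agree: %b" (a = b)

// Repeated calls on one component: the stream keeps advancing.
let sampler = Sampler(Random(42))
let first = sampler.Draw()   // same values as `a`
let second = sampler.Draw()  // continues the stream; Sampler has no way to rewind `rng`
printfn "first call matches a: %b" (first = a)
printfn "second call matches first: %b" (second = first)
```

The only way for `Sampler` to replay its first draw is to be handed the seed separately and build a new `Random`, which is exactly the information the opaque instance does not expose.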
I believe it is important to consider how to tackle this. One good bet seems to be modelling it as an interface plus a DU (discriminated union) that implements it for the usual scenarios the library wants to support out of the box.
I'm thinking about an interface because it should be 100% pluggable, and F# object expressions make it easy to create an ad-hoc instance.
It would also keep the library's implementation from being coupled too tightly to a DU that may evolve; from the point of view of the code drawing the random numbers, the DU would be just one implementation detail.
We could get extra safety by having two interfaces: one that "knows how to reset to the seed" (which would be a no-op for RNGs that don't support it...) and another that is just for the number sampling.
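A rough sketch of what the idea above could look like in F#. Every name here (`IRandomSource`, `IResettable`, `RandomState`, `OfSeed`, `constantSource`) is hypothetical and only meant to illustrate the shape; none of it is an existing API of this library, and the member set is an assumption.

```fsharp
open System

/// Hypothetical sampling-only interface: all that drawing code needs to see.
type IRandomSource =
    abstract NextDouble: unit -> float
    abstract NextInt: maxExclusive: int -> int

/// Hypothetical interface for sources that know how to rewind to their seed.
type IResettable =
    abstract Reset: unit -> unit

/// Hypothetical DU covering the scenarios supported out of the box.
type RandomState =
    /// Owns its generator and remembers the seed, so it can be rewound.
    | Seeded of seed: int * current: Random ref
    /// Wraps a caller-supplied generator; its seed is unknown, so Reset is a no-op.
    | Wrapped of Random

    static member OfSeed(seed: int) = Seeded(seed, ref (Random(seed)))

    member private this.Rng =
        match this with
        | Seeded(_, current) -> current.Value
        | Wrapped rng -> rng

    interface IRandomSource with
        member this.NextDouble() = this.Rng.NextDouble()
        member this.NextInt(maxExclusive) = this.Rng.Next(maxExclusive)

    interface IResettable with
        member this.Reset() =
            match this with
            | Seeded(seed, current) -> current.Value <- Random(seed)
            | Wrapped _ -> ()

// Drawing code depends only on the sampling interface.
let drawThree (source: IRandomSource) = Array.init 3 (fun _ -> source.NextDouble())

// Ad-hoc instance via an object expression, e.g. a deterministic stub for tests.
let constantSource =
    { new IRandomSource with
        member _.NextDouble() = 0.5
        member _.NextInt(maxExclusive) = 0 }

let state = RandomState.OfSeed 42
let firstDraw = drawThree (state :> IRandomSource)
(state :> IResettable).Reset()
let replay = drawThree (state :> IRandomSource)  // same values as firstDraw
let stubDraw = drawThree constantSource          // [|0.5; 0.5; 0.5|]
```

Keeping the seed next to a replaceable generator is one way to get a reset; another would be to store only the seed and a draw counter. Either choice stays hidden behind the two interfaces, so the DU can evolve without touching the drawing code.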
Overall, having an abstraction rather than relying directly on the BCL `System.Random` seems advisable for this library. I think the best people to draw ideas from would be those who have used statistical packages extensively (not me...).
It may also be worth investigating https://github.com/search?q=repo%3Ahaskell%2Fstatistics%20random&type=code