de-randomize benchmarks #2
Comments
Fixed with a0cce4f.
I am reopening this issue: right now, evaluating the same configuration twice in a row returns different performance values, because the RNG is instantiated once in init() and its state advances with every call. I propose also adding an rng argument to the objective function, so that a benchmark can be made deterministic when necessary, and defaulting it to None to fall back to the class-level RNG. @aaronkl: Any thoughts on that, since you fixed the previous issue?
When building cross-validation benchmarks, where we want to evaluate one fold at a time, the current workflow evaluates a different subset of data points each time the objective function is called (see the explanation above). I propose moving the seed from init() to objective_function()/objective_function_test() so that every function evaluation gets its own seed. @aaronkl @mfeurer Would you mind if we change that? Or do you have a simpler solution?
@aaronkl would be okay with that as long as it remains possible not to specify a seed/rng. In that case, the rng created in init() will be used.
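To make the proposal concrete, here is a minimal sketch of a per-call seed that falls back to the class-level RNG. It is not the HPOlib2 implementation: `ToyCVBenchmark`, the `fold`/`n_folds`/`seed` parameters, and the use of the standard library's `random.Random` as a stand-in for whatever generator the benchmarks actually use are all assumptions made for illustration. The function returns the validation indices of a fold instead of a performance value so that the effect of the seed is visible.

```python
import random


class ToyCVBenchmark:
    """Hypothetical benchmark used only to illustrate the proposal."""

    def __init__(self, seed=None):
        # Class-level RNG; with seed=None it is seeded from system entropy.
        self.rng = random.Random(seed)

    def objective_function(self, config, fold=0, n_folds=5, seed=None):
        # A per-call seed gives a fresh, deterministic generator.
        # Without one, fall back to the RNG created in __init__.
        call_rng = self.rng if seed is None else random.Random(seed)

        # Shuffle the (toy) dataset indices and pick the requested fold.
        indices = list(range(20))
        call_rng.shuffle(indices)
        validation_indices = sorted(indices[fold::n_folds])
        return validation_indices


benchmark = ToyCVBenchmark()
first = benchmark.objective_function({}, fold=1, seed=3)
second = benchmark.objective_function({}, fold=1, seed=3)
# Both calls see the same split because they share the per-call seed.
```

Calls without `seed` keep the current behaviour: they draw from the class-level RNG, so consecutive evaluations generally see different splits.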
Every benchmark should accept an rng (whether or not it uses it). Currently, no benchmark accepts a seed or rng, so each one shuffles data and creates models non-reproducibly:
Examples
https://github.com/automl/HPOlib2/blob/master/hpolib/benchmarks/ml/svm_benchmark.py#L41
https://github.com/automl/HPOlib2/blob/master/hpolib/benchmarks/ml/fully_connected_network.py#L103
The default should be None, which means that the rng is instantiated with a random seed.
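One way the requested constructor behaviour could look is sketched below. The class name `ToyBenchmark` and the method `shuffled_indices` are hypothetical, and the standard library's `random.Random` stands in for the generator type the real benchmarks use; only the "accept an rng, default to None" contract comes from this issue.

```python
import random


class ToyBenchmark:
    """Hypothetical benchmark; not an HPOlib2 class."""

    def __init__(self, rng=None):
        if rng is None:
            # Default: a fresh generator seeded from system entropy.
            self.rng = random.Random()
        elif isinstance(rng, int):
            # An integer is treated as a seed.
            self.rng = random.Random(rng)
        elif isinstance(rng, random.Random):
            # An existing generator is used as-is.
            self.rng = rng
        else:
            raise TypeError("rng must be None, an int, or a random.Random")

    def shuffled_indices(self, n):
        # All randomness goes through self.rng, never the global generator.
        indices = list(range(n))
        self.rng.shuffle(indices)
        return indices


# Two benchmarks built with the same seed shuffle identically.
a = ToyBenchmark(rng=1).shuffled_indices(10)
b = ToyBenchmark(rng=1).shuffled_indices(10)
```

Routing every random operation through `self.rng` is what makes the seed effective; a single call to the global generator would reintroduce the non-determinism.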