-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Swaping the irace instances with python objects #29
Comments
I don't understand why this needs to be complicated in this way. Instances are strings (or optionally numbers) and those strings/numbers could represent the names of files or parameters for some generator or keys/index to some dictionary of objects... If the user wants to make them json strings, they can do it themselves. If the user passes as instance something that is not a string nor a number, I would suggest to give a clear error. |
It's easier and more convenient if they can pass plain old / not serializable python objects as instances. For example, in the irace tuning experiment, I am passing the |
But the model is not an instance: it doesn't need to be randomized, nor selected by irace, it always stays the same, so there is no point in passing it through irace as an instance. It should be part of the context available to the target runner, either as a global variable, via a closure, as part of an object or a functor (https://www.stevenengelhardt.com/2013/01/16/python-multiprocessing-module-and-closures/). Perhaps even easier (at least in R) is to use Allowing arbitrary objects as instances means that irace would need to be extra careful with copying or duplicating them (issues with shallow copies vs deep copies), before printing the instances, saving them to disk, etc. Lots of things can go wrong there! Thus, even if there were multiple models (or multiple dataframes or multiple 4D matrices or multiple networkx graphs) and each model was an "instance", it would be better to have the list of models stored as context or |
But different instances of the meta irace use different models (algorithms). Also the training / test instance set split is also part of the instance. I could just pass in a string key which I then lookup with a dictionary in a global varaible but that's more complicated to write.
I think this is similar to that I am doing now. But instead of passing range of integers it just passes the json serialized string https://github.com/DE0CH/iracepy/blob/ffff0b1fea077e997940d8808d1f8bfce0c075f9/src/irace/__init__.py#L158. This makes the log more human readable. |
I guess we have different ideas of what complicated is, because using json to serialize/unserialize python objects into strings seems more complicated to me than using a simple dictionary which may be 2-3 lines of code without extra dependencies. Nevertheless, it seems to me you can still do it the way you want it without having to do it within irace. See below.
If you really want to do it in that way, why not do it before creating the irace object so that the instances you give to irace are already strings? This way iracepy doesn't need to know all these details. and for users whose instances are simple strings or numbers, they don't need json nor serialization and they don't pay the cost of this complexity. You may even add an example or a test of json serialization outside irace so it can be useful to others if they want to do that. |
I think there might be a misunderstanidng. My idea is like the following (python mixed with English pseudocode) def make_target_runner(context):
def target_runner(experiment, configuraiton):
...
if the user provided instances is not a list of strings or numbers:
experiment['instance'] = context['instances'][int(experiment['instance.id']) - 1]
...
return target_runner
...
class Irace:
def __init__(self, scenario, parameters_table, target_runner):
...
if 'instances' in scenario:
if instances is not a list of strings or numbers:
self.scenario['instances'], context['instances'] = np.arrange(len(scenario['instance'])), scenario['instance']
... Essentially, if the user gives us custom objects, we just don't pass it to irace but we just keep track of it in iracepy. If we don't do this, the user has to do this themselves whcih is not as user friendly. On top of this, we can make the log more human readable by using |
Since the number of things that can be converted into r objects and back is limited, I think it would be better if we just get the original python object from the other end using
id.instance
. As for the instances given to irace, I see three options:json.dumps(obj, skipkeys=True, default=lambda x: '<not serializable>')
, which would skip all the non-serializable objects.I would prefer the second option because it's the easiest to implement and human-readable in a log. For any missing data, we'll just ask the user to find the data somewhere else. For the third option, it seems like a lot of effort for not many benefits.
The text was updated successfully, but these errors were encountered: