Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Registry Kwargs Updating Issue #32

Open
sgoldenCS opened this issue Apr 15, 2024 · 0 comments
Open

Registry Kwargs Updating Issue #32

sgoldenCS opened this issue Apr 15, 2024 · 0 comments

Comments

@sgoldenCS
Copy link
Contributor

One of the benefits of using a registry is the ability to register specific versions of modules by providing keyword arguments (kwargs) directly in the __init__.py register() call. As a simple example, we can imagine a MinMaxScaler class will need to know things like axis or feature_range. Since we frequently want to scale between -1 and 1, we may want to register a version of this scaler as id=minus_one_one_scaler where feature_range = [-1,1].

What happens when a user calls make('minus_one_one_scaler', **kwargs)? Well, it depends on how the arguments are passed to the MinMaxScaler. Let's assume that the __init__() function is defined as:

def __init__(self, config=None):
    config = {} if config is None
    self.config = self.get_default_config()
    self.config.update(config)
    ...

The way our registry code is written, we will register the new scaler as:
register(id='minus_one_one_scaler', entry_point='<path_to_MinMaxScaler>', kwargs={'config': {'feature_range'}})

This works well if we are ok with the default behavior for all other configuration parameters, but what happens if I call make('minus_one_one_scaler', config={'axis': None}?

Currently, this will return an object where config = {'axis': None}. Therefore, our minus_one_one_scaler will scale by the default feature_range, which may not be [-1,1]! This is caused by the line _kwargs=_kwargs.update(kwargs) in registration.py which will update the full config dictionary.

There are many possible ways to change this behavior:

  1. Add a registry_config argument to the module __init__() to be used by the registry. Then the module itself can handle any logic to update its config parameters. This is a nice, simple solution, but requires every module we write to include this logic in the __init__().
  2. Add a check in the registry code for a config argument in kwargs during the make() call, and update that dictionary separately. This has the benefit of being transparent to the module (so no extra module code is needed), but means we would have to agree on the key name (or a list of them) that we want to be handled in this special way.
  1. Implement the module __init__() by splitting up config into its associated keys. i.e. def __init__(self, axis=0, feature_range=None, ...). Since these keys are now individual kwargs, they will get appropriately updated by the registry. This has the benefit of no changes to the registry. However, this would mean you couldn't call make('minmax_scaler', config=scaler_config). Instead, your code would either need to be something like make('minmax_scaler', feature_range=[0,1], axis=None) or make('minmax_scaler', **scaler_config). The kwargs passed to register() would also need to be separated.

There are probably other ways to change this as well, but these were the top 3 ideas I could think of after discussing with @dlersch, @Kishanrajput, and @ahmedmohammed107. I am starting to lean towards using idea 3, as it feels somewhat natural to me, but I'm open to anything.

One thing to note: None of these ideas will recursively update the configuration, so any configuration arguments that are dictionaries will still need to be passed in their entirety.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant