reducing complexity in outcome type declaration #13

Closed
simonpcouch opened this issue Apr 30, 2024 · 0 comments · Fixed by #14

simonpcouch commented Apr 30, 2024

Related to #11 and #12.

Currently in the package, there are several arguments/concepts referring to different interpretations of outcome "type":

  • container(mode): "regression" or "classification" (currently)
  • container(type): "regression", "binary", or "multiclass"
  • adjust_*_calibration(type) (introduced in #12, "fit calibrators at fit.container()"): "linear", "logistic", "multinomial", "beta", "isotonic", or "isotonic_boot"; the last two could apply to any container(type)
  • The * in the function name above, i.e. "numeric" or "probability"
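
To make the bookkeeping concrete, here is a rough base-R sketch of the agreement check these four concepts imply. Everything in it is hypothetical: `mode_for_type`, `prefix_for_mode`, `types_for_calibration`, and `check_agreement()` are illustrative names, not objects from the package, and the pairing of "linear", "logistic", "multinomial", and "beta" with specific outcome types is an assumption made only for the example.

```r
# Hypothetical lookup tables for illustration; not objects from the package.
mode_for_type <- c(
  regression = "regression",
  binary = "classification",
  multiclass = "classification"
)
prefix_for_mode <- c(regression = "numeric", classification = "probability")

# ASSUMPTION: the pairing of the first four calibration types with outcome
# types is a guess for illustration. The issue only states that the two
# isotonic variants could apply to any container(type).
all_types <- names(mode_for_type)
types_for_calibration <- list(
  linear = "regression",
  logistic = "binary",
  multinomial = "multiclass",
  beta = "binary",
  isotonic = all_types,
  isotonic_boot = all_types
)

# Return a character vector describing each disagreement (empty if none).
check_agreement <- function(mode, type, prefix, calibration) {
  problems <- character()
  if (!identical(unname(mode_for_type[type]), mode)) {
    problems <- c(problems, "mode disagrees with type")
  }
  if (!identical(unname(prefix_for_mode[mode]), prefix)) {
    problems <- c(problems, "function prefix disagrees with mode")
  }
  if (!type %in% types_for_calibration[[calibration]]) {
    problems <- c(problems, "calibration type does not support outcome type")
  }
  problems
}

# All four pieces agree, so no problems are reported.
check_agreement("classification", "binary", "probability", "logistic")
```

Three separate comparisons are needed for four user-facing values, even though, under these assumptions, `type` plus the calibration type already determine the other two.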

Further, many of these can be set at either object creation or at fit() time.

This makes checking quite gnarly and might be more complex than it needs to be. Some questions related to how we might pare back this complexity:

  • Can we wait to know/check anything about types until fit() time? As far as I can tell, before fit() time we don't do anything with those four pieces of information besides catalog them and check that they agree, anyway.
  • Can we remove container(mode) altogether? It's in one-to-one correspondence with the * above ("regression" with "numeric", "classification" with "probability") and can otherwise be inferred from container(type).
  • Should adjust_numeric_calibration() and adjust_probability_calibration() be one function? Whether we're working with numerics or probabilities can be inferred from data or container(type), and this gives us one less thing to check, one less function in the namespace, and one less concept of type for users to wrap their heads around.
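
As a rough sketch of what "inferred from data or container(type)" could look like, the base-R helpers below derive the type from an outcome column and the mode from the type. Both `infer_type()` and `infer_mode()` are hypothetical names, and the rule that a two-level factor means "binary" is an assumption for illustration rather than the package's behavior.

```r
# Hypothetical helper: guess the outcome type from the outcome column.
# ASSUMPTION: numeric -> "regression", 2-level factor -> "binary",
# factor with more than 2 levels -> "multiclass".
infer_type <- function(outcome) {
  if (is.numeric(outcome)) {
    return("regression")
  }
  if (is.factor(outcome)) {
    n_levels <- nlevels(outcome)
    if (n_levels == 2) return("binary")
    if (n_levels > 2) return("multiclass")
  }
  stop("Can't infer an outcome type from this column.", call. = FALSE)
}

# Hypothetical helper: the mode is fully determined by the type,
# so it never needs to be supplied separately.
infer_mode <- function(type) {
  switch(
    type,
    regression = "regression",
    binary = ,
    multiclass = "classification",
    stop("Unknown type: ", type, call. = FALSE)
  )
}

infer_mode(infer_type(factor(c("yes", "no", "yes"))))
```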

So:

  • container(mode) is removed
  • container(type) is kept, perhaps renamed to submode or otherwise
    • Not required (perhaps can't be supplied) until fit() time
  • adjust_*_calibration(type) is kept, perhaps renamed to method or otherwise
  • The * above is removed, i.e. we refactor both adjust_*_calibration() functions into one adjust_calibration()
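
A minimal sketch of how the proposal might hang together, using a plain list as a toy stand-in for the container. Only the names `adjust_calibration()` and `method` come from the proposal above; `new_toy_container()`, `fit_toy_container()`, and the table of supported methods per type are assumptions made up for this example, not the package's implementation.

```r
# Toy stand-in for a container: a plain list, not the package's class.
new_toy_container <- function() {
  list(adjustments = list(), type = NULL)
}

# One calibration verb. It only records the requested method;
# nothing about the outcome type is known or checked here.
adjust_calibration <- function(x, method = "isotonic") {
  x$adjustments <- c(x$adjustments, list(list(method = method)))
  x
}

# ASSUMPTION: which methods suit which type is illustrative only.
supported_methods <- list(
  regression = c("linear", "isotonic", "isotonic_boot"),
  binary = c("logistic", "beta", "isotonic", "isotonic_boot"),
  multiclass = c("multinomial", "isotonic", "isotonic_boot")
)

# All type inference and checking happens once, at fit time.
fit_toy_container <- function(x, outcome) {
  if (is.numeric(outcome)) {
    x$type <- "regression"
  } else if (is.factor(outcome) && nlevels(outcome) == 2) {
    x$type <- "binary"
  } else if (is.factor(outcome) && nlevels(outcome) > 2) {
    x$type <- "multiclass"
  } else {
    stop("Can't infer an outcome type from this column.", call. = FALSE)
  }
  for (adj in x$adjustments) {
    if (!adj$method %in% supported_methods[[x$type]]) {
      stop(
        "Calibration method '", adj$method, "' does not support type '",
        x$type, "'.",
        call. = FALSE
      )
    }
  }
  x
}

x <- adjust_calibration(new_toy_container(), method = "logistic")
fitted <- fit_toy_container(x, factor(c("a", "b", "a")))
fitted$type
```

With this shape there is a single check, method against inferred type, in place of pairwise checks among mode, type, function prefix, and calibration type.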