Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Calibration model: Work with numpy data #5159

Merged
merged 2 commits into from
Feb 26, 2021

Conversation

PrimozGodec
Copy link
Contributor

@PrimozGodec PrimozGodec commented Jan 5, 2021

Issue

Calibration models do not support passing NumPy data since their data_to_model_domain does not transform data correctly. It should take into account the base model domain. The error is that the calibration model stores the wrong domain. It first sets the base_learners domain, but it is then overwritten with the wrong (not preprocessed) domain.

This code snipet works with other models but not with callibrations:

data = Table('heart_disease')
base_learner = LogisticRegressionLearner()
model = CalibratedLearner(base_learner)(data)
model(model.data_to_model_domain(data).X)

The correction would support explain the model on calibration learner.

Description of changes

This PR fixes the base learner such that it does not overwrite the domain of models that already set it. It only set the domain for other models (mostly SKL models).

If it looks ugly this error could be also resolved by completely removing the line that set the domain. Then all models would need to pass the domain to the model's constructor. It would be a major change in models and learners.

Includes
  • Code changes
  • Tests
  • Documentation

@PrimozGodec PrimozGodec force-pushed the fix-calibration branch 2 times, most recently from db964ae to dc93080 Compare January 5, 2021 11:22
@codecov
Copy link

codecov bot commented Jan 5, 2021

Codecov Report

Merging #5159 (535547e) into master (2d216ee) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5159      +/-   ##
==========================================
- Coverage   85.21%   85.21%   -0.01%     
==========================================
  Files         300      300              
  Lines       61002    61010       +8     
==========================================
+ Hits        51985    51988       +3     
- Misses       9017     9022       +5     

@PrimozGodec PrimozGodec marked this pull request as ready for review February 19, 2021 12:05
@janezd janezd self-assigned this Feb 26, 2021
@janezd
Copy link
Contributor

janezd commented Feb 26, 2021

I agree it's not nice, but it works.

@janezd janezd merged commit 7a7c635 into biolab:master Feb 26, 2021
@PrimozGodec PrimozGodec deleted the fix-calibration branch December 20, 2021 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants