feat: add support to sklearn TargetEncoder #1137

Open

boccaff wants to merge 6 commits into onnx:main from boccaff:feat-target_encoder_support

boccaff commented Nov 7, 2024

This PR implements a converter and a shape calculator for the TargetEncoder class introduced in Scikit-learn 1.5. The code largely follows the implementation of the converter for OrdinalEncoder.

A partial suite of tests is already implemented, but there are at least a couple of additional tests that I would like to add (missing values, and using the smooth parameter from sklearn, even though I think it shouldn't matter).

xadupre reviewed

skl2onnx/operator_converters/target_encoder.py (outdated)

Collaborator

xadupre commented Nov 14, 2024

Thanks for the contribution. One line should be removed. Everything else looks good.

github-advanced-security bot found potential problems

skl2onnx/shape_calculators/target_encoder.py (fixed)

tests/test_sklearn_target_encoder_converter.py (resolved)

tests/test_sklearn_target_encoder_converter.py

+                          [("input", StringTensorType([None, X.shape[1]]))],
+                          target_opset=TARGET_OPSET,
+                      )
+                      self.assertTrue(model_onnx is not None)

Check notice

Code scanning / CodeQL

Imprecise assert Note test

assertTrue(a is not b) cannot provide an informative message. Using assertIsNot(a, b) instead will give more informative messages.
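To illustrate the difference the notice describes, here is a minimal sketch using only the standard unittest module. The class, the method names, and the `model_onnx = None` stand-in are hypothetical and not taken from the PR; the methods deliberately avoid the `test_` prefix so that a test runner does not collect these intentionally failing checks.

```python
import unittest


class AssertStyleDemo(unittest.TestCase):
    """Two deliberately failing checks, to compare their failure messages."""

    def check_with_assert_true(self):
        model_onnx = None  # hypothetical stand-in for a conversion that returned nothing
        self.assertTrue(model_onnx is not None)

    def check_with_assert_is_not(self):
        model_onnx = None  # same stand-in as above
        self.assertIsNot(model_onnx, None)


def failure_message(method_name):
    """Run a single check and return the last line of its failure traceback."""
    result = unittest.TestResult()
    AssertStyleDemo(method_name).run(result)
    # result.failures holds (test, traceback_text) pairs
    traceback_text = result.failures[0][1]
    return traceback_text.strip().splitlines()[-1]


if __name__ == "__main__":
    print(failure_message("check_with_assert_true"))
    print(failure_message("check_with_assert_is_not"))
```

With assertTrue the failure only reports that the boolean expression was false, while assertIsNot names the offending object. When comparing against None specifically, assertIsNotNone is a closely related alternative.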

tests/test_sklearn_target_encoder_converter.py

+                          target_opset=TARGET_OPSET,
+                      )
+                      self.assertTrue(model_onnx is not None)
+                      self.assertTrue(model_onnx.graph.node is not None)

Check notice

Code scanning / CodeQL

Imprecise assert Note test

assertTrue(a is not b) cannot provide an informative message. Using assertIsNot(a, b) instead will give more informative messages.

tests/test_sklearn_target_encoder_converter.py

+                          [("input", Int64TensorType([None, X.shape[1]]))],
+                          target_opset=TARGET_OPSET,
+                      )
+                      self.assertTrue(model_onnx is not None)

Check notice

Code scanning / CodeQL

Imprecise assert Note test

assertTrue(a is not b) cannot provide an informative message. Using assertIsNot(a, b) instead will give more informative messages.

tests/test_sklearn_target_encoder_converter.py

+                          target_opset=TARGET_OPSET,
+                      )
+                      self.assertTrue(model_onnx is not None)
+                      self.assertTrue(model_onnx.graph.node is not None)

Check notice

Code scanning / CodeQL

Imprecise assert Note test

assertTrue(a is not b) cannot provide an informative message. Using assertIsNot(a, b) instead will give more informative messages.

tests/test_sklearn_target_encoder_converter.py

+                      model_onnx = convert_sklearn(
+                          model, "ordinal encoder two string cats", inputs, target_opset=TARGET_OPSET
+                      )
+                      self.assertTrue(model_onnx is not None)

Check notice

Code scanning / CodeQL

Imprecise assert Note test

assertTrue(a is not b) cannot provide an informative message. Using assertIsNot(a, b) instead will give more informative messages.

tests/test_sklearn_target_encoder_converter.py (fixed)

boccaff added 3 commits

November 15, 2024 00:53


          feat: add support to sklearn TargetEncoder

95111ff

Signed-off-by: boccaff <[email protected]>


          fix: removed hardcoded optset and unused import

bb16cff

Signed-off-by: boccaff <[email protected]>


          fix: removed unused variable

e243865

Signed-off-by: boccaff <[email protected]>

boccaff force-pushed the feat-target_encoder_support branch from e945a85 to e243865 Compare

November 15, 2024 00:53

Author

boccaff commented Nov 15, 2024

Thanks for the comments, @xadupre. I've removed the line and resolved a couple of the CodeQL suggestions (removed an unused import and an unused variable). The remaining CodeQL suggestions would make this code diverge from the other implementations. For the .assertTrue, I can just follow the suggestion, but the tests would then differ from the other tests. Is that OK? For the except: pass, maybe we can add a warning (see the example above, on the respective CodeQL comment).
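The example mentioned in the comment is not reproduced on this page. As a rough sketch of the idea, assuming the except: pass in question guards an optional import, a warning could replace the silent pass as follows. The helper name `optional_import` and its arguments are hypothetical and do not come from skl2onnx.

```python
import importlib
import warnings


def optional_import(module_name, attribute):
    """Return module.attribute, or None (with a warning) if it is unavailable.

    Hypothetical helper: it replaces a bare `except: pass` with a narrow
    except clause that reports what went wrong.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attribute)
    except (ImportError, AttributeError) as exc:
        # Emit a visible UserWarning instead of silently swallowing the error.
        warnings.warn(
            f"{attribute} could not be imported from {module_name}: {exc}"
        )
        return None


if __name__ == "__main__":
    # A module that exists: the attribute is returned and no warning is issued.
    print(optional_import("math", "sqrt"))
    # A module that does not exist: a warning is issued and None is returned.
    print(optional_import("module_that_does_not_exist", "Anything"))
```

Catching only ImportError and AttributeError keeps unrelated failures visible, which is the usual concern behind a code-scanning note on an empty except block.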


          Merge branch 'main' into feat-target_encoder_support

848e06c

xadupre approved these changes

boccaff added 2 commits

November 15, 2024 18:42


          fix: addresed import Errors

28d0991

Signed-off-by: boccaff <[email protected]>


          fix: black and ruff checks

49f2514

Signed-off-by: boccaff <[email protected]>
