Fix resetting warning filters after silencing RDT warning about refit #1619
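For context, here is a minimal sketch of the problem the title describes, assuming nothing about SDV or RDT internals. `noisy_call` and the helper names are hypothetical stand-ins. It contrasts `warnings.resetwarnings()`, which clears every filter (including ones the caller installed), with `warnings.catch_warnings()`, which restores the previous filters on exit.

```python
import warnings


def noisy_call():
    """Hypothetical stand-in for a library call that emits a warning."""
    warnings.warn('refit needed', UserWarning)


def silence_with_reset():
    # Old pattern: add a filter, make the call, then wipe *every* filter.
    warnings.filterwarnings('ignore', category=UserWarning)
    noisy_call()
    warnings.resetwarnings()


def silence_with_context():
    # New pattern: filter changes are undone when the block exits.
    with warnings.catch_warnings():
        warnings.filterwarnings('ignore', category=UserWarning)
        noisy_call()


def caller_filter_survives(silencer):
    """Return True if a filter installed by the caller still works after ``silencer``."""
    with warnings.catch_warnings():
        # The caller asks for DeprecationWarning to be raised as an error.
        warnings.simplefilter('error', DeprecationWarning)
        silencer()
        try:
            warnings.warn('old api', DeprecationWarning)
        except DeprecationWarning:
            return True
        return False
```

With these definitions, `caller_filter_survives(silence_with_context)` should be `True`, while `caller_filter_survives(silence_with_reset)` should be `False` because the caller's filter was discarded.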

Merged
merged 3 commits into from
Oct 5, 2023
Changes from 2 commits
7 changes: 4 additions & 3 deletions sdv/data_processing/data_processor.py
@@ -537,9 +537,10 @@ def update_transformers(self, column_name_to_transformer):
"'RegexGenerator' instead."
)

-        warnings.filterwarnings('ignore', module='rdt')
-        self._hyper_transformer.update_transformers(column_name_to_transformer)
-        warnings.resetwarnings()
+        with warnings.catch_warnings():
+            msg = rdt.HyperTransformer._REFIT_MESSAGE
Contributor Author

One drawback of pulling the warning message directly from RDT is that if RDT renames the _REFIT_MESSAGE attribute, this code will break. If we instead hardcoded msg and RDT changed the warning text, this code would fail to silence the warning, but execution would not fail.
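As an illustration of this trade-off (not what the PR does), one possible middle ground is to read the attribute when it exists and fall back to a hardcoded copy otherwise. `FakeTransformer`, `FALLBACK_MESSAGE`, and `quiet_update` below are hypothetical names standing in for the RDT class and the SDV call site. The message text is invented for the sketch.

```python
import re
import warnings


class FakeTransformer:
    """Hypothetical stand-in for a library class that exposes its warning text."""

    _REFIT_MESSAGE = 'Config changed (call fit again).'

    def update(self):
        warnings.warn(self._REFIT_MESSAGE, UserWarning)


# Hypothetical hardcoded copy, used only if the attribute is renamed or removed.
FALLBACK_MESSAGE = 'Config changed (call fit again).'


def quiet_update(transformer):
    """Call ``update`` while ignoring only the refit warning."""
    # A rename no longer raises AttributeError; the filter may just stop matching.
    msg = getattr(type(transformer), '_REFIT_MESSAGE', FALLBACK_MESSAGE)
    with warnings.catch_warnings():
        # ``message`` is a regex matched at the start of the warning text,
        # so escape any special characters in the literal message.
        warnings.filterwarnings('ignore', message=re.escape(msg))
        transformer.update()
```

The cost of this approach is that a stale fallback fails silently: the warning reappears instead of the code raising, which is the same behavior the comment attributes to hardcoding.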

Contributor

Can we just filter all warnings from the HyperTransformer? That seems to be what we were doing before (actually broader, since it filtered all of RDT), and that way we don't have to worry about the message.
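A rough sketch of what a module-wide filter does, under stated assumptions: `emit`, `work`, and `call_quietly` are hypothetical helpers, and `warnings.warn_explicit` is used only to simulate warnings attributed to different modules. The `module` argument is a regular expression matched against the start of the module name, so `'rdt'` covers every `rdt.*` submodule.

```python
import warnings


def emit(module_name, text):
    """Emit a UserWarning attributed to ``module_name`` (hypothetical helper)."""
    warnings.warn_explicit(
        text, UserWarning, filename=module_name + '.py', lineno=1, module=module_name)


def work():
    """Simulate one call that triggers warnings from several modules."""
    emit('rdt.hyper_transformer', 'refit needed')
    emit('rdt.transformers', 'something else')
    emit('sdv.data_processing', 'sdv notice')


def call_quietly(func, module_pattern):
    """Run ``func`` and return the warning messages that were not silenced."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        # Added last, so it is checked first: drop anything from the module.
        warnings.filterwarnings('ignore', module=module_pattern)
        func()
    return [str(w.message) for w in caught]


print(call_quietly(work, 'rdt'))  # ['sdv notice']
```

The broad filter removes the dependency on the message text, but it also hides the unrelated `rdt.transformers` warning, which is the trade-off between the two comments above.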

+            warnings.filterwarnings('ignore', message=msg, module='rdt')
+            self._hyper_transformer.update_transformers(column_name_to_transformer)

def _fit_hyper_transformer(self, data):
"""Create and return a new ``rdt.HyperTransformer`` instance.
16 changes: 15 additions & 1 deletion tests/unit/data_processing/test_data_processor.py
@@ -1,4 +1,5 @@
import re
+import warnings
from unittest.mock import Mock, call, patch

import numpy as np
@@ -7,7 +8,8 @@
from rdt.errors import ConfigNotSetError
from rdt.errors import NotFittedError as RDTNotFittedError
from rdt.transformers import (
-    AnonymizedFaker, FloatFormatter, IDGenerator, UniformEncoder, UnixTimestampEncoder)
+    AnonymizedFaker, FloatFormatter, GaussianNormalizer, IDGenerator, UniformEncoder,
+    UnixTimestampEncoder)

from sdv.constraints.errors import MissingConstraintColumnError
from sdv.constraints.tabular import Positive, ScalarRange
@@ -1199,6 +1201,18 @@ def test_update_transformers_not_fitted(self):
with pytest.raises(NotFittedError, match=error_msg):
dp.update_transformers({'column': None})

+    def test_update_transformers_ignores_rdt_refit_warning(self):
+        """Test silencing hypertransformer refit warning (replaced by SDV warning elsewhere)"""
+        metadata = SingleTableMetadata()
+        metadata.add_column('col1', sdtype='numerical')
+        metadata.add_column('col2', sdtype='numerical')
+
+        dp = DataProcessor(metadata)
+        dp.fit(pd.DataFrame({'col1': [1, 2], 'col2': [1, 2]}))
+        with warnings.catch_warnings():
+            warnings.simplefilter('error')
+            dp.update_transformers({'col1': GaussianNormalizer()})

def test_update_transformers_for_key(self):
"""Test when ``transformer`` is not ``AnonymizedFaker`` or ``RegexGenerator`` for keys."""
# Setup