Convert integer column names to strings to allow for default column names #1976

Merged
lajohn4747 merged 31 commits into main from issue_1935_column_integer_type_error
May 8, 2024

Conversation

lajohn4747
Contributor

@lajohn4747 lajohn4747 commented May 1, 2024

resolves #1935
CU-86b04ut1e

Handle columns that are integers
This PR allows the base synthesizer to ingest data whose column names are integers by converting those column names to strings. When sampling, it restores the original column names, so the synthetic data has the same columns as the input data.

Please note that column names in metadata dictionaries must still be strings.
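
For readers following along, the round trip described above could be sketched roughly as follows. This is an illustration only, not the code merged in this PR: `ColumnNameMapper` and its methods are hypothetical names, and the sketch works on plain lists of column labels rather than on SDV's synthesizer classes.

```python
class ColumnNameMapper:
    """Hypothetical helper (not part of SDV): maps column labels to strings and back."""

    def __init__(self):
        # string name -> original label, filled in by to_strings()
        self._original_by_string = {}

    def to_strings(self, columns):
        """Record the original labels and return their string forms, in order."""
        mapping = {}
        for label in columns:
            name = str(label)
            if name in mapping:
                # For example, the labels 1 and "1" collide after conversion.
                raise ValueError(
                    f"Column labels {mapping[name]!r} and {label!r} "
                    f"both convert to {name!r}"
                )
            mapping[name] = label
        self._original_by_string = mapping
        return list(mapping)

    def to_original(self, columns):
        """Map string names back to the recorded labels; unknown names pass through."""
        return [self._original_by_string.get(name, name) for name in columns]


# Usage: convert before fitting, restore after sampling.
mapper = ColumnNameMapper()
string_names = mapper.to_strings([0, 1, "age"])
restored = mapper.to_original(string_names)
```

With a pandas DataFrame, the same idea would apply by passing `data.columns` to `to_strings` and assigning the returned list back to `data.columns`. The collision check is an assumption of this sketch; the PR description does not say how such cases are handled.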

@sdv-team
Contributor

sdv-team commented May 1, 2024

@lajohn4747 lajohn4747 changed the title from "WIP trying to convert everything to string before processing" to "Convert integer column names to strings to allow for default column names" May 1, 2024
@lajohn4747 lajohn4747 requested review from frances-h and R-Palazzo May 1, 2024 22:47
@lajohn4747 lajohn4747 marked this pull request as ready for review May 1, 2024 22:47
@lajohn4747 lajohn4747 requested a review from a team as a code owner May 1, 2024 22:47
Contributor

@R-Palazzo R-Palazzo left a comment

Looks good!

Can you add some unit tests?

tests/integration/single_table/test_base.py (resolved)
tests/integration/multi_table/test_hma.py (outdated, resolved)
tests/integration/single_table/test_base.py (outdated, resolved)
tests/integration/single_table/test_base.py (outdated, resolved)
tests/integration/multi_table/test_hma.py (resolved)
tests/integration/single_table/test_base.py (outdated, resolved)
@lajohn4747 lajohn4747 requested a review from R-Palazzo May 3, 2024 22:03
sdv/single_table/utils.py (resolved)
sdv/single_table/base.py (outdated, resolved)
tests/integration/multi_table/test_hma.py (outdated, resolved)
tests/integration/multi_table/test_hma.py (resolved)
sdv/single_table/base.py (outdated, resolved)
@lajohn4747 lajohn4747 requested a review from R-Palazzo May 7, 2024 15:25
Contributor

@R-Palazzo R-Palazzo left a comment

Thanks for addressing the comments :)

sdv/multi_table/base.py (outdated, resolved)
sdv/single_table/utils.py (resolved)
sdv/multi_table/base.py (outdated, resolved)
sdv/single_table/base.py (outdated, resolved)
@lajohn4747 lajohn4747 requested a review from frances-h May 8, 2024 15:09
Contributor

@frances-h frances-h left a comment

LGTM 👍

@lajohn4747 lajohn4747 merged commit e1f787e into main May 8, 2024
37 checks passed
@lajohn4747 lajohn4747 deleted the issue_1935_column_integer_type_error branch May 8, 2024 16:00
Development

Successfully merging this pull request may close these issues.

Synthesizers crash when column names are integers (TypeError: unsupported operand)
4 participants