H2O fails to run on a dataset that contains an ENUM field with only one unique value #16460

Open
varsey opened this issue Dec 13, 2024 · 0 comments

H2O version, Operating System and Environment
Ubuntu 24.04.1 LTS
H2O 3.46.0.4

Actual behavior
H2O fails to run on a dataset that contains an ENUM field with only one unique value.
A simple script that applies the Isolation Forest algorithm to a sample dataset fails at the predict step with a vague Java error.
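
For anyone trying to find which column triggers this, here is a minimal sketch using only the Python standard library. It assumes the problem column is one that holds exactly one distinct non-empty value plus at least one empty cell, which is my reading of the report and not something confirmed by the maintainers. The helper name `single_value_columns` and the sample column names `category` and `value` are hypothetical.

```python
import csv
import io


def single_value_columns(csv_text):
    """List columns with exactly one distinct non-empty value and at least one empty cell."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    seen = {name: set() for name in fieldnames}
    has_empty = {name: False for name in fieldnames}
    for row in reader:
        for name in fieldnames:
            # Short rows yield None for missing trailing cells; treat them as empty.
            cell = (row[name] or "").strip()
            if cell:
                seen[name].add(cell)
            else:
                has_empty[name] = True
    return [n for n in fieldnames if len(seen[n]) == 1 and has_empty[n]]


# Hypothetical sample: "category" has one unique value ("A") and one empty cell.
sample = "category,value\nA,1.0\n,2.0\nA,3.0\n"
suspects = single_value_columns(sample)
```

To check a real file, read it with `open(path, newline="").read()` and pass the text to the helper.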

Steps to reproduce
Script to run

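import h2o
from h2o.estimators import H2OIsolationForestEstimator

h2o.init()  # start or connect to a local H2O cluster
file_path = 'dataset.csv'
df = h2o.import_file(file_path, skipped_columns=None)
# Train an Isolation Forest on every column
isoforest = H2OIsolationForestEstimator(seed=42)
isoforest.train(x=list(df.col_names), training_frame=df)
# The failure occurs here
predictions = isoforest.predict(df)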

dataset: dataset.csv

Results in the following error:
Screenshot from 2024-12-13 15-28-48

As a workaround, fill the empty values in the dataset with any string, as in this dataset:
dataset-fixed.csv
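
The workaround can be sketched in plain Python before the file is handed to H2O. This is only an illustration under assumptions: the helper name `fill_empty_cells`, the placeholder `"missing"`, and the column names `category` and `value` are all hypothetical, and the contents of the attached CSV files are not reproduced here.

```python
import csv
import io


def fill_empty_cells(csv_text, columns, placeholder="missing"):
    """Return CSV text with empty cells in the given columns replaced by a placeholder."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames or [], lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for name in columns:
            # Treat None (short row) and whitespace-only cells as empty.
            if not (row[name] or "").strip():
                row[name] = placeholder
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample: the "category" column has one unique value and one empty cell.
sample = "category,value\nA,1.0\n,2.0\nA,3.0\n"
fixed = fill_empty_cells(sample, ["category"])
```

Only the listed columns are touched, so numeric columns with gaps are not turned into strings. The resulting text can be written to a new file and imported with `h2o.import_file` as in the script above.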

varsey added the bug label Dec 13, 2024