Sparse data may cause problems for some preprocessors #1719
Comments
@nikicc Do any of the recent changes address this? Imputation still seems to take a while when connected to Corpus.
@ajdapretnar They probably do. As for modeling, everything should be OK now. We use But with our imputers, there might still be some problems.
Related perhaps to #1713. I know a lot of work has been done on sparse since. @BlazZupan Do you have an example we could test against?
We need better support for sparse data, but this is a bigger issue.
Orange version
3.3.8 on Windows 7
Expected behavior
Orange should seamlessly propagate sparse data through the pipeline.
Actual behavior
Some preprocessors convert sparse data to dense data and exhaust memory, so Orange crashes with a MemoryError. An obvious example of such a preprocessor is imputation, which is invoked before (any?) scikit-learn learner.
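To give a rough sense of why densifying is so costly, here is a back-of-the-envelope sketch that compares the storage of a CSR sparse matrix with its dense equivalent. The corpus dimensions are hypothetical, not taken from an actual bug report, and the estimate assumes 8-byte float values and 4-byte integer indices.

```python
def dense_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed to store every cell of a dense array."""
    return n_rows * n_cols * itemsize


def csr_bytes(n_rows, nnz, itemsize=8, indexsize=4):
    """Bytes for a CSR matrix: values + column indices + row pointers."""
    return nnz * (itemsize + indexsize) + (n_rows + 1) * indexsize


# Hypothetical bag-of-words corpus (assumed numbers, for illustration only):
# 20,000 documents, 100,000 terms, about 100 non-zero terms per document.
n_rows, n_cols = 20_000, 100_000
nnz = n_rows * 100

sparse_size = csr_bytes(n_rows, nnz)
dense_size = dense_bytes(n_rows, n_cols)

print(f"sparse: {sparse_size / 2**20:.1f} MiB")
print(f"dense:  {dense_size / 2**30:.1f} GiB")
```

Under these assumptions the sparse matrix needs roughly 23 MiB, while the dense copy needs about 15 GiB, which would plausibly explain a MemoryError on a typical desktop machine.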
A typical trace of the error is:
Steps to reproduce the behavior
We have spotted this type of error from bug reports and inferred the cause from error traces.