[FIX] io: Handle mismatched number of header/data values #3237
The issue was about stripping empty columns, however the solution in this PR strips all columns, even those containing data, when there are more items in a row than there are headers. I am not saying that is the wrong approach, but we should know what we are choosing.
I don't mind going with the first, since it is already done here. Comments, anyone? @ajdapretnar @astaric
Hmm... I think it was assumed here that the values will be empty, meaning no column names AND no values. At least that's what sometimes happened with Excel.
+1 for the warning. My vote is for 1, since it is already implemented (and less complicated :))
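To make the trade-off discussed above concrete, here is a minimal sketch of the two behaviours: dropping every column that has no header (what this PR does, per the first comment) versus accepting only empty headerless columns. The helper `trim_to_header` and its `only_empty` flag are hypothetical names used for illustration; this is not Orange's actual code, and raising an error is just one assumed way a stricter variant could behave.

```python
import warnings


def trim_to_header(header, rows, only_empty=False):
    """Hypothetical helper: make every row exactly as wide as the header.

    With only_empty=False, any cell beyond the header is dropped, even if
    it holds data. With only_empty=True (an assumed stricter variant),
    a non-empty headerless cell raises instead of being dropped silently.
    """
    width = len(header)
    trimmed, dropped = [], False
    for row in rows:
        extra = row[width:]
        if only_empty and any(cell.strip() for cell in extra):
            raise ValueError("Row has data in a column without a header.")
        dropped = dropped or bool(extra)
        # Cut long rows and pad short ones with empty strings.
        trimmed.append(list(row[:width]) + [""] * (width - len(row)))
    if dropped:
        warnings.warn("Columns with no headers were removed.")
    return trimmed
```

Either way a warning is emitted whenever something was dropped, which is the part everyone in the thread agrees on.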
Codecov Report
@@            Coverage Diff             @@
##           master    #3237      +/-   ##
==========================================
- Coverage   82.82%   82.81%   -0.01%
==========================================
  Files         346      346
  Lines       59629    59636       +7
==========================================
+ Hits        49385    49390       +5
- Misses      10244    10246       +2
Orange/tests/test_io.py
Outdated
table = CSVReader(c).read()
self.assertEqual(len(table.domain.attributes), 2)
self.assertEqual(cm.warning.args[0],
                 "Columns with no headers were striped.")
Striped --> Removed or stripped.
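For readers unfamiliar with the `cm.warning.args[0]` pattern in the test above, the following is a self-contained sketch of how `unittest`'s `assertWarns` context manager exposes the warning message. `read_rows` is a hypothetical stand-in for the reader, not `CSVReader`, and the message uses the "removed" wording suggested in review.

```python
import unittest
import warnings


def read_rows(header, rows):
    """Hypothetical stand-in for a reader that drops headerless columns."""
    width = len(header)
    if any(len(row) > width for row in rows):
        warnings.warn("Columns with no headers were removed.")
    return [row[:width] for row in rows]


class TestHeaderlessColumns(unittest.TestCase):
    def test_extra_columns_warn(self):
        # assertWarns fails the test if no UserWarning is raised; the
        # caught warning instance is available as cm.warning.
        with self.assertWarns(UserWarning) as cm:
            rows = read_rows(["a", "b"], [["1", "2", "3"]])
        self.assertEqual(len(rows[0]), 2)
        self.assertEqual(cm.warning.args[0],
                         "Columns with no headers were removed.")
```

Comparing the full message string, as the PR's test does, means the test has to be updated together with any rewording of the warning.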
Orange/data/io.py
Outdated
    return lst

# Ensure all data is of equal width in a column-contiguous array
data = [_equal_length([s.strip() for s in row])
        for row in data if any(row)]
data = np.array(data, dtype=object, order='F')

if strip:
    warnings.warn("Columns with no headers were striped.")
I would prefer "Columns with no headers were removed.", since stripped is a bit Pythonish. Or at least 'stripped'.
9f50b49 to 6e1be0c
Issue
Fixes #1471
Orange can't load files that have more data columns than header columns.
Description of changes
Strip data columns that have no header and emit a warning when this happens.
Includes