-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validating data types #194
Comments
@siddharth-krishna what do you think? |
Sure, perhaps add a field to times-info.json that specifies the data type of each parameter/set? Ideally we should use the correct dtypes for pandas column from the time we read in the excel tables: #47 |
Parameters (value columns) are always numeric, e.g. double. :-) Their indices can vary, but string generally works. Sets are similar to indices of the parameters. |
Does it not matter if a year is written out as |
@siddharth-krishna In the decimal system, it does not matter if you add zero decimals to a numeric value, onto the right side, right of the decimal point. The values are by definition identical. GAMS complies with the decimal system with respect to inputting parameter values in text files. GAMS GDX files may use a binary representation. However, indexes, i.e. the members of sets are identified by labels, which are basically case insensitive strings. The years indexes are thus also identified by set element labels. and a string "2020.0" is different from the string "2020". Ahh..., I now realize you probably mean Excel and not GAMS? If a year is represented as a numerical cell in Excel, I think you can just remove any zero decimal fractions (I tested with VEDA2 v3.0 and it does accept any number with zero decimal fractions displayed for years, but raises error at non-zero decimals). But if it is a text value, you should keep it as is, and invalidate e.g "2020.0". You can easily test a cell data type in Excel. Text values in year columns may include also year ranges (e.g. 2020-2050) and comma-separated lists of years. |
If it helps, I've been using pandera for dataframe type (and range etc) validation, and it's pretty good. Might be worth a look for this. |
Looks like it may be what we need. Should we be applying it to inputs (i.e. |
Currently the tool may produce invalid input data for GAMS, because it doesn't verify data types. Should we introduce such a verification before export to GAMS?
The text was updated successfully, but these errors were encountered: