File Widget: Can't switch column type from categorical to datetime #2974
Comments
The following minimal example file works for me: date-test.csv.zip (.xls or .xlsx work the same). Can you attach a small example of your data? |
File Attached - Orange isn't recognizing any of the date formats attached. I'm able to change the "Numeric" values to date/time but unable to do so when the dates register as "Categorical" values. |
Right. The columns are heuristically marked as categorical due to only some 60 unique values for some 2.5k rows. The real issue here imho is that the widget doesn't allow switching from categorical to datetime. You can force Orange to interpret a column as datetime by prefixing the name with "T#", e.g.
|
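For readers who want to apply the "T#" header prefix mentioned above without editing the file by hand, here is a minimal sketch using only Python's standard `csv` module. The column name `date`, the sample rows, and the helper `mark_as_datetime` are hypothetical and only illustrate the renaming; this is not part of Orange.

```python
import csv
import io


def mark_as_datetime(csv_text, column):
    """Return csv_text with `column` renamed to 'T#<column>' in the header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text  # nothing to rename in an empty file
    # Only the header row changes; the data rows are written back untouched.
    rows[0] = ["T#" + name if name == column else name for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical sample data with a date column named "date".
sample = "date,value\n2018-03-01,1\n2018-03-02,2\n"
print(mark_as_datetime(sample, "date"))
```

The same idea works on a real file by reading its text, passing it through the helper, and writing the result to a new file, so the original stays intact.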
+1 on this. An issue was opened a while ago with similar concerns: #1520. |
It seems that the column containing the datetime data defaults to "categorical" (without possibility to change) if the column is non-unique, i.e., if there are duplicate datetimes within that column. |
I had the same issue. The values didn't follow the datetime format Orange requires. Adding a small Python script solved it. It might not be the cleanest, but it did the job for me. It takes the input data, creates a new column, and sets its value to the correct string format.

    import datetime
    from Orange.data import Domain, Table, TimeVariable

    # in_data is the table the script receives as input.
    # Make a new domain based on existing columns, adding the extra 'newDate' one
    new_domain = Domain(["Fiscal Date", "Column2", TimeVariable.make("newDate")],
                        in_data.domain.class_vars, source=in_data.domain)
    # Construct a new table based on the new domain, using the input data
    new_data = Table(new_domain, in_data)
    # Format the date to align with Orange's requirements
    for inst in new_data:
        inst[2] = str(datetime.datetime.strptime(str(inst[0]), "%Y-%m-%d"))

Looking at your table, you'll need similar lines of code to copy your columns 1, 3 and 7 into new columns. I included your first column in the domain definition as an example, but you'd need to add the rest if you wanted them in the final output. |
Hello, Software: Orange 3.20. I loaded an Excel file that includes a date column, and I matched the date format to ISO. The File widget could not recognize the dates and loaded the column as categorical. In the Data Table, strange numbers appear in the Date column. I tried different files and the same thing happens. (Please see the screenshot below.) Then I tried the T# prefix to force the column to be read as a date. Orange did recognize it as a date, but this time the date range was completely different: the dates started at 1970-01-01 hh:mm:ss (screenshot below). It would be very helpful if someone could help me get this fixed. I am not a programmer, so I cannot see the solution myself. The file is attached. |
@bhavin83012 This is not an issue with Orange, but with Excel. Excel tries to be smart: once it recognizes a datetime variable, it reformats it behind the scenes. You need to set the number format to Text to stop Excel from messing with your data. |
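One possible reading of the symptoms described above, offered as an assumption rather than a diagnosis of the attached file: Excel stores dates internally as serial day counts, and if such a number is later read as seconds since the Unix epoch, every date lands on 1970-01-01. The sketch below uses only the standard library; the serial value 43831 is a made-up example and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Day zero of Excel's default (1900) date system; valid for serials above 60.
EXCEL_EPOCH = datetime(1899, 12, 30)


def from_excel_serial(serial):
    """Interpret a number as an Excel serial day count."""
    return EXCEL_EPOCH + timedelta(days=serial)


def misread_as_unix_seconds(serial):
    """Interpret the same number as seconds since 1970-01-01 UTC."""
    return datetime.fromtimestamp(serial, tz=timezone.utc)


serial = 43831  # hypothetical value of the kind that shows up in the Date column
print(from_excel_serial(serial).date())   # 2020-01-01
print(misread_as_unix_seconds(serial))    # 1970-01-01 12:10:31+00:00
```

If the column really holds serial numbers, storing the dates as text in Excel, as suggested above, avoids the conversion altogether.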
This is not solved. File widget's domain editor should enable changing categorical to datetime if possible. |
Hi Ajda, Thank you for your support. |
@NejcDebevec, I looked into it: I suppose the best solution is to change You can do this without any refactoring of |
Orange 3.19.0: 'YYYY-MM-DD' was not recognized by Orange3 as a datetime for me, even though, if I have understood https://en.wikipedia.org/wiki/ISO_8601 correctly, it is ISO 8601 compliant. 'YYYY-MM-DD HH:MM' did work for me. If Orange3's required datetime format is not ISO 8601, could the required format please be specified precisely and comprehensively at https://orange3-timeseries.readthedocs.io/en/latest/widgets/as_timeseries.html? |
Orange most certainly should and does recognize 'YYYY-MM-DD' as datetime. Please check your entries for mistakes. Timeseries is no longer maintained. The reference for using datetime values can be found at: https://orange-visual-programming.readthedocs.io/loading-your-data/index.html#datetime-format |
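To check a column for mistakes before loading it, something like the following sketch may help. It is not Orange's own detection logic, just a strict standard-library check for zero-padded, valid 'YYYY-MM-DD' values; the helper name `find_bad_dates` and the sample values are hypothetical.

```python
import re
from datetime import datetime

# Exactly four digits, two digits, two digits, separated by hyphens.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def find_bad_dates(values):
    """Return (index, value) pairs that are not valid YYYY-MM-DD dates."""
    bad = []
    for i, value in enumerate(values):
        try:
            if not ISO_DATE.fullmatch(value):
                raise ValueError(value)  # wrong shape, e.g. missing zero padding
            datetime.strptime(value, "%Y-%m-%d")  # rejects impossible dates
        except ValueError:
            bad.append((i, value))
    return bad


column = ["2019-01-31", "2019-02-30", "2019-3-01", "31/01/2019"]
print(find_bad_dates(column))
```

The regular expression is there because `strptime` on its own accepts months and days without zero padding, which would hide entries such as `2019-3-01`.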
The issue of not being able to switch to time was solved in #4226. Now all variables that contain strings in ISO date-time format are automatically recognized as time variables rather than categorical ones, so all of them can be switched to categorical and back to time. I discovered a new potential issue here: when I opened the data provided by @tbuttle, a few variables that previously showed the data itself are now strings with Excel formulas. Is this intentional? @VesnaT |
Ajda, regarding the recommendation that we should read the doc at: "SORRY / It may have existed at one time, but it's gone now." I'm dealing with the same datetime error, for a csv with the date formatted as 05-30-2020. Orange outputs: "TypeError: dtype bool cannot be converted to datetime64[ns]". The doc for the File widget (an overview, not a drill-down) is here, but it deals with datetime in a cursory manner and doesn't offer enough info to overcome the TypeError: https://orange3.readthedocs.io/projects/orange-visual-programming/en/latest/widgets/data/file.html

Error/The Solution: I imported a set of Tweets, then exported to csv. The csv had a date format of MM/DD/YEAR, but the ISO 8601 standard (as others noted above) is YEAR-MM-DD. That's the reason for the error when attempting to import the csv into Orange. My solution was to open the csv in Numbers (on Mac) and follow the instructions here for reformatting the date column (the time column doesn't need to be reformatted if it's in Hours-Minutes-Seconds xx-xx-xx): https://support.apple.com/en-in/guide/numbers/tan23393f3a/mac On that link you'd open the "Date and time" dropdown. Obviously I'm pretty new to Orange and coding as a whole if I have to look up info on changing the format of a date variable!

If you're not on Mac, you can edit the spreadsheet by uploading it to Google Drive and following the date-time reformat instructions here: https://support.google.com/docs/answer/56470?co=GENIE.Platform%3DDesktop&hl=en This might also be a solution (but a far more complicated one): "Python Datetime Tutorial: Manipulate Times, Dates, and Time Spans" |
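The Python route mentioned at the end of the comment above need not be complicated. Here is a minimal sketch that rewrites one csv column from MM-DD-YYYY to ISO 'YYYY-MM-DD' using only the standard library. The column name `date`, the sample rows, and the helper `reformat_dates` are hypothetical; for dates written with slashes, pass `source_format="%m/%d/%Y"` instead.

```python
import csv
import io
from datetime import datetime


def reformat_dates(csv_text, column, source_format="%m-%d-%Y"):
    """Return csv_text with `column` rewritten as ISO 'YYYY-MM-DD' dates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # strptime raises ValueError on entries that do not match source_format,
        # which surfaces malformed dates instead of silently passing them on.
        parsed = datetime.strptime(row[column], source_format)
        row[column] = parsed.date().isoformat()
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample resembling an exported tweet list.
sample = "date,text\n05-30-2020,first tweet\n06-01-2020,second tweet\n"
print(reformat_dates(sample, "date"))
```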
I had an issue with importing clippings from a newspaper archive site when I wanted to get a histogram of the number of articles binned by year. The site exported publication dates as strings (which is a meta) in a format like this example,
The |
This was further fixed in #5819. |
When loading my CSV or Excel file with a Date field, Orange does not recognize it or provide the ability to change the field type to DateTime. All date fields come in as categorical data. I'm trying to do a time series prediction, but without the dates being recognized I'm unable to build my prediction model. I've tried dates in several different formats, to no avail.
Please note that the format in Orange's documentation matches what is in my data set. YYYY-MM-DD is the required format and is one of the formats I've tried to get Orange to recognize. Please see the documentation and screenshots below.