-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
continued project cannot read xls files #19
Comments
Sorry for the confusion - I tried to answer too fast. For sure I will fix this problem - give me a few days. |
I can generate xlxs files instead of xls files if JXLS will read those. nPowerLogo.gif IwLogo.gif David Gill IntegrityWare, Inc.(http://www.IntegrityWare.com) On 1/10/2015 9:48 AM, bengalo wrote:
|
The code can only read .xls binary files - the xlsx are XML format and totally different. Very sorry for the confusion earlier - I had a DHxlslib project earlier, and stopped supporting it due to Jan taking it over (renaming it and adding more features). DHlibxls to read on Mac/ios, and JXLS to write them. |
Hi David, So, was I correct to use the DHxlsReader, or should I be using the By the way, I found that everything else in the DHxlsReader is very high nPowerLogo.gif IwLogo.gif David Gill IntegrityWare, Inc.(http://www.IntegrityWare.com) On 1/10/2015 3:09 PM, David Hoerl wrote:
|
To add to the confusion is that I named the project 'DHlibxls', but the actual framework is 'DHxlsReader'. You only need JXLS if you want to WRITE Excel files. BTW, did you send me the problematic .xls file yet? |
Hi David, Here is the Excel file. So I will put DHxlsReader back into my By the way, I really appreciate your code. You have done a very nice nPowerLogo.gif IwLogo.gif David Gill IntegrityWare, Inc.(http://www.IntegrityWare.com) On 1/12/2015 12:09 PM, David Hoerl wrote:
|
Hello David, No worries. Isn't the .xlxs format just the newer Excel format? I can produce nPowerLogo.gif IwLogo.gif David Gill IntegrityWare, Inc.(http://www.IntegrityWare.com) On 1/12/2015 11:47 AM, David Hoerl wrote:
|
Hey Dave, I just wondered if you have been able to track down the problem with tables. David Gill |
Unless you can convince me this is a legitimate un-protected file, which all evidence indicates to the contrary, I don't plan to waste my time on it. Numbers refuses to open it, and also saved variants with an additional number added in the column you indicated. I just tried to open it with google documents, and only a few things show up. This file has serious issues. If you want to compensate me for the time needed to delve into the belly of libxls and all the microsoft oddities, I can do it - but I am not going to waste the few hours of time I can devote to open source on this. One of the projects contributors has a production app that reads and imports files using the library. From time to time files show up that cannot be opened - they were created in a buggy point release of Excel or some other program. Excel itself has knowledge of all past oddities, is supported by hundreds of paid programmers, and has a compelling reason to open any file. |
Can you tell me which protection error it is generating (0x86, 0x63, or 0x2F), and where that error is returned? I will try to track it down. I don't understand how the file can be protected because I saved the file by hand (converting it from a .xlsx file) from within Excel and none of the protection flags are on (it's not read-only, there is no password set, protection is turned off, etc.). Perhaps I have a buggy version of Excel, but that seems odd too as it is a standard, off the shelf purchased version. |
It appears from looking at xls_ParseWorkBook (in xls.c) that all of the protection flags are skipped and not processed. Is this correct? |
If you 'down' converted it from an xlsx file, well who knows what Excel does. Unfortunately at this time I only have an old 2002ish version of Excel that only runs on a PowerPC based Mac. My experience has been that when I see protection flags, it means Excel has done something to the data that makes it difficult for libxls to decode. libxls does nothing with the protection flags, but its an indication that the file has been 'protected' in some form by Excel. In fact, we've considered just aborting when we see those flags, to avoid long discussions such as this one. The way libxls gets data is to decode the file first - to extract out all the tables and such - then as it iterates through all sheets and cells, it uses associations as indicated by the particular data cell to link to the data. In some cases with protection, there is no data - Excel itself takes blocks of 'protected data', and decodes it, then puts it in the 'common' accessible areas to be used. This is possibly one reason for the issue here. Another reason for the missing data may be that in down converting xlsx files, it uses some sophisticated technique that libxls is not aware of. What I would do to track this down, and something you can do your self, is to turn debugging way up - 10 or higher, then dump the file. It will be easier if you use libxls directly - I have an Xcode Mac project that I use for this, but it could be anything even command line. You set the debug flag to 10, then dump the file. You can then start adding printf's all over the place in the specific areas you are trying to see, and find out for the particular cell that has no data, where the library is trying to get it from. You can then dump that whole table (linked list, whatever). You can then find the code that creates that table and add printfs there. Use the OpenOffice Excel Format Specification (google it), it helps a lot. You will see that open office also passed over all the protection schemes and objects - that is all are marked as TBD. Early encryption was very simple and easy to circumvent - its possible other .xls readers have implemented decoding (we passed on doing so years ago, as we didn't want to start down that path). This library has been around for years, and has wide adoption. There have been several real bugs that appeared, but the frequency has dropped over time. I have actually had success using Numbers to both create .xls files and read them when others like you submit them - to see what they look like before trying to process them with libxls. That neither Numbers nor Google Sheets can effectively read this file is a huge red flag. If you do find a bug in libxls, or something simple that needs correction so your file can be read, then I'd for sure add it to the source. I just am not in a position currently to burn a day tracking this down. |
You are probably aware that in this project you are using your libxls and the follow-up project by Jan is using xlslib which cannot read xls files at all, right?
While there is no option to read/parse files in Jan's project (yet?), I'd advise to update the README once again to reflect that.
Might save someone (like me) an hour or two ;)
The text was updated successfully, but these errors were encountered: