File types changed and will not open and search not finding anything #36

Open
pickleton89 opened this issue Feb 4, 2023 · 18 comments

@pickleton89

A number of the PNG files in my attachment folders have had .hocr added to their file names, and they will no longer open. Additionally, when I invoke the OCR search window, it doesn't find anything. When I open Obsidian, I can see that the indexing counter finishes.

@MohrJonas
Owner

Thank you for your issue. Can you confirm that you're using the newest version of obsidian-ocr, which is currently 2.0.0?

@pickleton89
Author

pickleton89 commented Feb 5, 2023 via email

@MohrJonas
Owner

Alright, thanks for checking.
When you say the files have hocr added to the name, do you mean you have a file x.png and also a file x.png.hocr?

@pickleton89
Author

pickleton89 commented Feb 5, 2023 via email

@MohrJonas
Owner

Alright, it's good to hear that the original file is still there. I was a bit scared that I had screwed something up and that the files had been deleted 😌
The .hocr files are remnants of an older version of obsidian-ocr that stored the OCR information for x.png in x.png.hocr.
You can either leave them there and ignore them, or simply delete them.
Since version 2.0.0, the information is stored in a SQLite database, called .obsidian-ocr.sqlite, in the root of your vault.

Concerning the problem you described above:
Could you please open the developer console and see if any errors are reported?

@pickleton89
Author

pickleton89 commented Feb 5, 2023 via email

@MohrJonas
Owner

Unfortunately, I can't seem to see the attached image.

@pickleton89
Author

Screenshot of Obsidian (2-5-23, 1-50-49 PM)

@pickleton89
Author

Sorry about that. I was replying by email and the image didn't come through, so I posted it above.

@MohrJonas
Owner

Alright, thanks for the image.
Could you please enable Log to file in the settings of obsidian-ocr, restart Obsidian, and repeat the steps that produced the error above?
After that, could you please attach the log file?

@pickleton89
Author

pickleton89 commented Feb 7, 2023 via email

@pickleton89
Author

I found the file.
obsidian-ocr.log

@MohrJonas
Owner

Thank you for the log. I think I have somewhat of an idea what's going on here.
Could you please tell me which OS you're using?

@pickleton89
Author

I am currently running macOS Ventura 13.2

@MohrJonas
Owner

Okay, and could you tell me the output of tesseract -v in your terminal?

@pickleton89
Author

tesseract 5.2.0
leptonica-1.82.0
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.3) : libpng 1.6.39 : libtiff 4.4.0 : zlib 1.2.11 : libwebp 1.2.4 : libopenjp2 2.5.0
Found NEON
Found libarchive 3.6.2 zlib/1.2.11 liblzma/5.2.9 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.2
Found libcurl/7.86.0 SecureTransport (LibreSSL/3.3.6) zlib/1.2.11 nghttp2/1.47.0

MohrJonas added the bug and MacOS labels on Feb 9, 2023
MohrJonas self-assigned this on Feb 9, 2023

@MohrJonas
Owner

Alright, after looking at the logs I can see what is failing, but I don't yet know why it fails.
This is the code responsible for running tesseract:

...
const execReturn = exec(`tesseract ${this.settings.additionalArguments} "${source}" stdout -l ${this.settings.lang} hocr`);
...

This would (as an example) translate into a command like this:
tesseract "/some/file.png" stdout -l eng hocr

The problem (as can be seen in the log) is that tesseract for some reason misinterprets the command and gives the following error message:

read_params_file: Can't open stdout
read_params_file: Can't open -l
read_params_file: Can't open eng
Error, cannot read input file undefined: No such file or directory
Error during processing.

As the error message shows, tesseract tries to open stdout, -l and eng as files, even though they are just command-line arguments, which is quite strange. I can only assume that there is some problem with the way the file path is handed to tesseract, because when I run the deliberately broken command tesseract some/file path/image.png stdout -l eng (an unquoted path that contains a space), I get a similar error:

read_params_file: Can't open stdout
read_params_file: Can't open -l
read_params_file: Can't open eng
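
To illustrate why the unquoted path produces this kind of error, here is a minimal TypeScript sketch of shell-style word splitting. The splitCommand helper is hypothetical: it is not part of obsidian-ocr and it is far simpler than a real shell. It only shows how a space in the path turns one argument into two, which shifts every later argument by one position, so that words such as stdout may end up where tesseract expects something else.

```typescript
// Hypothetical helper: split a command line into words the way a shell
// roughly would, treating a double-quoted run as a single word.
// Escapes, single quotes and variable expansion are deliberately ignored.
function splitCommand(command: string): string[] {
  const words: string[] = [];
  // Either a double-quoted string (group 1) or a run of non-space characters (group 2).
  const pattern = /"([^"]*)"|(\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(command)) !== null) {
    words.push(match[1] ?? match[2] ?? "");
  }
  return words;
}

// Unquoted path with a space: the path is split into two separate words.
const unquoted = splitCommand("tesseract some/file path/image.png stdout -l eng");
// unquoted is ["tesseract", "some/file", "path/image.png", "stdout", "-l", "eng"]

// Quoted path: the path stays together as one word.
const quoted = splitCommand('tesseract "some/file path/image.png" stdout -l eng');
// quoted is ["tesseract", "some/file path/image.png", "stdout", "-l", "eng"]
```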

On the other hand, the plugin already wraps the file path in double quotes, which should make tesseract behave properly.
Therefore, some more investigation is necessary, so sit tight 😊
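
While that investigation is pending, one way to take shell quoting out of the picture is to build the arguments as an array instead of one command string. The TypeScript sketch below is only an illustration under assumptions: OcrSettings, buildCommandString and buildTesseractArgs are hypothetical names, and only the additionalArguments and lang fields come from the snippet above. It also shows one thing that might be worth ruling out: the log names the input file as undefined, and a template literal renders an unset value as the literal text undefined. Whether that is what happened here is a guess, not something the log confirms.

```typescript
// Hypothetical settings shape; only these two field names appear in the thread.
interface OcrSettings {
  additionalArguments?: string;
  lang: string;
}

// Mirrors the template-literal approach from the snippet above.
// If additionalArguments is unset, the text "undefined" is interpolated
// and becomes the first word after "tesseract".
function buildCommandString(settings: OcrSettings, source: string): string {
  return `tesseract ${settings.additionalArguments} "${source}" stdout -l ${settings.lang} hocr`;
}

// Alternative: build an argument vector, so no quoting is needed and
// an unset additionalArguments simply contributes nothing.
function buildTesseractArgs(settings: OcrSettings, source: string): string[] {
  // Split the extra arguments on whitespace and drop empty pieces.
  const extra = (settings.additionalArguments ?? "")
    .split(/\s+/)
    .filter((arg) => arg.length > 0);
  return [...extra, source, "stdout", "-l", settings.lang, "hocr"];
}

const asString = buildCommandString({ lang: "eng" }, "/some/file.png");
// asString is: tesseract undefined "/some/file.png" stdout -l eng hocr

const asArgs = buildTesseractArgs({ lang: "eng" }, "/some/file path/image.png");
// asArgs is ["/some/file path/image.png", "stdout", "-l", "eng", "hocr"]
```

An array like this could be handed to Node's child_process.execFile, which passes each element to the program as a separate argument without involving a shell.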

@pickleton89
Author

I ran the path checks as listed below:
(base) [~]$ brew list tesseract
/opt/homebrew/Cellar/tesseract/5.2.0/bin/tesseract
/opt/homebrew/Cellar/tesseract/5.2.0/include/tesseract/ (12 files)
/opt/homebrew/Cellar/tesseract/5.2.0/lib/libtesseract.5.dylib
/opt/homebrew/Cellar/tesseract/5.2.0/lib/pkgconfig/tesseract.pc
/opt/homebrew/Cellar/tesseract/5.2.0/lib/ (2 other files)
/opt/homebrew/Cellar/tesseract/5.2.0/share/tessdata/ (35 files)
(base) [~]$ brew list tesseract-lang
/opt/homebrew/Cellar/tesseract-lang/4.1.0/share/tessdata/ (162 files)
(base) [~]$ brew list imagemagick
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/Magick++-config
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/MagickCore-config
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/MagickWand-config
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/animate
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/compare
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/composite
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/conjure
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/convert
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/display
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/identify
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/import
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/magick
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/magick-script
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/mogrify
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/montage
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/stream
/opt/homebrew/Cellar/imagemagick/7.1.0-54/etc/ImageMagick-7/ (13 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/include/ImageMagick-7/ (137 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/libMagick++-7.Q16HDRI.5.dylib
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/libMagickCore-7.Q16HDRI.10.dylib
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/libMagickWand-7.Q16HDRI.10.dylib
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/ImageMagick/ (261 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/pkgconfig/ (8 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/ (9 other files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/share/ImageMagick-7/ (3 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/share/doc/ (332 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/share/man/ (17 files)
