Draft: V6 with rmscene #59
base: master
Thanks, Laura! Let me divide your commits into four buckets:
Lines 472 to 473 in 12e1bde
Sounds good? Thanks again!
I made a PR for the cache annotation.
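For readers following along, here is a minimal sketch of what such a cache annotation could look like. The function body and its single `path` parameter are assumptions made for illustration, not the actual `read_meta_file` from `utils.py`; only `functools.cache` (Python 3.9+) is standard.

```python
import json
from functools import cache
from pathlib import Path


@cache
def read_meta_file(path):
    """Hypothetical stand-in: parse a .metadata JSON file.

    With @cache, the file is read from disk only on the first call for a
    given path; later calls with the same argument return the stored result.
    """
    return json.loads(Path(path).read_text())
```

Two caveats with this approach: the argument must be hashable (a `str` or `Path` is fine), and every caller receives the same dict object, so mutating the result would affect later callers. `read_meta_file.cache_info()` reports hits and misses if you want to confirm the saving.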
I'm not 100% sure, no. I relied on PyCharm's inspections, although I do think it's correct in this case. The only call is:
work_doc, ann_page = process_ocr(work_doc, ann_page)
There are no other callers.
Hi Laura, thanks for your prompt reply. Frankly, I am not using the Type Folio, but it could be that I once hit the type button in the menu. Maybe this is enough to change the coordinate system? I will check myself first with a simple and clean test notebook. In case it does not work, I can also provide the test as a PR.
Sorry, but I have another question and I do not know where to post this. Have you considered using the USB web API for converting annotated PDFs and notebooks to PDFs? I have found this library here but have not tried it yet. The advantage would be that RM2 format changes would no longer matter, and I would assume that the USB web API is more stable. Moreover, in the case of notebooks, the template used would also be part of the PDF. Of course, it requires the device to be turned on so that the conversion can run on the device. For my personal use-case a possible setup could be:
If it fits your use-case better, then I suppose the web interface is the way to go. It's not the use-case I'm looking for though. I need something that works with the API.
Hey @Azeirah, awesome work! I really appreciate that! What is the state of this branch? Could I use it without fear?
It mostly works pretty well, these two are the largest limitations:
There might also still be undiscovered bugs. Overall it's pretty stable; over 80 users of https://scrybble.ink are using this branch to sync their documents to Obsidian.md
This pull request should not be merged.
I am posting this here as a pull request to use GitHub's diff interface.
I think it's best to merge this step-by-step. My proposal is to create a branch along with a respective issue for each (top-level) point below. Some points are good to merge immediately, so they will take only minutes, others will benefit strongly from further discussion, cleanup and testing so they might take a bit longer.
- remarks/conversion/drawing.py should definitely be turned into a pull request. Sometimes a line has only one point, whereas a segment for PyMuPDF requires at least two points. I currently ignore it, but it might be a better alternative to render a point?
- get_ann_max_bound in parsing.py
- parsing.py, remarks.py and utils.py. This is very much a work in progress. It renders lines correctly in a couple of scenarios, and close to correctly in some others. Very poorly in others though.
- process_ocr function in remarks.py, it had an unused parameter. I cleaned this up without affecting any behavior whatsoever.
- @cache annotation to read_meta_file in utils.py. This reduces filesystem reads for the .metadata file to 1. If you use remarks in bulk this does save some time. I like how Python allows you to optimize unoptimally written code this simply :)

If you agree with the approach @lucasrla, I think it's best if I make the issues and branches myself since I know what code belongs together.
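To make the one-point question from the drawing.py item concrete, here is a rough sketch of how a stroke could be classified before drawing. The names `to_segments` and `plan_stroke` are hypothetical and do not come from remarks; the sketch only decides what to draw and leaves the actual rendering out.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def to_segments(points: List[Point]) -> List[Segment]:
    # Pair each point with its successor; fewer than two points gives [].
    return list(zip(points, points[1:]))


def plan_stroke(points: List[Point]) -> Tuple[str, object]:
    """Hypothetical helper: decide how a stroke should be rendered."""
    if not points:
        return ("skip", None)
    if len(points) == 1:
        # A single point cannot form a segment; report it as a dot
        # instead of silently dropping it.
        return ("dot", points[0])
    return ("segments", to_segments(points))
```

A caller could then draw each segment as before and, for the "dot" case, draw a small filled circle at that point (PyMuPDF's `Page.draw_circle` is one candidate). Whether a dot is the right visual result for a one-point line is still an open design question.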