Draft: V6 with rmscene #59

Azeirah · 2023-03-06T20:37:59Z

This pull request should not be merged.

I'm posting this as a pull request so that we can use GitHub's diff interface.

I think it's best to merge this step by step. My proposal is to create a branch, along with a corresponding issue, for each (top-level) point below. Some points are ready to merge immediately and will take only minutes; others will benefit strongly from further discussion, cleanup and testing, so they might take a bit longer.

- Depend on rmscene
  - Ideally, we'd depend on it via PyPI and not locally. rmscene looks to be available on PyPI: https://pypi.org/project/rmscene/
- Poetry.toml should be merged by hand. I believe the only change I made was to add rmscene as a dependency; my merge uses more outdated packages than upstream master (lucasrla/remarks).
- The commit in remarks/conversion/drawing.py should definitely be turned into a pull request. Sometimes a line has only one point, whereas a segment for PyMuPDF requires at least two points. I currently skip such lines, but rendering a single point might be a better alternative.
  - The exact same scenario occurs in get_ann_max_bound in parsing.py
- Parsing and rendering .rm v6 documents in parsing.py, remarks.py and utils.py. This is very much a work in progress. It renders lines correctly in a couple of scenarios, close to correctly in some others, and very poorly in the rest.
- It's probably not a good idea to copy the tests directly, because they tend to contain private or copyrighted material.
- There are a couple of tiny refactors included in the code, such as the change to the process_ocr function in remarks.py, which had an unused parameter. I cleaned this up without affecting any behavior.
- I added an @cache annotation to read_meta_file in utils.py. This reduces filesystem reads for the .metadata file to one. If you use remarks in bulk, this does save some time. I like how Python lets you speed up suboptimal code this simply :)
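
The single-point edge case mentioned above can be sketched as follows. This is only an illustration, not the code from the commit: `split_drawable` and `to_segments` are hypothetical helper names, and strokes are assumed to be plain lists of `(x, y)` tuples. It separates strokes that can be drawn as segments from lone points, so a renderer could either skip the lone points or draw each one as a small dot.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def split_drawable(strokes: List[Stroke]) -> Tuple[List[Stroke], List[Point]]:
    """Separate strokes into polylines (two or more points) and lone dots.

    Empty strokes are dropped, since there is nothing to draw.
    """
    polylines = [s for s in strokes if len(s) >= 2]
    dots = [s[0] for s in strokes if len(s) == 1]
    return polylines, dots


def to_segments(stroke: Stroke) -> List[Tuple[Point, Point]]:
    """Pair consecutive points; a stroke with n points yields n - 1 segments."""
    return list(zip(stroke, stroke[1:]))


# Hypothetical input: one real line, one single-point stroke, one empty stroke.
strokes = [[(0, 0), (1, 1), (2, 0)], [(5, 5)], []]
polylines, dots = split_drawable(strokes)
segments = to_segments(polylines[0])
```

Keeping the lone points in a separate list leaves the choice between ignoring and rendering them to the caller.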

If you agree with the approach, @lucasrla, I think it's best if I create the issues and branches myself, since I know which code belongs together.

lucasrla · 2023-03-08T14:39:57Z

Thanks, Laura!

Let me divide your commits into four buckets:

Immediate merges

If you submit "Add: Reduce number of file reads" (2583378) as a PR, I'll merge it right away

More info and testing needed

Are you sure "Refactor: Remove unnecessary arg from ocr" (bd94728) doesn't break anything? I think the reference had to be updated (but my memory could be wrong):

remarks/remarks/remarks.py

Lines 472 to 473 in 12e1bde

    # Update ann_page reference to the OCRed page
    ann_page = work_doc[0]

Not necessary anymore

"Fix error where NoneType has no len()" (61d5fa1) has already been merged upstream via f3550d2

Work towards supporting v6 .rm files (reMarkable >= 3.0)

I'll first update one of my devices to v3.2. Then I'll have a look at your commits and start a separate branch to receive PRs related to v6 .rm files.

Sounds good?

Thanks again!

lucasrla · 2023-03-08T17:58:25Z

Voilà: https://github.com/lucasrla/remarks/tree/dev-lines-v6

Azeirah · 2023-03-08T18:12:40Z

> If you submit "Add: Reduce number of file reads" (2583378) as a PR, I'll merge it right away

I made a PR for the cache annotation.
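
A minimal sketch of what the @cache annotation (the `functools.cache` decorator, Python 3.9+) does for a function like read_meta_file. The signature and the JSON parsing here are assumptions for illustration, not the actual code in utils.py; the `READS` counter exists only to make the effect visible.

```python
import json
import tempfile
from functools import cache
from pathlib import Path

READS = 0  # counts real filesystem reads, for demonstration only


@cache
def read_meta_file(path: str) -> dict:
    """Parse a .metadata file; repeated calls with the same path hit the cache."""
    global READS
    READS += 1
    return json.loads(Path(path).read_text())


with tempfile.TemporaryDirectory() as tmp:
    meta = Path(tmp) / "doc.metadata"
    meta.write_text(json.dumps({"visibleName": "Notebook"}))

    first = read_meta_file(str(meta))
    second = read_meta_file(str(meta))  # served from the cache, no file read
```

Two caveats apply to this sketch: the cached dict is shared between callers, so mutating it affects later calls, and a file that changes on disk during the run will not be re-read.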

> Are you sure "Refactor: Remove unnecessary arg from ocr" (bd94728) doesn't break anything? I think the reference had to be updated (but my memory could be wrong):

I'm not 100% sure, no. I relied on PyCharm's inspections.

I do think it's correct in this case, though. ann_page is passed to process_ocr but is not used in the body. It is then reassigned to work_doc[0] and finally returned. At the call site, the original ann_page reference is overwritten by this line:

work_doc, ann_page = process_ocr(work_doc, ann_page)

There are no other callers.
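
The argument above can be checked with a small stand-in. The two functions below are simplified assumptions rather than the real process_ocr, and plain lists take the place of PyMuPDF documents. Because the parameter is rebound before it is ever read, the value passed in cannot influence the result.

```python
def process_ocr_old(work_doc, ann_page):
    # ... OCR work that never reads ann_page would go here ...
    ann_page = work_doc[0]  # the parameter is rebound before any use
    return work_doc, ann_page


def process_ocr_new(work_doc):
    # Same body without the unused parameter.
    ann_page = work_doc[0]
    return work_doc, ann_page


work_doc = ["ocr-page-0", "ocr-page-1"]  # stand-in for the OCRed document
stale_page = "stale-page"                # stand-in for the pre-OCR page

old_result = process_ocr_old(work_doc, stale_page)
new_result = process_ocr_new(work_doc)
```

Both versions return the same pair, and the caller's tuple assignment still replaces its own ann_page with the OCRed page.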

wittmeis · 2023-07-30T15:04:48Z

Hi Laura, thanks for your prompt reply.

To be honest, I am not using the Type Folio, but I may have once tapped the type button in the menu. Maybe this is enough to change the coordinate system?

I will check this myself first with a simple, clean test notebook. If it does not work, I can also provide the test as a PR.

wittmeis · 2023-07-31T19:39:48Z

Sorry, but I have another question and I do not know where to post it.

Have you considered using the USB web API to convert annotated PDFs and notebooks to PDFs?

I have found this library here but have not tried it yet. The advantage would be that changes to the RM2 file format would not matter, and I would assume that the USB web API is more stable. Moreover, for notebooks the template in use would also be part of the PDF. Of course, the conversion runs on the device, so the device has to be turned on.

For my personal use case, a possible setup could be:

* rsync for creating a backup of the device
* a Python tool that checks for updated files and triggers the re-conversion of these files using the USB API
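
The second step of that setup could look roughly like the sketch below. Everything here is an assumption for illustration: `find_updated` and `reconvert` are hypothetical names, updates are detected by comparing modification times of the `.metadata` files in the rsync backup, and `convert` is a placeholder callback where a USB web API request would go (no endpoint is assumed here).

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def find_updated(backup_dir: Path, seen: Dict[str, float]) -> List[Path]:
    """Return .metadata files that are new or newer than recorded in `seen`."""
    updated = []
    for path in sorted(backup_dir.glob("*.metadata")):
        mtime = path.stat().st_mtime
        if seen.get(path.name, -1.0) < mtime:
            updated.append(path)
            seen[path.name] = mtime  # remember the version we handled
    return updated


def reconvert(backup_dir: Path, seen: Dict[str, float],
              convert: Callable[[str], None]) -> List[str]:
    """Call `convert` with the document id (file stem) of every updated file."""
    ids = [path.stem for path in find_updated(backup_dir, seen)]
    for doc_id in ids:
        convert(doc_id)
    return ids


with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp)
    (backup / "aaa.metadata").write_text("{}")
    (backup / "bbb.metadata").write_text("{}")

    seen: Dict[str, float] = {}
    converted: List[str] = []
    first_pass = reconvert(backup, seen, converted.append)
    second_pass = reconvert(backup, seen, converted.append)

    # Simulate rsync bringing in a newer copy of one document.
    newer = (backup / "bbb.metadata").stat().st_mtime + 10
    os.utime(backup / "bbb.metadata", (newer, newer))
    third_pass = reconvert(backup, seen, converted.append)
```

In a real setup the `seen` mapping would have to be persisted between runs, for example in a small JSON file next to the backup.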

Azeirah · 2023-07-31T20:30:59Z

> Sorry, but I have another question and I do not know where to post it.
>
> Have you considered using the USB web API to convert annotated PDFs and notebooks to PDFs?
>
> I have found this library here but have not tried it yet. The advantage would be that changes to the RM2 file format would not matter, and I would assume that the USB web API is more stable. Moreover, for notebooks the template in use would also be part of the PDF. Of course, the conversion runs on the device, so the device has to be turned on.
>
> For my personal use case, a possible setup could be:
>
> * rsync for creating a backup of the device
> * a Python tool that checks for updated files and triggers the re-conversion of these files using the USB API

If it fits your use case better, then I suppose the web interface is the way to go. It's not the use case I'm looking for, though. I need something that works with the API.

torbenkeller · 2023-10-15T17:21:04Z

Hey @Azeirah, awesome work! I really appreciate it! What is the state of this branch? Could I use it without fear?

Azeirah · 2023-10-15T17:30:11Z

> Hey @Azeirah, awesome work! I really appreciate it! What is the state of this branch? Could I use it without fear?

It mostly works pretty well. These two are the largest limitations:

* It does not output OCRed text very well, or in some cases at all (working on this)
* Annotated pages are a lot larger than PDF pages, so the file looks inconsistent

There might also still be undiscovered bugs.

Overall it's pretty stable: over 80 users of https://scrybble.ink are using this branch to sync their documents to Obsidian.md

Azeirah added 17 commits December 22, 2022 18:38

Add: Reduce number of file reads

2583378

Add: start working on v3

1f380a8

Fix error where NoneType has no len()

61d5fa1

Change no scribbles/highlights found into "warnings" rather than errors

04d149a

Add: Temporarily skip v6 .rm drawing and parsing

a3867d6

Add: Handle edge case where .rm lines have only one point (requiring 2)

743d259

Add: Rmscene dependency

525bd18

Refactor code a bit for rmscene v6 parsing

f38d9c9

Add: Meta

4cc526d

Fix: Don't include deleted pages in cPages for v3

96c1917

Refactor: Remove unnecessary arg from ocr

bd94728

Add: Book test

4431937

Add: Partial .rm v6 support

941bb99

Hack to print all layers in v3

afe59a3

Improve rmscene importing

19795c3

Merge branch 'upstream-master'

89518ee

Merge branch 'master' into v6_with_rmscene

812e320

Azeirah marked this pull request as draft March 6, 2023 20:38

Azeirah mentioned this pull request Mar 6, 2023

Upgrade to reMarkable 3.0 #58

Open

Azeirah added 8 commits March 12, 2023 15:57

Merge branch 'upstream-master' into v6_with_rmscene

989a8e7

Update to rmscene v3

9ff4def

Add rmscene v3 as dependency

4065d4e

Batch annotations per tool for increased performance

ad98c1a

Merge branch 'improve_annotation_drawing_performance' into v6_with_rmscene

596f0e7

Replace exponentially scaling eraser width with linearly scaling eraser width

5d5daf7

Force eraser to always have a color_code of white

fc0f215

Update rmscene to 0.4.0

cf82278

Azeirah added 21 commits August 9, 2023 23:14

Add glyphrange-based highlighter

0bcda70

Add black

13eb00b

Add missing syrupy to lock file

6d50256

depend on my prod release of rmscene rather than rmscene main

0962cec

httpsify link

93abf89

Refactor logic into Document class

dbf8c92

refactor: Move Document to own file

4da4e1c

Add Obsidian md output

f9380d7

remove lingering print

a236f9f

Cleanup code

8fde424

Refactor: Move obsidian markdown file class to own file

4d6be15

Update remarks version

0b4941b

Add warning callout

8edc1ac

Only generate obsidian markdown files if there's content in it

4b76983

Don't show page header if no pages have highlights

eaac1c4

version p

b3740dc

Fix pyproject rmscene dependency syntax

8b357db

Filter highlights that are not visible on the page

831c436

update rmscene

01171f6

Update lock

92a6064

Add updated rmscene handling an unexpected subblock

db7b2eb

benneti mentioned this pull request Jan 24, 2024

rmfuse: fix build NixOS/nixpkgs#283397

Merged

13 tasks

Azeirah added 2 commits January 31, 2024 16:39

Fix typo on homepage

05a46fa

Fix page index issue in obsidian output

34952c3

j6k4m8 mentioned this pull request Oct 15, 2024

would love to collab on some email functionality! Azeirah/Scrybbling-together#1

Open
