Draft: V6 with rmscene #59

Draft
wants to merge 59 commits into
base: master
59 commits
2583378
Add: Reduce number of file reads
Azeirah Dec 22, 2022
1f380a8
Add: start working on v3
Azeirah Dec 22, 2022
61d5fa1
Fix error where NoneType has no `len()`
Azeirah Dec 26, 2022
04d149a
Change no scribbles/highlights found into "warnings" rather than errors
Azeirah Dec 26, 2022
a3867d6
Add: Temporarily skip v6 .rm drawing and parsing
Azeirah Dec 26, 2022
743d259
Add: Handle edge case where .rm lines have only one point (requiring 2)
Azeirah Dec 26, 2022
525bd18
Add: Rmscene dependency
Azeirah Jan 20, 2023
f38d9c9
Refactor code a bit for rmscene v6 parsing
Azeirah Jan 20, 2023
4cc526d
Add: Meta
Azeirah Jan 20, 2023
96c1917
Fix: Don't include deleted pages in cPages for v3
Azeirah Jan 20, 2023
bd94728
Refactor: Remove unnecessary arg from ocr
Azeirah Jan 20, 2023
4431937
Add: Book test
Azeirah Jan 21, 2023
941bb99
Add: Partial .rm v6 support
Azeirah Jan 21, 2023
afe59a3
Hack to print all layers in v3
Azeirah Jan 21, 2023
19795c3
Improve rmscene importing
Azeirah Mar 6, 2023
89518ee
Merge branch 'upstream-master'
Azeirah Mar 6, 2023
812e320
Merge branch 'master' into v6_with_rmscene
Azeirah Mar 6, 2023
989a8e7
Merge branch 'upstream-master' into v6_with_rmscene
Azeirah Mar 12, 2023
9ff4def
Update to rmscene v3
Azeirah Mar 22, 2023
4065d4e
Add rmscene v3 as dependency
Azeirah Mar 22, 2023
ad98c1a
Batch annotations per tool for increased performance
Azeirah Mar 22, 2023
596f0e7
Merge branch 'improve_annotation_drawing_performance' into v6_with_rm…
Azeirah Mar 22, 2023
5d5daf7
Replace exponentially scaling eraser width with linearly scaling eras…
Azeirah Apr 2, 2023
fc0f215
Force eraser to always have a color_code of white
Azeirah Apr 2, 2023
cf82278
Update rmscene to 0.4.0
Azeirah Jun 13, 2023
7a5d9c3
Handle coordinates in v6 correctly
Azeirah Jun 17, 2023
f934a29
Better support for extended pages
Azeirah Jun 17, 2023
53da976
Update rmscene to 0.3.3
Azeirah Jun 20, 2023
3a60658
Improved handling of global horizontal offsets
Azeirah Jun 25, 2023
521d3f1
Handle next_items assertion error when ReMarkable document has an unr…
Azeirah Jun 25, 2023
fbd03f6
Fix missing scale variable crash for smart highlights
Azeirah Jul 9, 2023
603ca19
update remarks
Azeirah Jul 9, 2023
9e5ebdc
Add support for PDFs with inserted pages
Azeirah Jul 23, 2023
9283e2d
Fix crash pre v6
Azeirah Jul 23, 2023
8036ae1
Remove lingering comments
Azeirah Jul 23, 2023
9f3a042
Add snapshot testing and test case for inserted pages in pdf notebook
Azeirah Jul 23, 2023
0bcda70
Add glyphrange-based highlighter
Azeirah Aug 9, 2023
13eb00b
Add black
Azeirah Aug 9, 2023
6d50256
Add missing syrupy to lock file
Azeirah Aug 9, 2023
0962cec
depend on my prod release of rmscene rather than rmscene main
Azeirah Aug 12, 2023
93abf89
httpsify link
Azeirah Aug 12, 2023
dbf8c92
Refactor logic into Document class
Azeirah Aug 13, 2023
4da4e1c
refactor: Move Document to own file
Azeirah Aug 13, 2023
f9380d7
Add Obsidian md output
Azeirah Aug 13, 2023
a236f9f
remove lingering print
Azeirah Aug 13, 2023
8fde424
Cleanup code
Azeirah Aug 13, 2023
4d6be15
Refactor: Move obsidian markdown file class to own file
Azeirah Aug 13, 2023
0b4941b
Update remarks version
Azeirah Aug 13, 2023
8edc1ac
Add warning callout
Azeirah Aug 13, 2023
4b76983
Only generate obsidian markdown files if there's content in it
Azeirah Aug 13, 2023
eaac1c4
Don't show page header if no pages have highlights
Azeirah Aug 13, 2023
b3740dc
version p
Azeirah Aug 13, 2023
8b357db
Fix pyproject rmscene dependency syntax
Azeirah Aug 23, 2023
831c436
Filter highlights that are not visible on the page
Azeirah Aug 23, 2023
01171f6
update rmscene
Azeirah Aug 23, 2023
92a6064
Update lock
Azeirah Aug 23, 2023
db7b2eb
Add updated rmscene handling an unexpected subblock
Azeirah Sep 3, 2023
05a46fa
Fix typo on homepage
Azeirah Jan 31, 2024
34952c3
Fix page index issue in obsidian output
Azeirah May 26, 2024
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "rmscene"]
    path = rmscene
    url = git@github.com:ricklupton/rmscene.git
8 changes: 8 additions & 0 deletions .idea/.gitignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml

9 changes: 9 additions & 0 deletions .idea/misc.xml

8 changes: 8 additions & 0 deletions .idea/modules.xml

18 changes: 18 additions & 0 deletions .idea/remarks.iml

6 changes: 6 additions & 0 deletions .idea/vcs.xml

6 changes: 6 additions & 0 deletions conftest.py
@@ -0,0 +1,6 @@

def pytest_exception_interact(node, call, report):
    """It would be cool to catch snapshot errors here and show an image diff viewer popup with the before and after"""
    # if report.failed:
    #     with open('report.txt', 'w+') as f:
    #         f.write(report)
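
The commented-out idea in this hook could be sketched roughly as follows. This is only an illustration, not part of the PR: `write_failure_report` is a hypothetical helper, and it assumes the report object exposes pytest's `failed` and `longreprtext` attributes. Note that the commented-out draft passes the report object itself to `f.write`, which expects a string, so the sketch writes the report's text form instead.

```python
from pathlib import Path


def write_failure_report(report, path="report.txt"):
    """Hypothetical helper: dump the failure text of a report to `path`.

    Returns True if a file was written, False if the report did not fail.
    """
    if not getattr(report, "failed", False):
        return False
    # `longreprtext` is the string form of the failure on pytest reports;
    # fall back to str() for objects that do not provide it.
    text = getattr(report, "longreprtext", None) or str(report)
    Path(path).write_text(text, encoding="utf-8")
    return True


def pytest_exception_interact(node, call, report):
    # pytest calls this hook when a test raises an exception that could be
    # handled interactively; here the failure text is only saved to disk.
    write_failure_report(report)
```

An image diff viewer for snapshot failures, as the docstring suggests, would build on the same hook but is left out of this sketch.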
397 changes: 213 additions & 184 deletions poetry.lock

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion poetry.toml
@@ -1,3 +1,3 @@
 [virtualenvs]
 create = true
-in-project = true
+in-project = true
12 changes: 8 additions & 4 deletions pyproject.toml
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "remarks"
-version = "0.3.1"
+version = "0.3.10"
 description = "Extract annotations (highlights and scribbles) from PDF, EPUB, and notebooks marked with reMarkable™ paper tablets. Export to Markdown, PDF, PNG, and SVG."
 authors = ["lucasrla <[email protected]>"]
 readme = "README.md"
@@ -16,9 +16,13 @@ classifiers = [

 # https://python-poetry.org/docs/dependency-specification/
 [tool.poetry.dependencies]
-python = "^3.10.9"
-Shapely = "^2.0.1"
-PyMuPDF = "^1.21.1"
+python = "^3.10"
+Shapely = "^1.8.5.post1"
+PyMuPDF = "1.22.5"
+pytest = "^7.2.0"
+rmscene = { git = "https://github.com/ricklupton/rmscene", rev = "fbab6274ed8ca29f9a9bf4fd36b6fa20cc977a1f" }
+syrupy = "^4.0.8"
+pyyaml = "^6.0.1"

 [tool.poetry.dev-dependencies]
 black = "^22.12.0"
120 changes: 120 additions & 0 deletions remarks/Document.py
@@ -0,0 +1,120 @@
import math
from typing import List

import fitz

from remarks.conversion import check_rm_file_version
from remarks.conversion.parsing import determine_document_dimensions
from remarks.dimensions import REMARKABLE_DOCUMENT, ReMarkableDimensions
from remarks.utils import (
    get_document_filetype,
    get_document_tags,
    list_hl_json_files,
    is_inserted_page,
    get_pages_data,
    list_ann_rm_files,
    get_visible_name,
)


class Document:
    def __init__(self, metadata_path):
        self.metadata_path = metadata_path
        self.pages_list, self.pages_map = get_pages_data(metadata_path)
        self.doc_type = get_document_filetype(metadata_path)
        self.name = get_visible_name(metadata_path)

        # annotations
        self.rm_tags = list(get_document_tags(metadata_path))
        self.rm_annotation_files = list_ann_rm_files(metadata_path)
        self.rm_highlight_files = list_hl_json_files(metadata_path)

    def open_source_pdf(self) -> fitz.Document:
        if self.doc_type in ["pdf", "epub"]:
            f = self.metadata_path.with_name(f"{self.metadata_path.stem}.pdf")
            pdf_src = fitz.open(f)

            for i, page_idx in enumerate(self.pages_map):
                if is_inserted_page(page_idx):
                    pdf_src.new_page(
                        width=REMARKABLE_DOCUMENT.to_mm().to_mu().width,
                        height=REMARKABLE_DOCUMENT.to_mm().to_mu().height,
                        pno=i,
                    )

        # Thanks to @apoorvkh
        # - https://github.com/lucasrla/remarks/issues/11#issuecomment-1287175782
        # - https://github.com/apoorvkh/remarks/blob/64dd3b586b96195b00e727fc1f1e537b90d841dc/remarks/remarks.py#L16-L38
        elif self.doc_type == "notebook":
            # PyMuPDF's A4 default is width=595, height=842
            # - https://pymupdf.readthedocs.io/en/latest/document.html#Document.new_page
            # The 0.42 below is just me eye-balling PyMuPDF's defaults:
            # 1404*0.42 ~= 590 and 1872*0.4 ~= 786
            #
            # reMarkable's desktop app exports notebooks to PDF with 445 x 594, in
            # terms of scale it is 445/1404 = ~0.316
            # Open an empty PDF to be treated as if it were the original document
            pdf_src = fitz.open()
            page_sizes: List[ReMarkableDimensions] = []
            for page in self.pages_list:
                paths = filter(
                    lambda _ann_page: _ann_page.stem == page, self.rm_annotation_files
                )
                path = next(paths, None)
                if path:
                    try:
                        page_sizes.append(determine_document_dimensions(path))
                    except ValueError:
                        page_sizes.append(REMARKABLE_DOCUMENT)
                else:
                    page_sizes.append(REMARKABLE_DOCUMENT)

            # For each note page, add a blank page to the original document
            for i, dims in enumerate(page_sizes):
                mu_dims = dims.to_mm().to_mu()
                pdf_src.new_page(
                    width=mu_dims.width,
                    height=mu_dims.height,
                    pno=i,
                )

        return pdf_src

    def pages_magnitude(self):
        return math.floor(math.log10(len(self.pages_list))) + 1

    def pages(self):
        has_annotation_highlights = False

        page_uuids = set(
            [f.stem for f in self.rm_annotation_files]
            + [f.stem for f in self.rm_highlight_files]
        )

        for page_uuid in page_uuids:
            has_annotations = False
            rm_annotation_file = None

            rm_highlights_file = None
            has_smart_highlights = False

            page_idx = self.pages_list.index(f"{page_uuid}")

            for f in self.rm_annotation_files:
                if page_uuid == f.stem and check_rm_file_version(f):
                    rm_annotation_file = f
                    has_annotations = True

            for f in self.rm_highlight_files:
                if page_uuid == f.stem:
                    rm_highlights_file = f
                    has_smart_highlights = True

            yield (
                page_uuid,
                page_idx,
                rm_annotation_file,
                has_annotations,
                rm_highlights_file,
                has_smart_highlights,
            )
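
To illustrate what `pages_magnitude` computes, here is a minimal standalone sketch. It reproduces the `floor(log10(n)) + 1` digit count from `Document.pages_magnitude` as a free function. The guard for an empty page list and the `padded_page_label` helper are assumptions added for illustration; the diff does not show how the PR uses the value, and zero-padding page numbers is only one plausible use.

```python
import math


def pages_magnitude(page_count: int) -> int:
    """Number of decimal digits in page_count (same formula as the method above)."""
    if page_count < 1:
        # Assumption: treat an empty document as one digit wide.
        # Without this guard, math.log10(0) raises ValueError.
        return 1
    return math.floor(math.log10(page_count)) + 1


def padded_page_label(page_idx: int, page_count: int) -> str:
    """Hypothetical helper: zero-pad a page index to the document's digit width."""
    width = pages_magnitude(page_count)
    return f"{page_idx:0{width}d}"


if __name__ == "__main__":
    # A 120-page document needs three digits, so page 7 is labelled with padding.
    print(padded_page_label(7, 120))
```

Padding like this keeps per-page output files in order when sorted by name, which is why the digit count depends on the total number of pages rather than on the individual index.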
2 changes: 1 addition & 1 deletion remarks/conversion/__init__.py
@@ -1,8 +1,8 @@
 from .parsing import (
+    check_rm_file_version,
     parse_rm_file,
     rescale_parsed_data,
     get_ann_max_bound,
-    check_rm_file_version
 )

 from .drawing import (