As it is now, anchors extracted from documents are placed in a flat space, while they usually exist in a tree-shaped namespace. This structure is described by the header level of the anchor, or the header level of the section that contains the anchor (if there are anchors other than the headers themselves), together with all the super-headers of that header.
This is at least the case with Markdown and HTML, but probably also most other document formats.
Example Markdown document (doc.md):
# Top
## First Sub
bla bla bla
### A Sub Sub
bli bli bli
## Second Sub
blu blu blu
### B Sub Sub
tri tra tralala
<a name="in-text"/>
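To make the difference between flat and structured extraction concrete, here is a minimal Python sketch. It is not the tool's actual implementation: the function names (`slugify`, `extract_structured`, `extract_flat`), the simplified slug rule, and the representation of a structured anchor as a list of slugs are all assumptions made for illustration. It only handles ATX headings (`#` style) and `<a name="...">` tags, and it ignores fenced code blocks.

```python
import re

# Sample input mirroring doc.md from the example above.
DOC = """\
# Top
## First Sub
bla bla bla
### A Sub Sub
bli bli bli
## Second Sub
blu blu blu
### B Sub Sub
tri tra tralala
<a name="in-text"/>
"""

HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
NAMED_ANCHOR = re.compile(r'<a\s+name="([^"]+)"')


def slugify(title):
    """Simplified slug (an assumption): lowercase, drop punctuation, spaces to '-'."""
    kept = re.sub(r"[^\w\s-]", "", title.strip().lower())
    return re.sub(r"\s+", "-", kept)


def extract_structured(text):
    """Return one path per anchor: the slugs of all super-headers, then the anchor."""
    stack = []  # (level, slug) of the headers that are currently open
    paths = []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            # A new header closes every open header at the same or a deeper level.
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, slugify(match.group(2))))
            paths.append([slug for _, slug in stack])
            continue
        # Non-header anchors live under whichever header is currently open.
        for name in NAMED_ANCHOR.findall(line):
            paths.append([slug for _, slug in stack] + [name])
    return paths


def extract_flat(text):
    """Flat extraction keeps only the last component of each structured path."""
    return [path[-1] for path in extract_structured(text)]


if __name__ == "__main__":
    print("flat extraction:")
    for anchor in extract_flat(DOC):
        print("  " + anchor)
    print("structured extraction:")
    for path in extract_structured(DOC):
        print("  " + "/".join(path))
```

With the sample document, the flat extraction yields `top`, `first-sub`, `a-sub-sub`, `second-sub`, `b-sub-sub`, `in-text`, while the structured extraction yields paths such as `top/first-sub/a-sub-sub` and `top/second-sub/b-sub-sub/in-text`. The `/` separator is only a display choice here.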
This is useful when analyzing changes in documents. For example, if a title has been renamed but the overall structure has stayed the same, one might be able to generate an auto-fix for a broken link whose fragment was meant to map to the renamed anchor.
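One possible shape for such an auto-fix, sketched in Python under stated assumptions: structured anchors are given as lists of path components in document order, and a fix is only proposed when the tree shape (the sequence of nesting depths) is identical before and after the change. The function name `suggest_fix` and the sample data are hypothetical, and a real implementation would likely need a more tolerant matching strategy.

```python
def suggest_fix(fragment, old_paths, new_paths):
    """Suggest a replacement for a fragment that no longer resolves.

    old_paths / new_paths: structured anchors (lists of path components)
    in document order, before and after the change.
    Returns the fragment itself if it is still valid, a replacement if one
    can be inferred from the unchanged structure, or None otherwise.
    """
    if any(path[-1] == fragment for path in new_paths):
        return fragment  # the link is not broken
    # Only attempt a fix when the tree shape is unchanged: in document
    # order, the sequence of depths determines the shape of the tree.
    if [len(p) for p in old_paths] != [len(p) for p in new_paths]:
        return None
    # Same shape: the anchor at the same position is the likely rename.
    for old_path, new_path in zip(old_paths, new_paths):
        if old_path[-1] == fragment:
            return new_path[-1]
    return None


# Hypothetical before/after states: "First Sub" was renamed to "Intro".
OLD = [["top"], ["top", "first-sub"], ["top", "first-sub", "a-sub-sub"]]
NEW = [["top"], ["top", "intro"], ["top", "intro", "a-sub-sub"]]

if __name__ == "__main__":
    print(suggest_fix("first-sub", OLD, NEW))
```

A flat list of anchors could not support this kind of matching once anchors are added or removed elsewhere in the document, which is the motivation for keeping the super-header path.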