New disk-based graph storage implementation DiskPathV1_D15 that stores the
outgoing paths from every node when the maximum branch-out is 1 and the longest
path has a length of 15. This optimization is especially useful for the PartOf
component, since it avoids the frequent disk access that an adjacency-based
implementation would need to get all ancestors. PartOf components are not
trees, but they still have at most 1 outgoing edge per node, which can be used
to optimize finding all ancestors. Important: You
cannot downgrade graphANNIS to an older version if you imported a disk-based
corpus with the new version, since old graphANNIS versions won't be able to
load the new graph storage implementation.
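
To illustrate why storing whole paths helps, here is a minimal in-memory sketch in Python. It is not the actual graphANNIS implementation, which is disk-based; it only assumes what the entry above states, namely at most one outgoing edge per node and paths of at most 15 edges. The names `MAX_DEPTH`, `ancestors_by_adjacency`, and `build_paths`, as well as the example edges, are hypothetical.

```python
MAX_DEPTH = 15  # assumed upper bound on path length, as in the "D15" suffix


def ancestors_by_adjacency(out_edge, node):
    """Follow one edge at a time; on disk, each step would be a separate lookup."""
    result = []
    current = out_edge.get(node)
    while current is not None and len(result) < MAX_DEPTH:
        result.append(current)
        current = out_edge.get(current)
    return result


def build_paths(out_edge):
    """Precompute the full outgoing path for every node that has an outgoing edge."""
    return {node: ancestors_by_adjacency(out_edge, node) for node in out_edge}


# Hypothetical PartOf edges: token -> document -> sub-corpus -> top-level corpus
part_of = {"tok1": "doc1", "tok2": "doc1", "doc1": "sub1", "sub1": "root"}
paths = build_paths(part_of)
```

With the precomputed `paths`, all ancestors of `tok1` come from a single lookup (`paths["tok1"]`) instead of one lookup per edge, which is the trade-off the path-based storage is described as making.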
Add new global statistics that describe the combined graph. Until now, there
were only statistics for each graph component and for the node annotation
storage.
Improve handling of tok queries for corpora with tens of millions of tokens, by
using the newly added graph storage implementation and statistics and
providing an optimized implementation for token search if we already know that
all tokens are part of the default ordering component. This fixes #276.
Improve performance for regular expression search when using disk-based
annotation storage and the regex has a prefix. For example, this fixes getting
the text for a document in ANNIS when the corpus is large.
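
The general idea behind a prefix optimization can be sketched as follows. This is an illustration in Python under stated assumptions, not the graphANNIS code: it assumes the annotation values are available in sorted order and that the regex must match the whole value. `literal_prefix` and `search_with_prefix` are hypothetical helpers, and the prefix extraction is deliberately conservative.

```python
import bisect
import re

REGEX_META = set(".^$*+?{}[]()|\\")
QUANTIFIERS = set("*+?{")


def literal_prefix(pattern):
    """Return literal text that every match of `pattern` must start with.

    Gives up (returns "") on alternation, because a prefix would only
    apply to the first alternative.
    """
    if "|" in pattern:
        return ""
    prefix = []
    for ch in pattern:
        if ch in REGEX_META:
            # A quantifier applies to the previous character, so that
            # character is not guaranteed to appear in every match.
            if ch in QUANTIFIERS and prefix:
                prefix.pop()
            break
        prefix.append(ch)
    return "".join(prefix)


def search_with_prefix(sorted_values, pattern):
    """Full-match `pattern`, scanning only the values that share its prefix."""
    regex = re.compile(pattern)
    prefix = literal_prefix(pattern)
    start = bisect.bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break  # sorted order: no later value can share the prefix
        if regex.fullmatch(value):
            matches.append(value)
    return matches
```

When the pattern has no usable prefix, the sketch falls back to scanning every value, which corresponds to the unoptimized case.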
Improve performance for regular expressions that can be replaced by an exact
value search, even when the value is escaped. This can be useful e.g. in the
subgraph extraction queries from ANNIS, where some characters are escaped with \x, a case that was previously not treated as a constant value search.
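
A rough sketch of how such a check might look is given below. It is a Python illustration, not the graphANNIS implementation. The hypothetical helper `as_constant` assumes that `\x` escapes have the two-digit form `\xHH`, and it treats any other backslash followed by a non-alphanumeric character as an escaped literal.

```python
import re

REGEX_META = set(".^$*+?{}[]()|")
HEX_ESCAPE = re.compile(r"[0-9A-Fa-f]{2}")


def as_constant(pattern):
    """Return the exact string `pattern` matches, or None if it is a real regex."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            rest = pattern[i + 1:]
            if rest[:1] == "x" and HEX_ESCAPE.fullmatch(rest[1:3]):
                out.append(chr(int(rest[1:3], 16)))  # \xHH escape
                i += 4
            elif rest[:1] and not rest[0].isalnum():
                out.append(rest[0])  # escaped punctuation such as \. or \\
                i += 2
            else:
                return None  # classes like \d or \w, or a dangling backslash
        elif ch in REGEX_META:
            return None
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

If `as_constant` returns a string, a query engine could run an exact value lookup instead of evaluating the regex against every candidate value.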
Improve performance for getting all tokens of a document (e.g. for a subgraph
query) when the PartOf graph storage implementation does not have the same
cost for the inverse graph storage operations, by allowing a nested loop
join to be used in this particular scenario.
Fixed
Do not add "annis:doc" labels to sub-corpora when importing relANNIS corpora.
This fixes queries that search only for documents, e.g. by annis:doc,
but also returned the sub-corpora as results.
Re-enable adding the C-API shared library as release artifacts to GitHub.