3.2.0 - 2024-03-13

@github-actions github-actions released this 13 Mar 09:20

Release Notes

Added

  • New disk-based graph storage implementation DiskPathV1_D15 that stores the
    outgoing paths from every node when the maximum branch-out is 1 and the
    longest path has a length of 15. This optimization is especially useful for
    the PartOf component, since it avoids the frequent disk access that an
    adjacency-based implementation would need to get all ancestors. PartOf
    components are not trees, but each node still has at most 1 outgoing edge,
    which can be used to optimize finding all ancestors. Important: You cannot
    downgrade graphANNIS to an older version if you imported a disk-based
    corpus with the new version, since old graphANNIS versions won't be able to
    load the new graph storage implementation.
  • Add new global statistics that describe the combined graph. Until now, there
    were only statistics for each graph component and for the node annotation
    storage.
  • Improved handling of tok queries for corpora with tens of millions of
    tokens, by using the newly added graph storage implementation and statistics
    and providing an optimized implementation for token search if we already
    know that all tokens are part of the default ordering component. This fixes
    #276.
  • Improve performance for regular expression search when using disk-based
    annotation storage and the regex has a prefix. For example, this fixes
    getting the text for a document in ANNIS when the corpus is large.
  • Improve performance for regular expressions that can be replaced by an exact
    value search, even when the value is escaped. This can be useful e.g. in the
    subgraph extraction queries from ANNIS, where some characters are escaped
    with \x and which were previously not treated as a constant value search.
  • Improve performance for getting all tokens of a document (e.g. for a
    subgraph query) when, for the PartOf graph storage implementation, the
    inverse graph storage operations do not have the same cost as the regular
    ones, by allowing the use of a nested loop join in this particular scenario.
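
The idea behind a path-based graph storage can be illustrated with a minimal Python sketch. It is not the graphANNIS implementation or its on-disk layout: the names `build_path_storage`, `ancestors` and `MAX_DEPTH` are hypothetical, and everything is kept in memory. It only assumes what the entry above states, namely that every node has at most 1 outgoing edge and that paths are short, so the complete outgoing path of each node can be precomputed and all ancestors are returned with a single lookup instead of one adjacency lookup per step.

```python
from typing import Dict, List

# Hypothetical limit for this sketch, mirroring the "D15" in the storage name.
MAX_DEPTH = 15


def build_path_storage(parent: Dict[int, int]) -> Dict[int, List[int]]:
    """Precompute the complete outgoing path for every node.

    `parent` maps a node to the target of its single outgoing edge.
    Nodes without an outgoing edge get an empty path.
    """
    nodes = set(parent) | set(parent.values())
    paths: Dict[int, List[int]] = {}
    for node in nodes:
        path: List[int] = []
        current = node
        while current in parent:
            current = parent[current]
            path.append(current)
            if len(path) > MAX_DEPTH:
                # Also catches cycles, which would never terminate otherwise.
                raise ValueError(f"path from node {node} exceeds {MAX_DEPTH} edges")
        paths[node] = path
    return paths


def ancestors(paths: Dict[int, List[int]], node: int) -> List[int]:
    """Return all ancestors of `node` with one lookup, nearest first."""
    return paths.get(node, [])


# Two tokens (1, 2) belong to document 10, which is part of sub-corpus 20,
# which is part of the top-level corpus 30.
part_of = {1: 10, 2: 10, 10: 20, 20: 30}
storage = build_path_storage(part_of)
```

With this layout, `ancestors(storage, 1)` yields the document, the sub-corpus and the top-level corpus in one access, at the price of storing each path redundantly for every node.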
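
The replacement of a regular expression by an exact value search, as described above for escaped values, can be sketched as follows. This is an illustrative Python function, not the code used by graphANNIS: the name `constant_value` and the set `META` are hypothetical, and the sketch only understands two-digit `\xHH` escapes and backslash-escaped metacharacters. It returns the single string a pattern can match, or `None` when the pattern needs a real regex engine.

```python
import string
from typing import List, Optional

# Characters treated as regex syntax when they appear unescaped
# (a deliberately conservative choice for this sketch).
META = set(".^$*+?{}[]()|\\")


def constant_value(pattern: str) -> Optional[str]:
    """Return the only string `pattern` matches, or None if it is not constant."""
    out: List[str] = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\":
            if i + 1 >= len(pattern):
                return None  # dangling backslash
            nxt = pattern[i + 1]
            if nxt == "x":
                # Two-digit hexadecimal escape, e.g. \x2D for "-"
                digits = pattern[i + 2:i + 4]
                if len(digits) != 2 or any(d not in string.hexdigits for d in digits):
                    return None
                out.append(chr(int(digits, 16)))
                i += 4
            elif nxt in META:
                # Escaped metacharacter stands for itself, e.g. \. for "."
                out.append(nxt)
                i += 2
            else:
                # Classes such as \d or \w match more than one string
                return None
        elif c in META:
            return None  # unescaped operator, a real regex search is needed
        else:
            out.append(c)
            i += 1
    return "".join(out)
```

A query planner could call such a function first and fall back to the regex search only when it returns `None`; for instance, `constant_value(r"doc\x2D1\.txt")` gives `doc-1.txt`, while `constant_value("doc.*")` gives `None`.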

Fixed

  • Do not add "annis:doc" labels to sub-corpora when importing relANNIS corpora.
    This fixes queries that search only for documents, e.g. by annis:doc, but
    also returned the sub-corpora as results.
  • Re-enable adding the C-API shared library as release artifacts to GitHub.

graphannis-cli 3.2.0

Download graphannis-cli 3.2.0

File                                               Platform             Checksum
graphannis-cli-aarch64-apple-darwin.tar.xz         Apple Silicon macOS  checksum
graphannis-cli-x86_64-apple-darwin.tar.xz          Intel macOS          checksum
graphannis-cli-x86_64-pc-windows-msvc.zip          x64 Windows          checksum
graphannis-cli-x86_64-unknown-linux-gnu.tar.xz     x64 Linux            checksum

graphannis-webservice 3.2.0

Download graphannis-webservice 3.2.0

File                                                   Platform             Checksum
graphannis-webservice-aarch64-apple-darwin.tar.xz      Apple Silicon macOS  checksum
graphannis-webservice-x86_64-apple-darwin.tar.xz       Intel macOS          checksum
graphannis-webservice-x86_64-pc-windows-msvc.zip       x64 Windows          checksum
graphannis-webservice-x86_64-unknown-linux-gnu.tar.xz  x64 Linux            checksum