Exporters
One key element of any good information extraction pipeline is the ability to export the found information. In Eidos, there are several ways to export mentions, depending on your needs. Some of these are standalone apps, while others are implemented as Exporters and used within an app by being specified in apps.conf. Here are some of the more useful apps/formats.
By far my two favorite apps for general-purpose needs are ExtractAndExport and ReconstituteAndExport. The former is used when you want to go from text to mentions, the latter when you already have mentions (as jsonld) and need to re-process and re-export them (in the same or a different format).
For each of these, you can specify one or more export formats, and output in each of them will be produced. See the format descriptions below.
Other apps are plentiful and useful. Keith has done a fantastic job adding READMEs that briefly explain the purpose of each. Peruse and enjoy!
There are several ways you can export mention information, and each has its own emphasis. These can be selected for usage in the apps mentioned above through the apps.conf, by including them in apps.exportAs = [...].
- jsonld: the go-to export format, which is a proper serialization. This verbose output contains all the information about the mentions and the document from whence they came.
- serialized: produces a binary file with the odin mentions serialized with java serialization. This (a) doesn't include the EidosMentions, which carry a lot of the metadata and (b) is prone to versioning issues as the code changes frequently, so for a serialization that is more reliable you should use jsonld.
- grounding: produces a csv file that can be used for evaluating system groundings. Note that this is likely only currently compatible with the flat groundings.
- ground: extends the jsonld exporter, but prior to export grounds the mentions to the desired/specified ontology(ies). It optionally creates a debug log for information about the groundings that were produced.
- debugGrounding: produces a text log of the groundings produced by the SRLCompositionalGrounder. Less information provided than the groundingInsight exporter.
- groundingInsight: produces a verbose text log of the groundings produced by the SRLCompositionalGrounder, including the semantic roles for the sentence, etc. Used for in-depth analysis of what groundings are produced and why.
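
As a minimal sketch of how the formats above are selected, an apps.conf fragment might look like the following. The apps.exportAs key and the format names are taken from this page; the particular combination is only an illustration, and writing the names as quoted strings is an assumption about the config syntax, so check it against the apps.conf shipped with your checkout.

```hocon
# Illustrative apps.conf fragment (quoting and layout are assumptions).
# Requests two of the export formats described in the list above.
apps.exportAs = ["jsonld", "grounding"]
```

With a setting like this, an app such as ExtractAndExport or ReconstituteAndExport would be expected to write output in each listed format.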