This repository contains releases of models for the amrlib library. For information on how to download and install them, see the ReadTheDocs Installation Instructions or follow the directions below.
Note that because these models are large binary files, they are not tracked directly with git. Instead, they are provided for download as .tar.gz files. Download the pre-trained models from the releases directory and extract them in `amrlib/data`. Set a link to the extracted directory (or rename it) as either `model_stog` (sentence to graph) for the parse style models or `model_gtos` (graph to sentence) for the generate style models.
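As a concrete sketch of the extract-and-link step, the following might work; the archive name and data directory below are only examples, so substitute the release file you actually downloaded and the install location reported by `pip3 show amrlib`:

```python
# A hedged sketch of extracting a downloaded model archive and linking it as
# the default parse model.  The archive name and data_dir are examples only.
import os
import tarfile

data_dir = '/path/to/amrlib/data'                       # example path
archive  = 'model_parse_xfm_bart_large-v0_1_0.tar.gz'   # example release file

# Extract the model under amrlib/data
with tarfile.open(archive) as tf:
    tf.extractall(data_dir)

# Link (or rename) the extracted directory so amrlib finds it by default:
# model_stog for parse models, model_gtos for generate models.
extracted = archive[:-len('.tar.gz')]
os.symlink(os.path.join(data_dir, extracted),
           os.path.join(data_dir, 'model_stog'))
```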
If you're unsure where amrlib is installed, you can run

```
pip3 show amrlib
```

or

```
python3
>>> import amrlib
>>> amrlib.__file__
```
All models are trained and scored on AMR-3 (LDC2020T02) using `num_beams=4` for `parse_xfm_x` and `num_beams=5` for `parse_spring`.
Note that AMR-3 is a more difficult test set than the older AMR-2 set and generally scores a bit lower for similar models. All scores are computed without adding the :wiki tags. However, when using BLINK, scores typically stay approximately the same, since the wikification process itself scores in the low to mid 80s on smatch.
Speed is the inference speed on the AMR-3 test set (1898 graphs) using an RTX 3090 with `num_beams=1` and `batch_size=32`. The units are sentences/second.
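For reference, a minimal sketch of loading a parse model with these inference settings follows. It assumes a model has been extracted and linked as `model_stog` as described above; amrlib's loaders pass extra keyword arguments through to the model's inference class, though the exact options supported can vary by model:

```python
# A minimal sketch of parsing with explicit inference settings.  Keyword args
# such as num_beams and batch_size are forwarded to the inference class;
# support for specific options may vary by model type.
import amrlib

stog = amrlib.load_stog_model(num_beams=1, batch_size=32)
graphs = stog.parse_sents(['The boy wants the girl to believe him.'])
print(graphs[0])
```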
- The parse_spring code is from the SPRING model. Note that the authors' license for their code is "Attribution-NonCommercial-ShareAlike 4.0 International". Details on the model can be found in this paper.
- The parse_gsii model comes from jcyk/AMR-gs, the details of which can be found in this paper.
- All other models were developed as part of amrlib.
Name | Version | Date | Size | Score
---|---|---|---|---
generate_t5wtense | 0.1.0 | 2020-12-30 | 787MB | 54/44 BLEU
The generate_t5wtense model scores 54 BLEU with tense tags, or 44 BLEU with un-tagged LDC2020T02. Note that the model is only scored on graphs that fit within the T5 model's 512 token limit. If clipped graphs are included, scores drop to roughly 52/43 BLEU. Details on using this type of model for generation can be found in this paper.
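As a usage sketch, generation with this model might look like the following, assuming it has been extracted and linked as `model_gtos`; the second return value flags graphs that were clipped to the 512-token limit, which relates directly to the scoring note above:

```python
# A hedged sketch of graph-to-sentence generation with generate_t5wtense,
# assuming the model is linked as model_gtos.  The second return value flags
# graphs that exceeded the T5 encoder's 512-token limit and were clipped.
import amrlib

graph = '''(w / want-01
      :ARG0 (b / boy)
      :ARG1 (b2 / believe-01
            :ARG0 (g / girl)
            :ARG1 b))'''

gtos = amrlib.load_gtos_model()
sents, clips = gtos.generate([graph])
print(sents[0])
```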
Additionally, there is a training config file for a T5-large based model in the amrlib/config directory here. This model scores about 2 BLEU points higher than generate_t5wtense-v0_1_0 if you take the time to train it yourself. The training configuration has not been fully optimized, so you may be able to improve on this with different hyperparameters or possibly a different version of the base pretrained T5 model.
To report an issue with a model, please use the amrlib issues list.