Merge branch 'main' into prompt_relation_component
KennethEnevoldsen authored Oct 10, 2023
2 parents a0130b0 + 8c95055 commit 892b820
Showing 6 changed files with 48 additions and 21 deletions.
10 changes: 5 additions & 5 deletions .pre-commit-config.yaml
@@ -7,23 +7,23 @@ repos:
- id: ssort

- repo: https://github.com/asottile/add-trailing-comma
- rev: v2.4.0
+ rev: v3.0.1
hooks:
- id: add-trailing-comma

- repo: https://github.com/PyCQA/docformatter
- rev: v1.5.1
+ rev: v1.7.5
hooks:
- id: docformatter
args: [--in-place]

- repo: https://github.com/psf/black
- rev: 23.1.0
+ rev: 23.7.0
hooks:
- id: black
args: [--line-length, "88"]

- - repo: https://github.com/charliermarsh/ruff-pre-commit
- rev: v0.0.254
+ - repo: https://github.com/astral-sh/ruff-pre-commit
+ rev: v0.0.284
hooks:
- id: ruff
20 changes: 20 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,26 @@

<!--next-version-placeholder-->

## v0.5.0 (2023-03-31)
### Feature
* Added gold data to package data ([`b892aaa`](https://github.com/centre-for-humanities-computing/conspiracies/commit/b892aaaa83a73e2074485ec5cd40df08d95b9c65))
* Fixed spacy dep. to allow for the visualizer ([`375b3bf`](https://github.com/centre-for-humanities-computing/conspiracies/commit/375b3bf96fa4dd1704fe2fb079541b0cbbce05f3))
* Added triplet data classes along with serialization functions ([`0325c63`](https://github.com/centre-for-humanities-computing/conspiracies/commit/0325c63c1c6c8dfb36c7ba21edda87ab575f6a04))

### Fix
* Fixed circular import ([`bfc5595`](https://github.com/centre-for-humanities-computing/conspiracies/commit/bfc5595ead67bdb7ac3bbb8c3fa585dc552efcec))
* Prevented circular import ([`2af3559`](https://github.com/centre-for-humanities-computing/conspiracies/commit/2af3559f337e38eb8b07b6dc04d8e264df634b72))

### Documentation
* Changed req. for docs ([`dcf2a1c`](https://github.com/centre-for-humanities-computing/conspiracies/commit/dcf2a1c8e46c4687e7e7b47a5bf8b40d393e8644))
* Updated tutorial based on review ([`0f29203`](https://github.com/centre-for-humanities-computing/conspiracies/commit/0f292036efa842f5f729ff31f152671ecdd8522f))
* Updated docstring based on review comments ([`b669555`](https://github.com/centre-for-humanities-computing/conspiracies/commit/b6695557426dd47a4fb10433de586d210478c47d))
* Updated tutorial to reflect changes ([`833ceae`](https://github.com/centre-for-humanities-computing/conspiracies/commit/833ceae49c83cbc44ecedd97a7cdd4262609c1a0))
* Increased execution time for tutorial ([`4fee404`](https://github.com/centre-for-humanities-computing/conspiracies/commit/4fee4042d6a7704b426b6f563aeee51b0b79ae57))
* Updated notebook ([`330fc22`](https://github.com/centre-for-humanities-computing/conspiracies/commit/330fc22efba02f05958cd51993ff404f14ba5036))
* Updated to properly read data ([`3e3e62d`](https://github.com/centre-for-humanities-computing/conspiracies/commit/3e3e62d7ed22b44492661447cb00f7c516893e9d))
* Added reading gold as an example ([`b6d83b6`](https://github.com/centre-for-humanities-computing/conspiracies/commit/b6d83b61119c0f644abdd7659396f552cb7ffded))

## v0.4.0 (2023-03-10)
### Feature
* Add meaningful default task descriptions for templates ([#20](https://github.com/centre-for-humanities-computing/conspiracies/issues/20)) ([`17436ee`](https://github.com/centre-for-humanities-computing/conspiracies/commit/17436eefd5a0bb83f223cb8f67c6db44105ffdda))
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "conspiracies"
- version = "0.4.0"
+ version = "0.5.0"
authors = [
{name = '"Kenneth Enevoldsen'},
{name = "Lasse Hansen"},
18 changes: 12 additions & 6 deletions src/conspiracies/relationextraction/model.py
@@ -17,8 +17,10 @@ def __init__(self, arg_layer, n_layers):
predicates. It uses ArgExtractorLayer as a base block and repeat the
block N('n_layers') times.
- :param arg_layer: an instance of the ArgExtractorLayer() class (required)
- :param n_layers: the number of sub-layers in the ArgModule (required).
+ :param arg_layer: an instance of the ArgExtractorLayer() class
+ (required)
+ :param n_layers: the number of sub-layers in the ArgModule
+ (required).
"""
super(ArgModule, self).__init__()
self.layers = _get_clones(arg_layer, n_layers)
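The docstring above says `ArgModule` repeats an `ArgExtractorLayer` base block `n_layers` times through `_get_clones`. The body of `_get_clones` is not part of this diff. A minimal sketch of the idea, assuming the helper deep-copies the base layer so that each repetition owns independent state (`ToyLayer`, `get_clones` and `run_stack` are hypothetical stand-ins, not the repository's code):

```python
import copy


class ToyLayer:
    """Hypothetical stand-in for ArgExtractorLayer holding one 'weight'."""

    def __init__(self, weight):
        self.weight = [weight]

    def __call__(self, x):
        return x * self.weight[0]


def get_clones(layer, n_layers):
    # Assumption: every clone is an independent deep copy, so the
    # repeated blocks do not share state with each other.
    return [copy.deepcopy(layer) for _ in range(n_layers)]


def run_stack(layers, x):
    # Feed the output of each block into the next one.
    for layer in layers:
        x = layer(x)
    return x


layers = get_clones(ToyLayer(2), n_layers=3)
layers[0].weight[0] = 5  # changing one clone leaves the others untouched
result = run_stack(layers, 1)  # computes 1 * 5 * 2 * 2
```

If the clones shared state instead, the assignment to `layers[0]` would change all three blocks, which is usually not what a stacked architecture wants.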
@@ -66,11 +68,14 @@ def __init__(
attention. (only encoder-decoder multi-head attention followed by feed-
forward layers)
- :param d_model: model dimensionality (default=768 from BERT-base)
+ :param d_model: model dimensionality (default=768 from BERT-
+ base)
:param n_heads: number of heads in multi-head attention layer
- :param d_feedforward: dimensionality of point-wise feed-forward layer
+ :param d_feedforward: dimensionality of point-wise feed-forward
+ layer
:param dropout: drop rate of all layers
- :param activation: activation function after first feed-forward layer
+ :param activation: activation function after first feed-forward
+ layer
"""
super(ArgExtractorLayer, self).__init__()
self.multihead_attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout)
@@ -89,7 +94,8 @@ def forward(self, target, source, key_mask=None):
:param target: a tensor which takes a role as a query
:param source: a tensor which takes a role as a key & value
- :param key_mask: key mask tensor with the shape of (batch_size, sequence_length)
+ :param key_mask: key mask tensor with the shape of (batch_size,
+ sequence_length)
"""
# Multi-head attention layer (+ add & norm)
attended = self.multihead_attn(
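The `forward` docstring above describes `key_mask` as having shape (batch_size, sequence_length). How the repository builds that mask is not shown in this diff. A minimal pure-Python sketch, assuming that `[PAD]` marks padding and that `True` means "ignore this position" (the convention of `key_padding_mask` in `torch.nn.MultiheadAttention`); `PAD_TOKEN` and `build_key_mask` are hypothetical names:

```python
PAD_TOKEN = "[PAD]"  # assumption: the padding token used by the tokenizer


def build_key_mask(batch_tokens):
    # One row per sequence, one boolean per position:
    # True marks a padded position that attention should skip.
    return [[token == PAD_TOKEN for token in seq] for seq in batch_tokens]


batch = [
    ["[CLS]", "Alice", "likes", "Bob", "[SEP]"],
    ["[CLS]", "Hi", "[SEP]", "[PAD]", "[PAD]"],
]
key_mask = build_key_mask(batch)
shape = (len(key_mask), len(key_mask[0]))  # (batch_size, sequence_length)
```

In a real pipeline the nested list would be converted to a boolean tensor before being passed to the attention layer.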
17 changes: 9 additions & 8 deletions src/conspiracies/relationextraction/other/bio.py
@@ -76,10 +76,11 @@ def filter_arg_tags(arg_tags, pred_tags, tokens):
"""Same as the description of @filter_pred_tags().
:param arg_tags: argument tags with the shape of (B, L).
- :param pred_tags: predicate tags with the same shape.
- It is used to force predicate position to be allocated the 'Outside' tag.
- :param tokens: list of string tokens with the length of L.
- It is used to force special tokens like [CLS] to be allocated the 'Outside' tag.
+ :param pred_tags: predicate tags with the same shape. It is used to
+ force predicate position to be allocated the 'Outside' tag.
+ :param tokens: list of string tokens with the length of L. It is
+ used to force special tokens like [CLS] to be allocated the
+ 'Outside' tag.
:return: tensor of filtered argument tags with the same shape.
"""
# filter by tokens ([CLS], [SEP], [PAD] tokens should be allocated as 'O')
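The docstring above names two filters: predicate positions and special tokens such as `[CLS]` are both forced to the 'Outside' tag. A minimal pure-Python sketch of that rule for a single sequence, assuming integer tag ids where `0` stands for 'Outside' in both tag sets (the real function works on (B, L) tensors, and its actual tag ids are not shown in this diff; `OUTSIDE_TAG` and `filter_arg_tags_sketch` are hypothetical names):

```python
OUTSIDE_TAG = 0  # assumption: id of the 'Outside' tag
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def filter_arg_tags_sketch(arg_tags, pred_tags, tokens):
    filtered = []
    for arg_tag, pred_tag, token in zip(arg_tags, pred_tags, tokens):
        is_special = token in SPECIAL_TOKENS
        is_predicate = pred_tag != OUTSIDE_TAG
        # Special tokens and predicate positions can never be arguments.
        if is_special or is_predicate:
            filtered.append(OUTSIDE_TAG)
        else:
            filtered.append(arg_tag)
    return filtered


tokens = ["[CLS]", "Alice", "likes", "Bob", "[SEP]", "[PAD]"]
pred_tags = [0, 0, 1, 0, 0, 0]  # "likes" is the predicate
arg_tags = [3, 1, 2, 4, 3, 3]   # raw, unfiltered argument tags
filtered = filter_arg_tags_sketch(arg_tags, pred_tags, tokens)
# filtered == [0, 1, 0, 4, 0, 0]
```

Only "Alice" and "Bob" keep their argument tags; every other position falls back to 'Outside'.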
@@ -120,9 +121,9 @@ def get_max_prob_args(arg_tags, arg_probs):
labels.
:param arg_tags: argument tags with the shape of (B, L).
- :param arg_probs: argument softmax probabilities with the shape of (B, L, T),
- where B is the batch size, L is the sequence length, and T is the # of tag
- labels.
+ :param arg_probs: argument softmax probabilities with the shape of
+ (B, L, T), where B is the batch size, L is the sequence length,
+ and T is the # of tag labels.
:return: tensor of filtered argument tags with the same shape.
"""
for cur_arg_tag, cur_probs in zip(arg_tags, arg_probs):
@@ -292,7 +293,7 @@ def _find_begins(idxs):


def get_confidence_score(pred_probs, arg_probs, extraction_idxs):
- """get the confidence score of each extraction for drawing PR-curve.
+ """Get the confidence score of each extraction for drawing PR-curve.
:param pred_probs: (sequence length, # of predicate labels)
:param arg_probs: (# of predicates, sequence length, # of argument labels)
2 changes: 1 addition & 1 deletion src/conspiracies/relationextraction/wrap_model_spacy.py
@@ -18,7 +18,7 @@


class SpacyRelationExtractor(TrainablePipe):
- """spaCy pipeline component that adds a multilingual relation-extraction
+ """SpaCy pipeline component that adds a multilingual relation-extraction
component. The extractions are saved in the doc._.relation_triplets,
._.relation_head,