Commit

fix: support context_words=0 in span_context_getter
percevalw committed Jul 19, 2024
1 parent 1174c58 commit 54a41cd
Showing 2 changed files with 7 additions and 8 deletions.
2 changes: 1 addition & 1 deletion changelog.md
@@ -25,7 +25,7 @@
 - Support mixed precision in `eds.text_cnn` and `eds.ner_crf` components
 - Support pre-quantization (<4.30) transformers versions
 - Verify that all batches are non empty
-- Fix `span_context_getter` for `context_sents` > 2 and support assymetric contexts
+- Fix `span_context_getter` for `context_words` = 0, `context_sents` > 2 and support assymetric contexts

 ## v0.12.3

13 changes: 6 additions & 7 deletions edsnlp/utils/span_getters.py
@@ -307,11 +307,13 @@ def __call__(self, span: Union[Doc, Span]) -> Union[Span, List[Span]]:
         n_words_left = self.n_words_left
         n_words_right = self.n_words_right

-        start = span.start - n_words_left
-        end = span.end + n_words_right
+        start = max(0, span.start - n_words_left)
+        end = min(len(span.doc), span.end + n_words_right)

         n_sents_max = max(n_sents_left, n_sents_right)
         if n_sents_max > 0:
+            min_start_sent = start
+            max_end_sent = end
             if n_sents_left == 1:
                 sent = span.sent
                 min_start_sent = sent.start
@@ -325,10 +327,7 @@ def __call__(self, span: Union[Doc, Span]) -> Union[Span, List[Span]]:
                 max_end_sent = sents[
                     min(len(sents) - 1, sent_i + n_sents_right - 1)
                 ].end
-            start = max(0, min(start, min_start_sent))
-            end = min(len(span.doc), max(end, max_end_sent))
-        else:
-            start = max(0, start)
-            end = min(len(span.doc), end)
+            start = min(start, min_start_sent)
+            end = max(end, max_end_sent)

         return span.doc[start:end]
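
The bound computation after this change can be approximated in plain Python, without spaCy, to see how the word window and the sentence window combine. This is only a sketch: `context_window`, its tuple-based arguments, and the symmetric treatment of left-hand sentences are illustrative assumptions, not the edsnlp API, and the part of `__call__` that lies between the two hunks is not visible in this diff.

```python
from typing import List, Tuple


def context_window(
    span: Tuple[int, int],
    doc_len: int,
    sents: List[Tuple[int, int]],
    n_words: Tuple[int, int] = (0, 0),
    n_sents: Tuple[int, int] = (0, 0),
) -> Tuple[int, int]:
    """Return (start, end) token offsets of the context around `span`.

    Hypothetical helper: `sents` holds (start, end) token offsets of each
    sentence, standing in for spaCy's sentence spans.
    """
    span_start, span_end = span
    n_words_left, n_words_right = n_words
    n_sents_left, n_sents_right = n_sents

    # Clamp the word window to the document first, as the revised code does.
    start = max(0, span_start - n_words_left)
    end = min(doc_len, span_end + n_words_right)

    if max(n_sents_left, n_sents_right) > 0:
        # Default the sentence bounds to the word window so both names are
        # defined even when one side asks for no sentence context.
        min_start_sent, max_end_sent = start, end
        # Index of the sentence containing the first token of the span.
        sent_i = next(i for i, (s, e) in enumerate(sents) if s <= span_start < e)
        if n_sents_left > 0:
            # Assumption: mirrors the right-hand indexing shown in the diff.
            min_start_sent = sents[max(0, sent_i - (n_sents_left - 1))][0]
        if n_sents_right > 0:
            max_end_sent = sents[min(len(sents) - 1, sent_i + n_sents_right - 1)][1]
        # Keep the wider of the word window and the sentence window.
        start = min(start, min_start_sent)
        end = max(end, max_end_sent)

    return start, end


sents = [(0, 4), (4, 9), (9, 12)]
print(context_window((5, 6), 12, sents))                  # (5, 6)
print(context_window((5, 6), 12, sents, n_sents=(1, 0)))  # (4, 6)
```

With zero words and zero sentences the window is the span itself, and an asymmetric sentence context only widens the side that was requested.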
