Is your feature request related to a problem? Please describe.
I would like to clean up annotations that are only used as building blocks for defining more specific annotations. Complex patterns with variable, repeating sections may contain a varying number of annotation types, so specifying each name/result/match index is not always possible. Once the whole section of text has been annotated, removing all or some of these sub-annotations would be good, to avoid inadvertent matching later.
Describe the solution you'd like
A RemoveAll() function where any number of annotation types or names can be specified.
This seems to work and may be of interest to others:
from typing import List, Optional, Union


class RemoveAnnAll:
    """
    Action for removing annotations.
    """

    def __init__(self,
                 name: Optional[Union[str, List[str]]] = None,
                 type: Optional[Union[str, List[str]]] = None,
                 annset_name: Optional[str] = None,
                 silent_fail: bool = True):
        """
        Create a remove all annotation action.

        Args:
            name: the name, or list of names, of the match(es) from which to get the annotation to remove
            type: the annotation type, or list of types, of annotations within the whole matched pattern to remove
            annset_name: the name of the annotation set to remove the annotation from. If this is the same set
                as used for matching it may influence the matching result if the annotation is removed before
                the remaining matching is done.
                If this is not specified, the annotation set of the (first) input annotation is used.
            silent_fail: if True, silently ignore the error of no annotation to get removed
        """
        assert any([name, type]), \
            f"either name and/or type should be provided [name: {name}, type: {type}]"
        if name is not None:
            # Accept a single string or a list of strings; always store a list.
            self.name = name if isinstance(name, list) else [name]
            assert all(isinstance(c, str) for c in self.name), \
                f"name must be a string or list of strings but is {name}"
        else:
            self.name = None
        if type is not None:
            # Same normalization as for name.
            self.type = type if isinstance(type, list) else [type]
            assert all(isinstance(c, str) for c in self.type), \
                f"type must be a string or list of strings but is {type}"
        else:
            self.type = None
        assert annset_name is None or isinstance(annset_name, str), \
            f"annset_name must be a string or None but is {annset_name}"
        self.annset_name = annset_name
        self.silent_fail = silent_fail

    def __call__(self, succ, context=None, location=None, annset=None):
        anns_to_remove = []
        for r in succ._results:
            # Collect every annotation in the result whose type is listed.
            if self.type is not None:
                for ann in r.anns4matches():
                    if ann.type in self.type:
                        anns_to_remove.append(ann)
            # Collect the annotation of every match with a listed name.
            if self.name is not None:
                for name in self.name:
                    for match in r.matches4name(name):
                        ann = match.get("ann")
                        anns_to_remove.append(ann)
        if not anns_to_remove:
            if self.silent_fail:
                return
            else:
                raise Exception(
                    f"Could not find annotations of type: {self.type} and / or of name: {self.name}"
                )
        if self.annset_name is not None:
            annset = context.doc.annset(self.annset_name)
        # Matches without an annotation contribute None; skip those.
        for ann in anns_to_remove:
            if ann is not None:
                annset.remove(ann)
I guess this could be useful more generally.
Just a few notes:
I would probably make anns_to_remove a set, in case the same annotation gets matched in more than one pattern.
I am not sure what the best way to combine match names and annotation types would be: my intuition would have been to restrict to the given selection of names and types if names or types are specified,
so an annotation gets removed if it is matched by a specified name match (or no name has been specified) AND it is of a type in the list (or no list of types is given).
So an ann from a match not in the list, or of a type not in the list, would not get removed.
If no name is specified, no restriction is placed on names, and anns from all matches get removed (if they match any given types).
If no types are specified, annotations of all types get removed (if they come from a match with a listed name).
If neither names nor types are specified, ALL annotations get removed.
Would such changed semantics also be useful to you if they got added as a predefined action?
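To make the proposed rule concrete, here is a minimal sketch of just the selection step, written independently of the library. `Ann` and `select_anns` are hypothetical stand-ins invented for this illustration (they are not existing classes or functions), and matches are assumed to be available as plain (match name, annotation) pairs. Collecting into a set also covers the de-duplication point from the first note.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Ann:
    """Hypothetical stand-in for an annotation: only an id and a type."""
    id: int
    type: str


def select_anns(
    matches: Iterable[Tuple[Optional[str], Optional[Ann]]],
    names: Optional[List[str]] = None,
    types: Optional[List[str]] = None,
) -> Set[Ann]:
    """Return the annotations to remove under the AND semantics.

    An annotation is selected if its match name is listed (or no names
    are given) AND its type is listed (or no types are given).
    """
    selected: Set[Ann] = set()  # a set, so duplicates collapse
    for match_name, ann in matches:
        if ann is None:
            continue  # a match without an annotation has nothing to remove
        if names is not None and match_name not in names:
            continue  # restricted by name, and this match is not listed
        if types is not None and ann.type not in types:
            continue  # restricted by type, and this type is not listed
        selected.add(ann)
    return selected


# Example data: annotation 1 appears in two differently named matches.
matches = [
    ("date", Ann(1, "Token")),
    ("date", Ann(2, "Number")),
    (None, Ann(3, "Token")),
    ("unit", Ann(1, "Token")),
]
both = select_anns(matches, names=["date"], types=["Token"])
only_types = select_anns(matches, types=["Token"])
only_names = select_anns(matches, names=["date"])
everything = select_anns(matches)
```

With this data, `both` contains only annotation 1, `only_types` contains annotations 1 and 3, `only_names` contains annotations 1 and 2, and `everything` contains all three annotations once each.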
I think we should probably also provide more documentation for users on how to implement their own actions.
The predefined actions are only meant to serve the most common situations anyway, as basically any Python code can be run on the result matches.