Automatically generate text definitions from logical definitions #2349
This commit uses the FlyBase ROBOT plugin to automatically generate text definitions for terms that do not have any (or that have a "." definition). The generated definition is derived from the term's logical definition.
Here are the definitions that would be generated (given the current state of the `-edit` file) by this PR:
|
The generated definition for CL:0000611 is particularly ugly, but that’s only because the logical definition of that term is itself ugly (and of dubious usefulness in my opinion). |
@aleixpuigb @JABelfiore @AvolaAmg @Caroline-99 - I'd like your comments on the autodefs in Damien's table here: #2349 (comment). Do you think they are an acceptable way to get better definition coverage, perhaps as an interim step before review? |
I would suggest simply defining CL:0000611 eosinophil progenitor cell in an analogous way to CL:0000834 neutrophil progenitor cell: "A progenitor cell of the neutrophil lineage." Thus, "A progenitor cell of the eosinophil lineage." The logical definition captures a lot of marker detail used to identify this cell type uniquely in flow cytometry, excluding so-called lineage markers in order to exclude other leukocyte subsets. I agree it is perhaps a bit ornate, but I would prefer to leave it until someone has a look at all the granulocytes in their immature and mature forms. |
The |
Aside from CL:0000611, it is a nice way to have a temporary definition until a more detailed one is added. Is there an easy way to not include the differentia ID? I don't think it is a big problem, but if the IDs can be removed easily, it looks better in my opinion. |
It requires some changes in the code of the |
This commit updates the FlyBase ROBOT plugin to its latest version so that we can use the newly introduced `--no-ids` option, to prevent the insertion of term IDs within auto-generated definitions.
It’s done. |
Except for the differentia ID, is there a way to flag the fact that these were automatically generated definitions? |
@AvolaAmg We just lack a standardised way of doing so. In FlyBase, we annotate automatically generated definitions with the pseudo-CURIE Ideally we would have a dedicated annotation property (not |
I found this term in Uberon with an automated definition https://www.ebi.ac.uk/ols4/ontologies/uberon/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FUBERON_0005055 |
Not really keen on embedding the “auto-generated definition” bit directly within the definition itself. Could do as a stop-gap measure until we come up with a proper, dedicated annotation (I’ve asked for one here), though… |
IIRC we are also not using any label for auto-generated definitions in DOSDP. |
We also add a So if we already have auto-generated definitions that are currently not flagged as such, I’d be inclined to do nothing on that front for now (that is, not try to flag the definitions derived from logical axioms as being auto-generated). If/when the OBO community agrees on a standard way to flag such auto-generated content, we can then update both our DOSDP patterns and the mechanism used in this PR to add the proper annotation. |
I'd really like them flagged. I don't want to let arguments about semantics get in the way of doing that. |
I like the FBC:Autogenerated for now until a more general solution can be found! |
OK, I’ll add an option to the plugin to allow specifying an arbitrary annotation to add to the newly generated definitions: (I am concerned this will turn out to be one of those “temporary solutions” that will stay around for years without ever being replaced by a proper annotation, but I’ve made my point about wanting a standardised solution, no need to elaborate.) |
Update the preprocessing step to annotate all automatically generated definitions with a cross-reference with the special value "FBC:Autogenerated".
Any other objection or thing that should be changed before we merge here? The updated PR injects definitions that do not include the differentia ID and that are flagged with a Anything else? |
I personally don't really think we need to include any IDs in the generated definitions, but this solution here is a 10-fold improvement over the status quo, so APPROVE, and iterate!
Good, because we do not do that. :) The initial version of the PR did (that’s the default behaviour of the |
I tried to find an updated version of #2349 (comment), I thought that was the latest state! Sorry. No worries. All is good! |
This PR exploits the ODK preprocessing step to automatically inject text definitions for terms that are lacking one, using the `rewrite-def` command of FlyBase’s ROBOT plugin. That command, as used here, finds terms without a text definition (or terms with a definition consisting only of a single dot) and, if they have a logical definition, automatically translates the logical definition into a human-readable definition.

Related to #2342
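For readers unfamiliar with the approach, the selection rule described above (no definition or a single-dot definition, plus an existing logical definition) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plugin's actual code: the `Term` class, the `needs_definition` and `autodefine` helpers, the `EX:` identifiers, and the "Any ... that ..." wording are all hypothetical, and the real command works on OWL axioms through ROBOT. Only the `FBC:Autogenerated` flag value is taken from the discussion above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Term:
    """Hypothetical stand-in for an ontology class (not the plugin's data model)."""
    term_id: str
    label: str
    definition: Optional[str] = None
    # Simplified logical definition: a genus label plus (relation, filler) pairs.
    genus: Optional[str] = None
    differentiae: List[Tuple[str, str]] = field(default_factory=list)
    definition_xrefs: List[str] = field(default_factory=list)


def needs_definition(term: Term) -> bool:
    """True if the term has no text definition, or only a single dot."""
    return term.definition is None or term.definition.strip() in ("", ".")


def autodefine(term: Term, flag: str = "FBC:Autogenerated") -> bool:
    """Fill in a definition derived from the logical definition, if applicable.

    Returns True when a definition was generated, False when the term was
    left untouched. The sentence template is an assumption for illustration.
    """
    if not needs_definition(term) or term.genus is None:
        return False
    text = f"Any {term.genus}"
    if term.differentiae:
        clauses = [f"{relation} some {filler}" for relation, filler in term.differentiae]
        text += " that " + " and ".join(clauses)
    term.definition = text + "."
    # Flag the definition as auto-generated via a cross-reference.
    term.definition_xrefs.append(flag)
    return True
```

In this sketch, a term whose definition is `"."` and whose logical definition has genus `cell` and one differentia `("is part of", "example tissue")` would receive the definition "Any cell that is part of some example tissue." and the `FBC:Autogenerated` cross-reference, while terms that already have a real definition, or that lack a logical definition, are left unchanged.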