[challenge] Aspergillus terpenoids #8
Comments
In general, yaccl is taxon-agnostic. What we do is take a list of compounds and make sure they are classified correctly. In this case we don't have access to the paper, and the paper is also not in Wikidata. But the following query gets all InChI strings of compounds from Aspergillus species:
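The query itself did not survive in this copy of the thread. As a rough sketch of what such a query could look like (not the query that was actually posted), the Python helper below builds a SPARQL string for the Wikidata Query Service. It assumes the usual Wikidata properties: P225 (taxon name), P171 (parent taxon), P703 (found in taxon) and P234 (InChI). The function name `taxon_inchi_query` is made up for illustration.

```python
def taxon_inchi_query(genus_name: str) -> str:
    """Build a SPARQL query for InChI strings of compounds found in a genus.

    Hypothetical helper, not part of yaccl. The property IDs are the
    standard Wikidata ones; the query shape is an assumption.
    """
    # Reject characters that could break out of the SPARQL string literal.
    if any(ch in genus_name for ch in '"\\\n'):
        raise ValueError("unexpected character in taxon name")
    return f"""
SELECT DISTINCT ?compound ?inchi WHERE {{
  ?genus wdt:P225 "{genus_name}" .   # taxon name
  ?taxon wdt:P171* ?genus .          # the genus itself or any descendant taxon
  ?compound wdt:P703 ?taxon ;        # found in taxon
            wdt:P234 ?inchi .        # InChI
}}
""".strip()


query = taxon_inchi_query("Aspergillus")
print(query)
```

The printed text can be pasted into the Wikidata Query Service, where the `wdt:` prefix is predefined.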
Saving the list of 4,349 compounds in a file
I may have misunderstood. Did you mean to extract all terpenoids from the list of 4,349 compounds? That should be doable.
Sorry, my question was badly formulated. I did not really want to reproduce what is in the article, but rather to generate the WD+yaccl equivalent. So... yes... you were faster
Here is a slightly adapted query:
I have pushed a new version of the classify script such that JSON output also includes the molecule. This would allow more comfortable processing of the output of my small bash script given above.
For the sake of speed there should be better handling when both -j and -t are given. Noted.
Beautiful. Thanks!
Alternatively, if you are satisfied with what is already in WD, going without yaccl should work too:
Pushed the addition of InChI key too...
https://w.wiki/4T9n Wooops... forgot to filter:
These are only the subclasses; you need to include P31/P279* as well in order to get all of them, either with UNION or using the pipe symbol.
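To make the two options concrete, here is a small sketch that emits both forms of the triple pattern as SPARQL text. The helper names are invented for this example, and `PLACEHOLDER_QID` is only a stand-in: the actual class item used in the thread's query is not shown here, so it is not filled in.

```python
import re


def _check_qid(qid: str) -> str:
    """Accept only strings shaped like a Wikidata item ID (Q + digits)."""
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return qid


def member_pattern_pipe(class_qid: str, var: str = "?compound") -> str:
    """Subclasses (P279*) or instances (P31/P279*) via a path alternative."""
    qid = _check_qid(class_qid)
    return f"{var} (wdt:P279*|wdt:P31/wdt:P279*) wd:{qid} ."


def member_pattern_union(class_qid: str, var: str = "?compound") -> str:
    """The same two cases written as an explicit UNION of two patterns."""
    qid = _check_qid(class_qid)
    return (
        f"{{ {var} wdt:P279* wd:{qid} . }} UNION "
        f"{{ {var} wdt:P31/wdt:P279* wd:{qid} . }}"
    )


# Placeholder only: replace with the class item you actually want to query.
PLACEHOLDER_QID = "Q999999"

print(member_pattern_pipe(PLACEHOLDER_QID))
print(member_pattern_union(PLACEHOLDER_QID))
```

In SPARQL property paths `/` binds more tightly than `|`, so the pipe form reads as "P279* or (P31 followed by P279*)". Note that `P279*` also matches a path of length zero, so the class item itself is included in the results.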
Oh, indeed, nice catch
I knew because the yaccl run found 360 as well. Now, for the interpretation: the subclasses might contain duplicates where the stereochemistry is unspecified.
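One possible way to look for such duplicates, sketched under assumptions: group the results by the first (connectivity) block of the InChIKey and flag groups that contain both an entry whose second block is `UHFFFAOYSA` (the value a standard InChIKey carries when there is no stereo or isotope information) and at least one other entry. The function name and the sample keys are made up for illustration; the keys are not real compounds. This only catches fully unspecified entries, not partially specified ones.

```python
from collections import defaultdict

# Second InChIKey block of a standard InChI without stereo/isotope layers.
NO_STEREO_BLOCK = "UHFFFAOYSA"


def stereo_duplicate_groups(inchikeys):
    """Return skeletons that occur both with and without stereo information."""
    groups = defaultdict(set)
    for key in inchikeys:
        parts = key.split("-")
        if len(parts) != 3:
            raise ValueError(f"malformed InChIKey: {key!r}")
        # parts[0] is the 14-character connectivity hash.
        groups[parts[0]].add(key)

    result = {}
    for skeleton, keys in groups.items():
        flat = sorted(k for k in keys if k.split("-")[1] == NO_STEREO_BLOCK)
        specific = sorted(k for k in keys if k.split("-")[1] != NO_STEREO_BLOCK)
        if flat and specific:
            result[skeleton] = {"unspecified": flat, "specified": specific}
    return result


# Made-up keys, for illustration only.
sample = [
    "AAAAAAAAAAAAAA-UHFFFAOYSA-N",
    "AAAAAAAAAAAAAA-BBBBBBBBSA-N",
    "CCCCCCCCCCCCCC-UHFFFAOYSA-N",
]
duplicates = stereo_duplicate_groups(sample)
print(duplicates)
```

Whether a flagged pair is a true duplicate still needs a human look, since an entry without stereo information may also legitimately describe a mixture or an undetermined structure.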
This one is interesting: https://www.wikidata.org/wiki/Q77573987
Well... this one is in the end a real challenge! x) I am not sure a lot of humans would do better
So, you see, I nearly always add P31/P279 to WD compounds at the same time I add SMARTS to classes. Exceptions: I still need to add P31/P279 for unspecified alkaloids and macrolides in WD. Having it all in WD simplifies searches such as this one. The downside is that, as a follow-up, the WD entries need to be maintained, e.g. by frequent scanning. This issue also demands improvements in yaccl/WD integration. I'll leave it open until I think it is resolved.
Hi again!
Small question in the form of a challenge:
Would yaccl be able to perform a query that allows reproducing the listed compounds in https://doi.org/10.1016/j.phytochem.2021.113011 (without the need of npclassifier or classyfire)?
As a starting point, these two existing queries might help:
https://w.wiki/4ShY
https://w.wiki/3HMD
Best,