Hits annotation #214
Hello, Thank you so much!
Replies: 1 comment
Hi @YiJessePi, thanks for reaching out and asking.

Yes, Bakta's annotation workflow has become quite complex and far from trivial, as there are several steps incorporating different sequence resources, HMM resources, and annotation data. The entire workflow is described here: https://github.com/oschwengers/bakta#coding-sequences

After this workflow, Bakta has a lot of information, which is then used for the final annotation. In this process, the so-called "expert annotation systems" have the highest rank (internally, they comprise different sources, each with a distinct rank). So if there is a hit from an expert system, this annotation data is preferred.

If this is not the case, Bakta uses information from its own internal database. This DB comprises highly integrated and merged information from a vast number of high-quality annotation sources (https://github.com/oschwengers/bakta#database). During the compilation of this DB, the annotation information for each unique protein sequence and sequence cluster is superseded several times, with the most specific annotation information applied last, e.g. transposases from ISfinder and NCBI. Therefore, annotations from Bakta often differ from the information UniProt provides. However, we nevertheless annotate such sequences with

I hope this clarifies it a bit. Just in case, please do not hesitate to keep asking.
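
To make the precedence described in this reply more concrete, here is a minimal Python sketch. It is not Bakta's actual code: the `Hit` class, the function names, the source labels, and the rank values are all hypothetical and only illustrate the two ideas mentioned, namely that expert-system hits win over the internal database, and that more specific sources overwrite more general ones when they are applied last.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Hit:
    source: str   # hypothetical label for where the annotation came from
    rank: int     # hypothetical rank; a higher value means higher priority
    product: str  # annotated product name


def pick_annotation(expert_hits: List[Hit], db_hit: Optional[Hit]) -> Optional[Hit]:
    """Prefer the highest-ranked expert hit; otherwise fall back to the DB hit."""
    if expert_hits:
        return max(expert_hits, key=lambda h: h.rank)
    return db_hit


def compile_db_entry(layers: List[Dict[str, str]]) -> Dict[str, str]:
    """Apply annotation layers ordered from most general to most specific.

    Later layers overwrite earlier ones, so the most specific source,
    applied last, determines the final value.
    """
    entry: Dict[str, str] = {}
    for layer in layers:
        entry.update(layer)
    return entry


# Illustrative data only; none of these values come from Bakta.
expert = [
    Hit("expert_a", 80, "beta-lactamase"),
    Hit("expert_b", 90, "class A beta-lactamase"),
]
internal = Hit("internal_db", 10, "hydrolase")

chosen = pick_annotation(expert, internal)
fallback = pick_annotation([], internal)
merged = compile_db_entry([
    {"product": "hypothetical protein"},
    {"product": "IS3 family transposase"},
])
```

In this sketch `chosen` is the higher-ranked expert hit, `fallback` is the internal database hit, and `merged` keeps the product from the layer applied last.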