Phage genome annotations for GenBank submission #76

Open
vdruelle opened this issue Oct 31, 2024 · 0 comments

Hi @gbouras13,

Thanks for creating Phold, it's a great tool!
I'm writing to suggest some changes to the output of Phold (or Pharokka?) that would simplify uploading annotated phage genomes to GenBank.

To give you a bit of perspective, we isolated and characterised a new phage that we want to upload to GenBank. We sequenced it with Illumina, assembled it with Unicycler, annotated and rotated it with Pharokka, and then annotated it with Phold. We then passed the .gbk file generated by Phold to the tool https://chlorobox.mpimp-golm.mpg.de/GenBank2Sequin.html to generate the .fasta and .tbl files required for submission via BankIt.

Unfortunately, some of the generated annotations were not in a format accepted by BankIt. The main issues were:

  • the /trna qualifier of the tRNA features was not recognised. Replacing /trna with /product made it work. /product also seems to be the qualifier used in the other published phage genomes I checked.
  • the /anticodon qualifier of the tRNA features is in the wrong format. Right now it looks like /anticodon=TAA, while it is supposed to look something like /anticodon=(pos:678..680,aa:Leu,seq:taa). For now I made it work by simply removing the /anticodon qualifier.
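
To illustrate the manual workaround, here is a minimal sketch in plain Python. It is not how Phold or Pharokka handle this internally; the function name `fix_trna_qualifiers` is hypothetical, and it assumes the qualifiers are spelled exactly as in the feature I pasted at the end of this issue (`/trna=...` and a bare three-letter `/anticodon=...`).

```python
import re

def fix_trna_qualifiers(lines):
    """Rename /trna to /product and drop bare /anticodon=XXX qualifiers.

    Hypothetical helper: it works on the text lines of a .gbk feature and
    leaves every other line unchanged.
    """
    fixed = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("/trna="):
            # /trna="tRNA-Leu(TAA)" -> /product="tRNA-Leu(TAA)"
            line = line.replace("/trna=", "/product=", 1)
        elif re.fullmatch(r"/anticodon=[ACGTUacgtu]{3}", stripped):
            # A bare anticodon is not in the (pos:..,aa:..,seq:..) form, so skip it.
            continue
        fixed.append(line)
    return fixed

feature = [
    "tRNA 140628..140713",
    '/ID="DSQLZVHB_tRNA_0001"',
    "/transl_table=11",
    '/trna="tRNA-Leu(TAA)"',
    '/isotype="Leu"',
    "/anticodon=TAA",
    '/locus_tag="DSQLZVHB_tRNA_0001"',
    '/source="tRNAscan-SE_2.0.12"',
]

for line in fix_trna_qualifiers(feature):
    print(line)
```

This only drops the anticodon rather than rebuilding it, because the position of the anticodon within the genome is not available in the .gbk feature itself.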

My suggestion would be to fix the formatting of the tRNA features so that they are accepted for GenBank submission without manual correction. I assume this would make it simpler to use Phold to annotate phages for GenBank submission. Additionally, Phold could output the feature .tbl file needed for submission, which would avoid relying on a tool like https://chlorobox.mpimp-golm.mpg.de/GenBank2Sequin.html to create the feature .tbl from the .gbk file.
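
For reference, here is a rough sketch of what writing one tRNA entry of such a .tbl file could look like, assuming the usual tab-delimited feature table layout (a `>Feature` header, a start/stop/key line, then qualifier lines indented by three tabs). The function name `trna_to_tbl` and the sequence ID `phage_1` are made up for the example and are not part of Phold.

```python
def trna_to_tbl(seq_id, start, end, product, strand="+"):
    """Format a single tRNA feature as a feature-table (.tbl) entry.

    Hypothetical helper, not an existing Phold or Pharokka function.
    """
    # Minus-strand features are written with the coordinates reversed.
    if strand == "-":
        start, end = end, start
    lines = [
        f">Feature {seq_id}",             # header line naming the sequence
        f"{start}\t{end}\ttRNA",          # start, stop, feature key
        f"\t\t\tproduct\t{product}",      # qualifier line: three leading tabs
    ]
    return "\n".join(lines) + "\n"

print(trna_to_tbl("phage_1", 140628, 140713, "tRNA-Leu(TAA)"), end="")
```

A real implementation would write the `>Feature` header once per sequence and then loop over all features, including the CDS entries.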

I tried looking at how Pharokka does this so that I could propose a pull request with the needed changes, but I know too little about the tool to figure it out...

Let me know in case you need more details and thank you again for your work,
Valentin


  • phold version: 0.2.0
  • pharokka version: 1.7.3
  • Python version: 3.11.9
  • Operating System: Ubuntu 20.04 LTS

Problematic .gbk feature:
tRNA 140628..140713
/ID="DSQLZVHB_tRNA_0001"
/transl_table=11
/trna="tRNA-Leu(TAA)"
/isotype="Leu"
/anticodon=TAA
/locus_tag="DSQLZVHB_tRNA_0001"
/source="tRNAscan-SE_2.0.12"

Corrected to be accepted by GenBank:
tRNA 140628..140713
/ID="DSQLZVHB_tRNA_0001"
/transl_table=11
/product="tRNA-Leu(TAA)"
/isotype="Leu"
/locus_tag="DSQLZVHB_tRNA_0001"
/source="tRNAscan-SE_2.0.12"
