pplacer/NewickError #1

Open
asierFernandezP opened this issue Jun 20, 2024 · 4 comments

Comments

asierFernandezP commented Jun 20, 2024

Hi!

Thanks a lot for developing this useful tool! I am currently trying to run it on a set of ~2,000 viral genomes.
I followed the GitHub instructions and:

  • I added prodigal to the PATH (otherwise I get an error)
  • I had to install the ete3 module to avoid a 'No module found' error

However, I am now getting the following error:

Strategy:
FFT-NS-i (Standard)
Iterative refinement method (max. 2 iterations)

If unsure which option to use, try 'mafft --auto input > output'.
For more information, see 'mafft --help', 'mafft --man' and the mafft page.

The default gap scoring scheme has been changed in version 7.110 (2013 Oct).
It tends to insert more gaps into gap-rich regions than previous versions.
To disable this change, add the --leavegappyregion option.

rm: cannot remove '../query_viral_genomes_protein.faa.tmp': No such file or directory
Thu Jun 20 22:54:55 CEST 2024 Replace genomes in reference trees
Running pplacer v1.1.alpha19-0-g807f6f3 analysis on Ackermannviridae_ReferenceQuery_aln.fasta...
Found reference sequences in given alignment file. Using those for reference alignment.
Pre-masking sequences... sequence length cut from 48888 to 0.
Sequence length cut to 0 by pre-masking; can't proceed with no information.
guppy: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed.
/scratch/hb-llnext/conda_envs/vClassifier/vClassifier_family: line 314: 2631870 Aborted (core dumped) guppy tog -o "$line"_ReferenceQuery.jplace.treefile "$line"_ReferenceQuery.jplace
Traceback (most recent call last):
  File "/scratch/hb-llnext/conda_envs/vClassifier/scripts/identify_monophyletic_groups.py", line 12, in <module>
    tree = Tree(args.tree_file)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home2/p304845/.local/lib/python3.11/site-packages/ete3/coretype/tree.py", line 212, in __init__
    read_newick(newick, root_node = self, format=format,
  File "/home2/p304845/.local/lib/python3.11/site-packages/ete3/parser/newick.py", line 264, in read_newick
    raise NewickError('Unexisting tree file or Malformed newick tree structure.')
ete3.parser.newick.NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.
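
Reading the log above, the NewickError looks like a downstream symptom: pplacer stops after pre-masking cuts the alignment to length 0, guppy aborts, and so the `.jplace.treefile` that `identify_monophyletic_groups.py` expects may never be written. Below is a minimal sketch of a guard that could run before the tree is parsed, assuming the tree path is passed as a plain string. The helper name `describe_tree_file` is hypothetical and not part of vClassifier or ete3.

```python
import os


def describe_tree_file(path):
    """Return None if `path` looks like a usable Newick file, else a short reason.

    Hypothetical helper, not part of vClassifier or ete3. It only checks that
    the file exists, is non-empty, and ends with ';' as Newick trees do.
    """
    if not os.path.isfile(path):
        return "tree file does not exist (an upstream step probably failed)"
    if os.path.getsize(path) == 0:
        return "tree file is empty"
    with open(path, "r") as handle:
        text = handle.read().strip()
    if not text.endswith(";"):
        return "tree file does not end with ';' and may be truncated"
    return None
```

A caller could print the returned reason and skip that family instead of letting `Tree(args.tree_file)` raise.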

asierFernandezP changed the title from "NewickError" to "pplacer/NewickError" on Jun 21, 2024
KunUW (Collaborator) commented Oct 25, 2024

Hi,

Apologies for the delay in our response. There seems to have been an issue with MAFFT and some of the other scripts. We have updated the vClassifier scripts, and it is now performing well. Could you please try the updated version? If you encounter any further issues, please do not hesitate to let us know. Thanks.

asierFernandezP (Author) commented Oct 25, 2024

Thanks for your answer!

I reinstalled the environment and ran it again, but I still encounter a similar problem:

====================================================================================================
vie oct 25 07:55:21 PDT 2024	Step 1: Gene calling and VOG annotation
vie oct 25 08:18:22 PDT 2024	Step 2: Identification of single-copy markers
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
readline() on closed filehandle IN2 at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/rawSeqID2queries.pl line 22.
vie oct 25 08:18:26 PDT 2024	Step 3: Genome replacement in reference trees
mv: cannot stat '*_ReferenceQuery_aln.fasta': No such file or directory
vie oct 25 08:18:26 PDT 2024	Preprocessing before classification for viruses in *
pplacer: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed.
/clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/vClassifier_family.sh: line 132: 2138183 Aborted                 (core dumped) pplacer --verbosity 0 -c $installer_dir/database/packages_for_pplacer/"$line".refpkg "$line"_ReferenceQuery_aln.fasta -o "$line"_ReferenceQuery.jplace
guppy: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed.
/clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/vClassifier_family.sh: line 132: 2138186 Aborted                 (core dumped) guppy tog -o "$line"_ReferenceQuery.jplace.treefile "$line"_ReferenceQuery.jplace
Traceback (most recent call last):
  File "/clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/identify_monophyletic_groups.py", line 12, in <module>
    tree = Tree(args.tree_file)
  File "/clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/lib/python3.6/site-packages/ete3/coretype/tree.py", line 211, in __init__
    quoted_names=quoted_node_names)
  File "/clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/lib/python3.6/site-packages/ete3/parser/newick.py", line 249, in read_newick
    raise NewickError('Unexisting tree file or Malformed newick tree structure.')
ete3.parser.newick.NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.
vie oct 25 08:23:09 PDT 2024	Step 4: Classification at subfamily rank
vie oct 25 08:23:09 PDT 2024	Step 5: Classification at genus rank
vie oct 25 08:23:09 PDT 2024	Step 6: Classification at species rank
cat: '*_fastani_output_species_classification2.besthit': No such file or directory
cat: '*monophyletic_groups_with_seqID_for_subfamily_assignment_output': No such file or directory
cat: '*monophyletic_groups_with_seqID_for_genus_assignment_output': No such file or directory
Use of uninitialized value $col2[0] in hash element at /clusterfs/jgi/scratch/science/metagen/afernandezpato/Tools/vClassifier/vClassifier/scripts/queries2rawSeqID.pl line 24, <IN2> line 1.
vie oct 25 08:25:32 PDT 2024	Step 7: Final lineage assignment
vie oct 25 08:25:32 PDT 2024	Assignment finished
====================================================================================================
====================================================================================================
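
Both pplacer and guppy abort in the log above with a `loadlocale.c` assertion, and the timestamps (`vie oct 25`) suggest the shell is using a Spanish locale. A workaround often suggested for this assertion is to run the binaries under the C locale. Whether it resolves this particular run is an assumption, not something confirmed in this thread. Below is a minimal sketch with hypothetical helper names `c_locale_env` and `run_in_c_locale`.

```python
import os
import subprocess


def c_locale_env(base=None):
    """Return a copy of the environment with the locale forced to 'C'."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"  # LC_ALL overrides LANG and the other LC_* variables
    return env


def run_in_c_locale(cmd):
    """Run `cmd` (a list of arguments) under the C locale; return its exit code."""
    return subprocess.run(cmd, env=c_locale_env()).returncode
```

The pplacer and guppy command lines from `vClassifier_family.sh` could be passed to `run_in_c_locale` as argument lists. Exporting `LC_ALL=C` in the shell before launching the pipeline should have the same effect.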

KunUW (Collaborator) commented Oct 26, 2024

Have you successfully tested the example sequences? If so, it is likely that your own query genomes cannot be classified by vClassifier. This may be because your query genomes fall outside the 36 families and 55 subfamilies identified in our paper. Alternatively, your genomes may belong to these families or subfamilies, but no single-copy genes were detected in them. Nevertheless, we anticipate releasing a more robust version of vClassifier in the future, which will cover a wider range of families and subfamilies.
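
One way to tell these cases apart from a crash is to check whether Step 2 produced any per-family alignments at all: in the log above, `mv` finds no `*_ReferenceQuery_aln.fasta` file, so the later steps run on the literal pattern `*`. Below is a minimal sketch, assuming the alignments are written to a single working directory. The helper name `families_with_alignments` is hypothetical.

```python
import glob
import os

SUFFIX = "_ReferenceQuery_aln.fasta"  # file name pattern seen in the log


def families_with_alignments(workdir):
    """Return the sorted family prefixes that have a non-empty alignment file."""
    families = []
    for path in glob.glob(os.path.join(workdir, "*" + SUFFIX)):
        if os.path.getsize(path) > 0:
            families.append(os.path.basename(path)[: -len(SUFFIX)])
    return sorted(families)
```

An empty list would suggest that no query genome yielded a single-copy marker for any covered family, so pplacer has nothing to place.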

asierFernandezP (Author) commented Oct 28, 2024

Thanks for the answer! These are mostly phages belonging to the class Caudoviricetes (identified from human gut samples). In any case, I will wait for the final version :)
