Error in running average_nucleotide_identity.py :( #126
Comments
Hi @Heraud04, The error message you're seeing is produced by the Biopython FASTA parser. It's complaining that there's an invalid byte (or character) in one of your input genome files. Without seeing your data files, I can't tell exactly what the problem is. Would you be able to share a minimal (i.e. small) dataset that still throws this error, along with your command-line, so that I can investigate? L. |
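For readers following along, the failure can be reproduced without pyani or Biopython. The sketch below is only an illustration, not the reporter's data: it writes a hypothetical `bad.fasta` whose first bytes are `0xff 0xfe` (the UTF-16 little-endian byte-order mark, one plausible but unconfirmed source of a leading `0xff`) and then reads it in text mode as UTF-8, which is roughly what happens when the parser calls `handle.readline()`. The helper name `first_decode_error` is invented for this example.

```python
import tempfile
from pathlib import Path


def first_decode_error(path, encoding="utf-8"):
    """Return the UnicodeDecodeError hit while reading path as text, or None."""
    try:
        with open(path, encoding=encoding) as handle:
            for _ in handle:  # iterate line by line, as a text parser would
                pass
    except UnicodeDecodeError as err:
        return err
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical input: a FASTA record saved as UTF-16 with a byte-order mark.
    bad = Path(tmp) / "bad.fasta"
    bad.write_bytes(b"\xff\xfe" + ">seq1\nACGT\n".encode("utf-16-le"))
    err = first_decode_error(bad)

message = str(err)
print(message)
```

If the real input turns out to be UTF-16, re-saving it as plain ASCII or UTF-8 text should let the parser read it; that diagnosis is an assumption until the offending file is inspected.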
Hi @Heraud04 Did you find the problem in your input files, and can I close this issue? Many thanks, L. |
I have this problem too. When I tested it on a small genome set, it ran normally. But when I ran it on a large set (about 100 genomes), errors occurred. I don't know which genome caused this error. Do you need any other information? @widdowquinn |
ok, thanks |
@AlisaGU - if you do find the problematic file, I'd like to know what the "difficult" characters are. Would you be able to send me the file (or enough of the file to reproduce the issue without giving away data you don't want to share…) when you find it? |
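One way to track down the problematic file in a large input directory is to check every file for bytes that are not valid UTF-8 before running the analysis. The following is a minimal sketch, not part of pyani: the function name `find_undecodable` and the list of filename patterns are assumptions made for this example, and the demo files are synthetic.

```python
import tempfile
from pathlib import Path


def find_undecodable(directory, patterns=("*.fasta", "*.fas", "*.fa", "*.fna")):
    """Yield (path, offset, byte) for each matching file that is not valid UTF-8.

    The default patterns are a guess at common FASTA extensions.
    """
    for pattern in patterns:
        for path in sorted(Path(directory).glob(pattern)):
            data = path.read_bytes()
            try:
                data.decode("utf-8")
            except UnicodeDecodeError as err:
                # err.start is the offset of the first byte that failed to decode
                yield path, err.start, data[err.start]


# Demo on synthetic files: one clean record, one with a stray 0xff byte.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "good.fasta").write_bytes(b">seq1\nACGT\n")
    (Path(tmp) / "bad.fasta").write_bytes(b">seq2\nAC\xffGT\n")
    hits = [(p.name, off, byte) for p, off, byte in find_undecodable(tmp)]

for name, offset, byte in hits:
    print(f"{name}: byte 0x{byte:02x} at offset {offset}")
```

Pointing `find_undecodable` at the real input directory (for instance the `allgenomes/` folder from the original command) would list only the files that need attention, along with the "difficult" byte and where it sits.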
@widdowquinn It's my pleasure. I will send it to you. |
@widdowquinn It's weird. The program seems to run normally after I split the genome set into two subsets. |
Is there a solution to that? I'm running it with more than 100 genomes. |
You can try running the Linux tool file:
$ file OP073605.fasta
OP073605.fasta: ASCII text
It will accept multiple files at once, e.g. with wild cards. |
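Where the `file` tool is not available, a rough Python analogue can classify a file's bytes by looking for a byte-order mark and then trying the ASCII and UTF-8 codecs. This is a simplified sketch and not how `file` works internally; the function name `describe_bytes` and its labels are made up for this example.

```python
import codecs

# Byte-order marks; UTF-32 comes first because its little-endian mark
# begins with the same two bytes as the UTF-16 little-endian mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 little-endian"),
    (codecs.BOM_UTF32_BE, "UTF-32 big-endian"),
    (codecs.BOM_UTF8, "UTF-8 with BOM"),
    (codecs.BOM_UTF16_LE, "UTF-16 little-endian"),
    (codecs.BOM_UTF16_BE, "UTF-16 big-endian"),
]


def describe_bytes(data):
    """Return a coarse description of the text encoding of a bytes object."""
    for bom, label in BOMS:
        if data.startswith(bom):
            return label
    try:
        data.decode("ascii")
        return "ASCII"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        return "unknown (not valid UTF-8)"
```

Calling `describe_bytes(open(name, "rb").read())` on each input file gives a quick overview; anything other than "ASCII" is worth a closer look before it is handed to a FASTA parser.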
@luigallucci Oh. You are saying the Let's try opening the files in Python directly then in the default text mode - does this work?:
Hopefully it will also give the |
@peterjc your command gives me:
but for all the files in my folder. ..while I tried that doesn't give me any highlighted line. |
@peterjc Got where the problem is. In my classes file. But I don't know where and how I can correct it. Is the one generated by |
@luigallucci if you think the problematic non-ASCII characters are coming from your classes file, then I strongly doubt you are really getting the same error as the original issue reported as issue #126, which coming from Perhaps you should open a separate issue with a clear bug report (including versions of Python, pyANI, the operating system, the exact command used, and ideally sample data). |
@luigallucci I'm assuming you solved your unicode issue with classes.txt, but have another issue now logged as #442? |
Hi,
I installed pyani using conda: conda install pyani
When I run the following command: average_nucleotide_identity.py -i allgenomes/ -o anita -m ANIm -g
The following error appears:
Traceback (most recent call last):
  File "/home/dennis/miniconda3/bin/average_nucleotide_identity.py", line 793, in <module>
    org_lengths = pyani_files.get_sequence_lengths(infiles)
  File "/home/dennis/miniconda3/lib/python3.6/site-packages/pyani/pyani_files.py", line 53, in get_sequence_lengths
    sum([len(s) for s in SeqIO.parse(fn, 'fasta')])
  File "/home/dennis/miniconda3/lib/python3.6/site-packages/pyani/pyani_files.py", line 53, in <listcomp>
    sum([len(s) for s in SeqIO.parse(fn, 'fasta')])
  File "/home/dennis/.local/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 637, in parse
    for r in i:
  File "/home/dennis/.local/lib/python3.6/site-packages/Bio/SeqIO/FastaIO.py", line 184, in FastaIterator
    for title, sequence in SimpleFastaParser(handle):
  File "/home/dennis/.local/lib/python3.6/site-packages/Bio/SeqIO/FastaIO.py", line 64, in SimpleFastaParser
    line = handle.readline()
  File "/home/dennis/miniconda3/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
========================================
pyani Version: 0.2.7
Python Version: 3.6.6
Operating System: Ubuntu 18.04 LTS
Thanks for your reply in advance!