I got an error while trying to run isovar (see below):
isovar-protein-sequences.py --vcf batch1-ensemble-annotated-decomposed.vcf --bam ERR420397.fastq.genome.sorted.bam --min-alt-rna-reads 2 --protein-sequence-length 30 --output rep1-results.csv --genome NCBIM37
2018-03-08 21:59:24,270 - isovar.allele_reads:200 - INFO - Gathering reads for Variant(contig='1', start=4336591, ref='', alt='T', reference_name='NCBIM37')
2018-03-08 21:59:24,289 - isovar.allele_reads:208 - INFO - Gathering variant reads for variant Variant(contig='1', start=4336591, ref='', alt='T', reference_name='NCBIM37') (chromosome = 1, gene names = ['Rp1'])
2018-03-08 21:59:24,303 - isovar.locus_reads:314 - INFO - Found 0 reads overlapping locus 1: 4336591-4336592
2018-03-08 21:59:24,303 - isovar.translation:461 - INFO - No supporting reads for variant Variant(contig='1', start=4336591, ref='', alt='T', reference_name='NCBIM37')
2018-03-08 21:59:24,303 - isovar.allele_reads:200 - INFO - Gathering reads for Variant(contig='1', start=5088129, ref='A', alt='C', reference_name='NCBIM37')
2018-03-08 21:59:24,305 - isovar.allele_reads:208 - INFO - Gathering variant reads for variant Variant(contig='1', start=5088129, ref='A', alt='C', reference_name='NCBIM37') (chromosome = 1, gene names = ['Atp6v1h'])
2018-03-08 21:59:24,318 - isovar.locus_reads:314 - INFO - Found 22 reads overlapping locus 1: 5088128-5088130
2018-03-08 21:59:24,318 - isovar.variant_sequences:422 - INFO - Initial pool of 2 variant sequences (min length=50, max length=50)
2018-03-08 21:59:24,318 - isovar.assembly:140 - INFO - Collapsed 2 -> 2 sequences
2018-03-08 21:59:24,318 - isovar.variant_sequences:440 - INFO - After overlap assembly: 1 variant sequences (min length=56, max length=56)
2018-03-08 21:59:24,320 - isovar.variant_sequences:207 - INFO - Coverage: [1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1] (len=56)
2018-03-08 21:59:24,320 - isovar.variant_sequences:330 - INFO - Kept 1/1 variant sequences after read coverage trimming to >=2x
2018-03-08 21:59:24,320 - isovar.variant_sequences:456 - INFO - After coverage & length filtering: 1 variant sequences (min length=44, max length=44)
2018-03-08 21:59:24,636 - pyensembl.sequence_data:118 - INFO - Loaded sequence dictionary from /home/zhuwe/.cache/pyensembl/NCBIM37/ensembl67/Mus_musculus.NCBIM37.67.cdna.all.fa.gz.pickle
2018-03-08 21:59:24,653 - pyensembl.sequence_data:118 - INFO - Loaded sequence dictionary from /home/zhuwe/.cache/pyensembl/NCBIM37/ensembl67/Mus_musculus.NCBIM37.67.ncrna.fa.gz.pickle
2018-03-08 21:59:24,666 - pyensembl.database:436 - WARNING - Encountered error "no such column: exon_id" from query "
SELECT exon_number, exon_id
FROM exon
WHERE transcript_id = ?
" with parameters ['ENSMUST00000044369']
Traceback (most recent call last):
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/common.py", line 65, in wrapped_fn
return cache[cache_key]
KeyError: (<pyensembl.database.Database object at 0x7fffec08ba20>, ('exon_number', 'exon_id'), ('feature', 'exon'), ('filter_column', 'transcript_id'), ('filter_value', 'ENSMUST00000044369'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/bioinfo/packages/anaconda3/bin/isovar-protein-sequences.py", line 46, in
df = protein_sequences_dataframe_from_args(args)
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/isovar/cli/protein_sequences.py", line 73, in protein_sequences_dataframe_from_args
return protein_sequences_generator_to_dataframe(protein_sequences_generator)
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/isovar/protein_sequences.py", line 323, in protein_sequences_generator_to_dataframe
gene=lambda x: ";".join(x)))
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/isovar/dataframe_builder.py", line 189, in dataframe_from_generator
for variant, elements in variant_and_elements_generator:
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/isovar/protein_sequences.py", line 279, in reads_generator_to_protein_sequences_generator
variant_sequence_assembly=variant_sequence_assembly)
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/isovar/translation.py", line 492, in translate_variant_reads
transcript_id_whitelist=transcript_id_whitelist)
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/isovar/reference_context.py", line 117, in reference_contexts_for_variant
transcript_id_whitelist=transcript_id_whitelist)
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/isovar/effect_prediction.py", line 95, in reference_transcripts_for_variant
transcript_id_whitelist=transcript_id_whitelist)
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/isovar/effect_prediction.py", line 44, in predicted_coding_effects_with_mutant_sequence
if not transcript.complete:
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/memoized_property.py", line 40, in fget_memoized
setattr(self, attr_name, fget(self))
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/transcript.py", line 393, in complete
self.coding_sequence is not None and
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/memoized_property.py", line 40, in fget_memoized
setattr(self, attr_name, fget(self))
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/transcript.py", line 435, in coding_sequence
start = self.first_start_codon_spliced_offset
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/memoized_property.py", line 40, in fget_memoized
setattr(self, attr_name, fget(self))
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/transcript.py", line 414, in first_start_codon_spliced_offset
start_offsets = self.start_codon_spliced_offsets
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/memoized_property.py", line 40, in fget_memoized
setattr(self, attr_name, fget(self))
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/transcript.py", line 359, in start_codon_spliced_offsets
in self.start_codon_positions
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/transcript.py", line 358, in
for position
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/transcript.py", line 293, in spliced_offset
for exon in self.exons:
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/transcript.py", line 121, in exons
feature="exon")
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/common.py", line 67, in wrapped_fn
value = fn(*args, **kwargs)
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/database.py", line 469, in query
sql, required=required, query_params=query_params)
File "/bioinfo/packages/anaconda3/lib/python3.6/site-packages/pyensembl/database.py", line 429, in run_sql_query
cursor = self.connection.execute(sql, query_params)
sqlite3.OperationalError: no such column: exon_id
end of the error message
Is there any fix for this problem?
Thanks,
Wei
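There is no exon_id attribute in the mouse mm9 (NCBIM37, Ensembl release 67) GTF annotation, which would explain the "no such column: exon_id" error. The first lines of the file carry exon_number but no exon_id: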
head Mus_musculus.NCBIM37.67.gtf
NT_166433 protein_coding exon 11955 12166 . + . gene_id "ENSMUSG00000000702"; transcript_id "ENSMUST00000105216"; exon_number "1"; gene_name "AC007307.1"; gene_biotype "protein_coding"; transcript_name "AC007307.1-201";
NT_166433 protein_coding CDS 12026 12166 . + 0 gene_id "ENSMUSG00000000702"; transcript_id "ENSMUST00000105216"; exon_number "1"; gene_name "AC007307.1"; gene_biotype "protein_coding"; transcript_name "AC007307.1-201"; protein_id "ENSMUSP00000100851";
NT_166433 protein_coding start_codon 12026 12028 . + 0 gene_id "ENSMUSG00000000702"; transcript_id "ENSMUST00000105216"; exon_number "1"; gene_name "AC007307.1"; gene_biotype "protein_coding"; transcript_name "AC007307.1-201";
NT_166433 protein_coding exon 16677 16841 . + . gene_id "ENSMUSG00000000702"; transcript_id "ENSMUST00000105216"; exon_number "2"; gene_name "AC007307.1"; gene_biotype "protein_coding"; transcript_name "AC007307.1-201";
NT_166433 protein_coding CDS 16677 16841 . + 0 gene_id "ENSMUSG00000000702"; transcript_id "ENSMUST00000105216"; exon_number "2"; gene_name "AC007307.1"; gene_biotype "protein_coding"; transcript_name "AC007307.1-201"; protein_id "ENSMUSP00000100851";
NT_166433 protein_coding exon 17745 17814 . + . gene_id "ENSMUSG00000000702"; transcript_id "ENSMUST00000105216"; exon_number "3"; gene_name "AC007307.1"; gene_biotype "protein_coding"; transcript_name "AC007307.1-201";
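To check this across a whole GTF rather than just the first few lines, something like the following sketch could be used. It is not part of isovar or pyensembl: the function name `exon_attribute_keys` and the regular expression are illustrative assumptions, and it assumes the usual tab-separated GTF layout with `key "value";` pairs in the ninth column.

```python
import re

# Assumed attribute syntax: key "value"; (as in the lines shown above).
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def exon_attribute_keys(gtf_lines):
    """Collect every attribute key that appears on an exon row (hypothetical helper)."""
    keys = set()
    for line in gtf_lines:
        # Skip blank lines and header comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF row has 9 tab-separated columns; column 3 is the feature type.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        keys.update(key for key, _ in ATTR_RE.findall(fields[8]))
    return keys


# First exon row from the head output above, with tabs restored.
sample = [
    'NT_166433\tprotein_coding\texon\t11955\t12166\t.\t+\t.\t'
    'gene_id "ENSMUSG00000000702"; transcript_id "ENSMUST00000105216"; '
    'exon_number "1"; gene_name "AC007307.1"; '
    'gene_biotype "protein_coding"; transcript_name "AC007307.1-201";',
]

keys = exon_attribute_keys(sample)
print("exon_id present:", "exon_id" in keys)
print("exon_number present:", "exon_number" in keys)
```

For the real annotation, the same function can be fed an open file handle (for example from `gzip.open(path, "rt")` if the GTF is compressed) instead of the `sample` list.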