IndexError: list index out of range #41

Open
lfearnley opened this issue Jun 13, 2024 · 3 comments

Comments

@lfearnley

I am running STRaglr 1.5.0 (the current release) and get the following error on multiple CRAMs:

python straglr.py NA19676.hg38.cram ../1KG_ONT_VIENNA_hg38.fa NA19676.hg38_straglr_patho --loci ../20240328_straglr_catalog.bed 
Traceback (most recent call last):
  File "/vast/scratch/users/fearnley.l/1KG_ONT_VIENNA/straglr/straglr.py", line 101, in <module>
    main()
  File "/vast/scratch/users/fearnley.l/1KG_ONT_VIENNA/straglr/straglr.py", line 98, in main
    tre_finder.output_vcf(variants, '{}.vcf'.format(args.out_prefix))
  File "/vast/scratch/users/fearnley.l/1KG_ONT_VIENNA/straglr/src/tre.py", line 1513, in output_vcf
    fails = Variant.find_fails(variants)
  File "/vast/scratch/users/fearnley.l/1KG_ONT_VIENNA/straglr/src/variant.py", line 244, in find_fails
    failed_reason = Counter(failed_reasons).most_common(1)[0][0]
IndexError: list index out of range

Any suggestions as to what might cause this?

@readmanchiu
Collaborator

The error is caused by a lack of coverage at a given locus. For example, suppose a locus has only 2 supporting reads, each with a different repeat size. If min_support is set to 2, no allele can be formed with the minimum support.
The new version, which produces VCF output, tries to associate a FILTER with each failed locus. I did not anticipate this scenario, so no failure reason was generated for it, and the script therefore crashed.
I have made a fix that produces a CLUSTERING_FAILED filter for this scenario and will release it shortly.
In the meantime, if you want to get past this, you could set --min_cluster_size 1 and the program should be able to finish.
Thanks very much for reporting this bug.
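
For readers hitting the same traceback, the failing expression is `Counter(failed_reasons).most_common(1)[0][0]`: when no failure reason was recorded for a locus, `most_common(1)` returns an empty list and the `[0]` index raises `IndexError`. Below is a minimal sketch of a guarded version. It is not the actual straglr fix; the function name `most_common_reason` and the placeholder reason strings are hypothetical, and only the `CLUSTERING_FAILED` label comes from the comment above.

```python
from collections import Counter


def most_common_reason(failed_reasons, default="CLUSTERING_FAILED"):
    """Return the most frequent failure reason, or `default` if none were recorded.

    Hypothetical helper for illustration; not taken from the straglr source.
    """
    top = Counter(failed_reasons).most_common(1)
    # most_common(1) returns [] for empty input, so indexing [0][0] directly
    # would raise IndexError. Fall back to the default label instead.
    return top[0][0] if top else default


# Locus with no recorded reasons: falls back to the default label.
empty_case = most_common_reason([])

# Locus with recorded reasons (placeholder names): the most frequent one wins.
usual_case = most_common_reason(["REASON_A", "REASON_B", "REASON_A"])
```

The fallback keeps every failed locus associated with some FILTER value, which is what the VCF output step expects.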

@zaka-edd

zaka-edd commented Jul 5, 2024

Hi, I have been having the same issue. I tried setting --min_cluster_size 1, but it did not fix the error for me. Do you know of another problem that could be the cause? I ran:

straglr.py map-sminimap2-HG002_hg38_chr21.bam .../chr21_test_data/chr21.fa output_straglr --loci HG002_repeats_straglr.bed --min_cluster_size 1

  Traceback (most recent call last):
    File "/usr/local/bin/straglr.py", line 101, in <module>
      main()
    File "/usr/local/bin/straglr.py", line 93, in main
      variants = tre_finder.genotype(args.loci)
    File "/usr/local/lib/python3.10/site-packages/src/tre.py", line 1426, in genotype
      return self.collect_alleles(loci)
    File "/usr/local/lib/python3.10/site-packages/src/tre.py", line 1402, in collect_alleles
      tre_variants = self.get_alleles(loci)
    File "/usr/local/lib/python3.10/site-packages/src/tre.py", line 1252, in get_alleles
      self.update_refs(variants, genome_fasta)
    File "/usr/local/lib/python3.10/site-packages/src/tre.py", line 1271, in update_refs
      refs = self.extract_refs_trf(trf_input)
    File "/usr/local/lib/python3.10/site-packages/src/tre.py", line 607, in extract_refs_trf
      data_motif = cols[3]
  IndexError: list index out of range

@readmanchiu
Collaborator

This is a different problem. It looks like something went wrong when the script parsed the results of the TRF run.
Can you try running with --tmpdir <path> --debug, where <path> can be set to your output directory? This way the temporary files will be kept. I want to see if there is anything wrong with the latest ***.dat (TRF output) file created.
You can first check that the TRF output is there. If you only have a few loci, maybe you can post the content of the .dat file? Or you can attach the file for me to examine.
It would be best if you could start a new issue for this.
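
As a quick way to inspect a kept .dat file, the traceback shows the parser reading `cols[3]`, so any line with fewer than four fields would trigger this `IndexError`. The sketch below lists such lines. It assumes the fields are whitespace-separated, which is a guess about how the file is split, and the helper name `find_short_lines` is hypothetical rather than part of straglr.

```python
def find_short_lines(lines, min_cols=4):
    """Return (line_number, text) for non-empty lines with too few fields.

    Assumption: fields are separated by whitespace. min_cols=4 matches the
    cols[3] access seen in the traceback.
    """
    short = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text.strip():
            continue  # blank lines are skipped, not reported
        if len(text.split()) < min_cols:
            short.append((number, text))
    return short


# In-memory example with made-up content; the third line has only two fields.
example = ["a b c d\n", "\n", "x y\n"]
suspicious = find_short_lines(example)
```

To check a real file, pass an open file handle in place of the list, for example `find_short_lines(open(path))` with `path` pointing at the .dat file in the temporary directory.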
