Repeat Interruptions #28
Comments
I guess you are interested in running Straglr in targeted mode i.e. passing the locus info to |
Dear @readmanchiu, which of the above strategies would you advise? |
Thanks for trying Straglr, @MaestSi |
Hi, I just did a quick test with Straglr 1.4.1, specifying
Do you know if there is a way to solve this? |
did you run minimap2 with soft-clipping ( |
Dear @readmanchiu, actually I forgot to run minimap2 with |
if you don't mind posting the SAM records of the 2 alignments, it may help me understand why the supplementary alignment was picked over the primary one. |
Hi @readmanchiu, sorry for my late response. I tried the regex solution and it seems to work for HTT. There is a repeat at chr1:1371179-1371198 in VWA1 (GGCGCGGAGC). When I define exactly this region in the loci BED file, I get a different repeat as output:
When I define this locus with a little padding, e.g. chr1:1371170-1371250, I get the correct repeat:
The command I used was:
Do I always have to put a little padding around the repeat regions defined in the loci file? Or am I doing something wrong? Bests Stefan |
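For anyone who wants to test the padding question systematically, here is a minimal Python sketch that widens every interval in a loci file by a fixed number of bases. It assumes a tab-separated BED-like layout with chromosome, start, and end in the first three columns and any further columns (such as the motif) passed through unchanged. The function name `pad_bed_line` and the default of 10 bases are hypothetical choices for illustration, not a Straglr recommendation.

```python
def pad_bed_line(line, padding=10):
    """Return a BED line with start/end widened by `padding` bases.

    Assumes tab-separated columns: chrom, start, end, then optional extras
    (for example the repeat motif), which are kept as-is.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    padded_start = max(0, start - padding)  # never go below coordinate 0
    padded_end = end + padding              # not clipped to chromosome length
    return "\t".join([chrom, str(padded_start), str(padded_end), *fields[3:]])


if __name__ == "__main__":
    # Coordinates of the VWA1 locus mentioned in this thread
    print(pad_bed_line("chr1\t1371179\t1371198\tGGCGCGGAGC"))
```

Running the tool once on the original file and once on the padded file would show which loci change their reported motif or copy number depending on the flanks.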
Hi @readmanchiu, actually I realised that every time a read has multiple alignments, all of them are reported in the Straglr output file, while for my use case it would be better to report only the primary alignment. What I mean is that soft-clipped sequences, in this context, are usually STRs which may align to many other genomic regions, but I don't care about their genomic location, as I only want their sequence (i.e. the soft-clipped sequence of the primary alignment) to be used once. P.S.: similarly to what @stefandiederich is experiencing, I am also getting slightly different repeat counts (e.g. 6 instead of 7) depending on whether or not I include a couple of flanking bases in the BED file. Are there any guidelines or criteria for this? |
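As a possible workaround outside Straglr, the alignments could be reduced to primary records before genotyping. The sketch below is plain Python over SAM text lines and relies only on the standard SAM flag bits for secondary (0x100) and supplementary (0x800) alignments. The function names are hypothetical, and it is an assumption that dropping these records is appropriate here, since Straglr may rely on them for some loci.

```python
SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def is_primary(sam_line):
    """True if the SAM record is neither secondary nor supplementary."""
    flag = int(sam_line.split("\t")[1])  # FLAG is the second SAM column
    return not (flag & (SECONDARY | SUPPLEMENTARY))


def primary_only(lines):
    """Yield header lines and primary alignment records, skipping the rest."""
    for line in lines:
        if line.startswith("@") or is_primary(line):
            yield line


if __name__ == "__main__":
    import sys
    # Usage sketch: python primary_only.py < in.sam > primary.sam
    sys.stdout.writelines(primary_only(sys.stdin))
```

On BAM input the same filter is usually applied with `samtools view -b -F 0x900`, which excludes both flag bits in one step.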
Hi @stefandiederich |
@MaestSi, one read should only be used once as support for one locus, even if it has multiple alignments. The same read may be used as support for more than one locus, though. But a read being used as support for loci on different chromosomes shouldn't happen; if you see that, I'll need some data to debug it. |
Hi,
I was wondering if Straglr is able to deal with repeat interruptions.
For some diseases we see interruptions within the repeat. For example, in Huntington's disease the normal repeat CAGCAGCAG... can be interrupted by CAA. Is there a way to pass this to Straglr?
Bests
Stefan
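To make the HTT example concrete, here is a minimal Python sketch of what "allowing an interruption" could mean at the sequence level. It is independent of Straglr and does not reflect how Straglr handles motifs internally. The function name `summarize_repeat` is hypothetical, and the sketch assumes that the motif and its interruption units all have the same length.

```python
import re
from collections import Counter


def summarize_repeat(seq, motif="CAG", interruptions=("CAA",)):
    """Find the longest tract built from `motif` and the allowed interruptions.

    Returns None if no unit occurs, otherwise a dict with the tract start,
    the tract sequence, the total unit count and per-unit counts.
    Assumes all units have the same length.
    """
    units = [motif, *interruptions]
    unit_pattern = "|".join(re.escape(u) for u in units)
    run_pattern = re.compile(f"(?:{unit_pattern})+")
    # Pick the longest uninterrupted run of allowed units
    best = max(run_pattern.finditer(seq), key=lambda m: len(m.group()), default=None)
    if best is None:
        return None
    tract = best.group()
    counts = Counter(re.findall(unit_pattern, tract))
    return {
        "start": best.start(),
        "tract": tract,
        "total_units": sum(counts.values()),
        "counts": dict(counts),
    }


if __name__ == "__main__":
    print(summarize_repeat("TTCAGCAGCAACAGCAGTT"))
```

With only `CAG` allowed, the same toy sequence would split into two shorter tracts around the CAA, which is the kind of undercount the question is about.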