-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Possibility of filtering reads based on phred score of selected basepairs #179
Comments
Thanks for using CRISPResso2! I don't think this is currently implemented, but seems like it would be a neat feature. @kclem may have more ideas on how to currently do this. If I understand correctly, the feature would take minimum phred scores for specific basepair positions and then filter the reads that have at least those scores, right? |
Thanks for the reply. Yes, that is exactly what i mean! |
Briefly, I think this would have to be done post-alignment and post-processing via the I don't think this is a common use case, so probably just writing a post-processing script would be most useful. What outputs are you looking for? Number of edited vs unedited reads after filtering for quality at those positions? |
Thanks for the reply! I am looking to compare the ratio of the two single base pair substitutions (i.e. number of reads of one compared to the other). They are both present at low frequencies and therefore i want to add stringent requirements for quality of those exact base pairs to minimize effect of sequencing noise. |
Dear @kclem and @Colelyman, Thank you for your answers. I'm working together with @caspl1 on his project. We are implementing a post-processing script and would like to make sure that we are in the right direction, because so far we can't manage to obtain the same number of fragments that are given by CRISPResso2. The input data are fastq files from a KI experiment, where there are two single base pair substitutions introduced (ie. two amplicons of interest, each with one single base pair substitution). We would like to use the output fastq file to:
But, so far, we can't manage to reproduce the same numbers that CRISPResso2 outouts in the first part above. Perhaps, since you know better the CRIPSPResso2 algorithm, you can help us with how to identify the fragments of interest. So far, the steps we follow in our script are: For each sequenced fragment:
Best regards, |
Hi.
Thanks for a great pipeline!
I have a population of cells where cells with two different knock-in single nucleotide substitutions are present amongst unedited cells. I would like to quantify the number of reads with each of these substitutions using CRISPResso2. I would like to apply stringent phred score quality filtering, not to the whole read, but only to these two basepair positions (or if that is not possible the window containing both of them, since they are located quite close to each other).
If i filter using the min_average_read_quality parameter, i will not be sure that these positions contain high quality base calls. If i filter using min_single_bp_quality, i will discard too many reads. And if i filter using min_bp_quality_or_N, the total number of reads with each single nucleotide substitution is not easily accessible in the allele plot since the N's are located different places in the reads and the reads are then not identical and counted together.
Is there a way to do this or would it be possible for you to make it?
Thanks a lot
/Casper
The text was updated successfully, but these errors were encountered: