higher than expected coverage in output bam #6

rhalperin · 2022-05-31T17:26:46Z

I ran bamsifter with -c 20, but in my output BAM I generally see coverage peaks of around 50-60 reads. Here is an IGV screenshot, where the top track is the bamsifter output and the bottom track is the original BAM.

Is this what you would expect the output to look like?


GeorgescuC · 2022-06-06T21:55:23Z

Hi @rhalperin ,

Bamsifter tries to keep coverage at the target number n of reads wherever enough reads are available, without going over, but there are several reasons why the coverage can exceed the target.
Generally, the first n reads of a covered region are selected automatically; then, whenever one of those reads ends, a new one is selected to keep the coverage high enough. However, selected reads do not always align to every base they span, for example when splicing or deletions occur (as identified in the CIGAR string). If the coverage at a given base pair drops below the target n for that reason, we select more reads until we reach the threshold again. Those extra reads can extend beyond the gap, which pushes coverage above n where they overlap the reads already selected.
With paired-end reads, each selected read also automatically selects its mate so that we keep as much relevant information as possible. The trade-off is that if we have already reached the target coverage n and then encounter the mate of an already selected read, we go over the target to keep it.
There is also an option (disabled by default) to keep all chimeric reads, which is useful if you are looking for fusion transcripts and can also contribute to this.
For efficiency, the input BAM is read through only once when selecting reads, then once more when copying them, so we don't go back and try to correct oversampling.
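
To make the effects above concrete, here is a toy Python sketch. It is not Bamsifter's actual code: the `downsample` function, the read representation (a name plus a list of half-open aligned blocks), and the rule "keep a read if depth at its first aligned base is below the target" are all simplifying assumptions chosen only to illustrate how a single greedy pass can overshoot.

```python
from collections import Counter

def downsample(reads, target):
    """Toy single-pass selection (hypothetical, for illustration only).

    reads:  list of (name, blocks), sorted by first aligned position;
            blocks is a list of half-open (start, end) aligned intervals,
            so a spliced read has more than one block.
    target: desired per-base coverage n.
    """
    depth = Counter()      # per-base coverage of the reads kept so far
    kept_names = set()     # names of kept reads, used to rescue mates
    kept = []
    for name, blocks in reads:
        start = blocks[0][0]
        is_mate = name in kept_names
        # Keep the read if its mate was kept, or if coverage at its
        # first aligned base is still below the target.
        if is_mate or depth[start] < target:
            kept.append((name, blocks))
            kept_names.add(name)
            for s, e in blocks:
                for pos in range(s, e):
                    depth[pos] += 1
    return kept, depth

# Case 1: two spliced reads leave a gap at 10-19, so two more reads are
# selected there; all four then overlap at positions 20-24.
spliced = [
    ("a", [(0, 10), (20, 30)]),
    ("b", [(0, 10), (20, 30)]),
    ("c", [(12, 25)]),
    ("d", [(12, 25)]),
]
kept_spliced, depth_spliced = downsample(spliced, target=2)
print(max(depth_spliced.values()))  # 4, although the target is 2

# Case 2: p3 is skipped because the target is already met, but the mate
# of p1 is kept anyway and stacks on top of p1 and p2.
pairs = [
    ("p1", [(0, 10)]),
    ("p2", [(0, 10)]),
    ("p3", [(2, 12)]),
    ("p1", [(5, 15)]),
]
kept_pairs, depth_pairs = downsample(pairs, target=2)
print(max(depth_pairs.values()))  # 3, although the target is 2
```

Because the pass never revisits earlier decisions, neither overshoot is corrected afterwards, which matches the single-pass behaviour described above.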

It is however possible that the higher coverage you see is not explained by any of these reasons, in which case I can take a look at the specific issue if you can share an example BAM (the specific region from your screenshot for example).

Regards,
Christophe.
