A simple utility that estimates the read depth of a BAM file using only its .bai index file.
cat 1.bai | ./bamReadDepther
The output is in a simple format that is easy to parse when streaming. In the example below, each line beginning with # introduces a reference sequence, and the number after the # is the reference id. The reference id is followed on the same line by two tab-separated fields: the number of mapped reads and the number of placed-unmapped reads. This information is only provided if the index file was created with a relatively recent version of samtools.
Following the reference id line is a series of values, one per line, each of which corresponds to a 16384-nucleotide bin. The first bin covers nucleotide positions 0-16383, the next bin covers positions 16384-32767, and so on. Each value is the number of bytes needed in the BAM file to store all the reads that fall in that bin. For example, the first bin in reference #1 below has reads that take up 1200 bytes of disk space, not 1200 reads. However, byte count tracks read count closely, so we can see where the depth is high and low even if we don't know the exact read depth at any particular point.
A special reference id "*" is also available in some index files. It gives the total number of unplaced unmapped reads for the entire file.
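The bin layout can be sketched in a few lines of Python. This is only an illustration and not part of bamReadDepther. It assumes 0-based positions and consecutive, non-overlapping windows of 16384 nucleotides. The names `BIN_SIZE`, `bin_range`, and `bin_for_position` are hypothetical.

```python
# Window size in nucleotides, as described above.
BIN_SIZE = 16384


def bin_range(bin_index, bin_size=BIN_SIZE):
    """Return the 0-based, inclusive (start, end) positions covered by a bin.

    Assumes windows are consecutive and non-overlapping (an assumption of
    this sketch, not something the tool reports).
    """
    start = bin_index * bin_size
    end = start + bin_size - 1
    return start, end


def bin_for_position(pos, bin_size=BIN_SIZE):
    """Return the index of the bin that contains a 0-based position."""
    return pos // bin_size
```

With these helpers, the n-th value printed under a reference (counting from 0) can be tied back to an approximate coordinate range on that reference.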
#1 35100 3100
1200
1104
1238
4329
3298
3249
2293
3293
4345
3450
#2 32100 2400
1200
1104
1238
4329
3298
3249
2293
3293
4345
3450
#* 43250
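Output like the example above could be read with a short Python sketch along these lines. It is a minimal illustration, not part of bamReadDepther. The `Reference` class and `parse_depth` function are hypothetical names. The sketch splits on any whitespace, so it accepts tabs or spaces, and it treats the two counts on a header line as optional, since older index files may not carry them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Reference:
    ref_id: str
    mapped: Optional[int] = None      # mapped reads, if the index provides them
    unmapped: Optional[int] = None    # placed-unmapped reads, if provided
    bins: List[int] = field(default_factory=list)  # bytes per 16384-nt bin


def parse_depth(lines):
    """Parse bamReadDepther-style output from an iterable of text lines.

    Returns (references, unplaced), where unplaced is the count from the
    "#*" line or None if that line is absent.
    """
    references = []
    unplaced = None
    current = None
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if fields[0].startswith("#"):
            ref_id = fields[0][1:]
            counts = [int(x) for x in fields[1:]]
            if ref_id == "*":
                # Special entry: unplaced unmapped reads for the whole file.
                unplaced = counts[0] if counts else None
                current = None
                continue
            current = Reference(ref_id)
            if len(counts) >= 2:
                current.mapped, current.unmapped = counts[0], counts[1]
            references.append(current)
        elif current is not None:
            # A bin value belonging to the most recent reference header.
            current.bins.append(int(fields[0]))
    return references, unplaced
```

Because the function only needs an iterable of lines, it can be fed `sys.stdin` directly when the tool's output is piped into a script, so the data is processed as it streams in.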