Releases: deeptools/deepTools
2.3.0
- Modified how normalization is done when filtering is used. Previously, the filtering wasn't taken into account when computing the total number of alignments. That is now being done. Note that this uses sampling and will try to sample at least 100000 alignments and see what fraction of them are filtered. The total number of aligned reads is then scaled accordingly (#309).
- Modified how normalization is done when a blacklist is used. Previously, the number of alignments overlapping a blacklisted region was subtracted from the total number of alignments in the file. This decreased things a bit too much, since only alignments falling completely within a blacklisted region are actually excluded completely (#312).
- BED12 and GTF files can now be used as input (issue #71). Additionally, multiBamSummary, multiBigwigSummary and computeMatrix now have a --metagene option, which allows summarization over concatenated exons only, rather than also including introns (including introns has always been the default). This was issue #76.
- Read extension is handled more accurately: if a read originates outside of a bin or BED/GTF region, it will typically still be included when the --extendReads option is used and the extension would put it in that bin/region.
- deepTools now uses a custom interval-tree implementation that allows including metadata, such as gene/transcript IDs, along with intervals. For those interested, the code for this is available separately (https://github.com/dpryan79/deeptools_intervals) with the original C-only implementation here: https://github.com/dpryan79/libGTF.
- The API for the countReadsPerBin, getScorePerBigWigBin, and mapReduce modules has changed slightly (this was needed to support the --metagene option). Anyone using these in their own programs is encouraged to look at the modified API before upgrading.
- Added the plotEnrichment function (this was issue #329).
- There is now a subsetMatrix script available that can be used to subset the output of computeMatrix. This is useful for preparing plots that only contain a subset of samples/region groups. Note that this isn't installed by default.
- The Galaxy wrappers were updated to include the ability to exclude blacklisted regions.
- Most functions (both at the command line and within Galaxy) that process BAM files can now filter by fragment length (--minFragmentLength and --maxFragmentLength). By default there's no filtering performed. The primary purpose of this is to facilitate ATACseq analysis, where fragment length determines whether one is processing mono-/di-/poly-nucleosome fragments. This was issue #336.
- bamPEFragmentSize now has --logScale and --maxFragmentLength options, which allow you to plot frequencies on the log scale and set the max plotted fragment length, respectively. This was issue #337.
- --blackListFileName now accepts multiple files.
- bamPEFragmentSize now supports multiple input files.
- If the sequence has been removed from BAM files, SE reads no longer cause an error in bamCoverage if --normalizeTo1x is specified. In general, the code that looks at read length now checks the CIGAR string if there's no sequence available in a BAM file (for both PE and SE datasets). This was issue #369.
- bamCoverage now respects the --filterRNAstrand option when computing scaling factors. This was issue #353.
- computeMatrix and plotHeatmap can now sort using only a subset of samples
- There is now an --Offset option to bamCoverage, which allows having the signal at a single base. This is useful for things like RiboSeq or GROseq, where the goal is to get focal peaks at single bases/codons/etc.
- The --MNase option to bamCoverage now respects --minFragmentLength and --maxFragmentLength, with defaults set to 130 and 200.
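As a rough illustration of the filtering-aware normalization described in the first item of this release, the sketch below samples alignments, measures the fraction that pass the filter, and scales the total count by that fraction. It is a minimal in-memory toy and not the deepTools implementation: the function name, the `keep` predicate and the use of a plain list of mapping qualities are all assumptions made for the example.

```python
import random


def estimate_filtered_total(total_alignments, alignments, keep,
                            min_sample=100_000, seed=0):
    """Scale a total alignment count by the sampled fraction passing `keep`.

    `alignments` is any list of records (hypothetical: here, mapping
    qualities) and `keep` is a predicate returning True for records that
    survive filtering.
    """
    rng = random.Random(seed)
    # Sample up to `min_sample` records; use everything if fewer exist.
    n = min(len(alignments), min_sample)
    sample = rng.sample(alignments, n)
    if not sample:
        return 0.0
    kept = sum(1 for record in sample if keep(record))
    # The kept fraction of the sample rescales the full total.
    return total_alignments * kept / len(sample)
```

For example, with 250 low-quality and 750 high-quality records standing in for a file of 1,000,000 alignments, `estimate_filtered_total(1_000_000, [0] * 250 + [30] * 750, lambda q: q >= 10)` scales the total down to three quarters of its original value.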
2.2.4
2.2.3
Schnappszahl!
- Fixed labels when hierarchical clustering is used (they were off by one previously).
- Fixed a bug wherein bamCompare couldn't work with a blacklist
- Fixed yet another change in pysam, though at least in this case it was fixing a previous problem
2.2.1
- Fixed a bug introduced in version 2.2.0 wherein sometimes a pre-2.2.0 produced matrix file could no longer be used with plotHeatmap or plotProfile (this only happened when --outFileNameData was then used).
- Finally suppressed all of the runtime warnings that numpy likes to randomly throw.
- Worked around an undocumented change in pysam-0.9.0 that tended to break things.
2.2.0
- plotFingerprint now iterates through line styles as well as colors. This allows up to 35 samples per plot without repeating (not that that many would ever be recommended). This was issue #80.
- Fixed a number of Galaxy wrappers, which were rendered incorrectly due to including a section title of "Background".
- A number of image file handles were previously not explicitly closed, which caused occasional completion of a plot* program but without the files actually being there. This only happened on some NFS mount points.
- The Galaxy wrappers now support the --outFileNameData option on plotProfile and plotHeatmap.
- Added support for blacklist regions. These can be supplied as a BED file and the regions will largely be skipped in processing (they'll also be ignored during normalization). This is very useful to skip regions known to attract excess signal. This was issue #101.
- Modified plotPCA to include the actual eigenvalues rather than rescaled ones. Also, plotPCA can now output the underlying values (issue #231).
- Regions within each feature body can now be unscaled when using computeMatrix. Thus, if you're interested in unscaled signal around the TSS/TES then you can now use the --unscaled5prime and --unscaled3prime options. This was issue #108.
- bamCoverage now has a --filterRNAstrand option, that will produce coverage for only a single strand. Note that the strand referred to is the DNA strand and not sense/anti-sense.
- Issues with plotHeatmap x-axis labels were fixed (issue #301).
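To illustrate how combining line styles with colors reaches 35 distinct series in plotFingerprint, here is a minimal sketch. The specific palette of 7 colors and 5 line styles is an assumption chosen only because 7 x 5 = 35; the names below are hypothetical and are not taken from the deepTools source.

```python
# Hypothetical palette: 7 colors x 5 line styles = 35 distinct combinations.
COLORS = ["blue", "green", "red", "cyan", "magenta", "black", "orange"]
# Four named matplotlib styles plus one custom (offset, on/off) dash pattern.
LINE_STYLES = ["solid", "dashed", "dashdot", "dotted", (0, (3, 1, 1, 1))]


def style_for_sample(index):
    """Return a (color, linestyle) pair for the sample at `index`.

    Colors cycle fastest; the line style only changes once every color
    has been used, so the first repeat occurs at index 35.
    """
    combos = len(COLORS) * len(LINE_STYLES)
    i = index % combos
    return COLORS[i % len(COLORS)], LINE_STYLES[i // len(COLORS)]
```

Each pair could then be passed to a plotting call as its `color` and `linestyle` arguments.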
2.1.1
- Fixed how the --hclust option was handled in plotHeatmap/plotProfile. This gets around a quirk in scipy.
- A bug involving processing comment lines in BED files was corrected (issue #288)
- The Galaxy wrappers are now automatically tested with each modification.
- plotCoverage and plotFingerprint in Galaxy now accept 1 or more BAM files rather than at least 2 files.
2.1.0
- Updates to many of the Galaxy wrappers and associated documentation.
- A bug was fixed in how chromosome names were dealt with in bigWig files. If you ever received errors due to illegal intervals then that should now be fixed. This was issue #250
- plotProfile now has an --outFileNameData option for saving the underlying data in a text format.
- correctGCBias ensures that the resulting BAM file will pass picard/HTSJDK's validation if the input file did (issue #248)
- The default bin size was changed to 10, which is typically a bit more useful
- The --regionsLabel option to plotProfile and plotHeatmap now accepts a space-separated list, in line with --samplesLabel
- BAM files that have had their sequences stripped no longer cause an error
- bamPEFragmentSize now has -bs and -n options to allow adjusting the number of alignments sampled. Note that the default value is auto-adjusted if the sampling is too sparse.
- bamPEFragmentSize now accepts single-end files.
- The --hclust option to plotProfile and plotHeatmap continues even if one of the groups is too small for plotting (matplotlib will produce a warning that you can ignore). This was issue #280.
2.0.1
N.B., this is primarily a bug fix release.
- A critical bug that prevented plotPCA from running was fixed.
- multiBamCoverage was renamed to multiBamSummary, to be in better alignment with multiBigwigSummary.
- computeGCBias and correctGCBias are now more tolerant of chromosome name mismatches.
- multiBigwigSummary and multiBamSummary can accept a single bigWig/BAM input file, though one should use the --outRawCounts argument.
deepTools 2.0
Major changes
- computeMatrix now accepts multiple bigwig files that can later be plotted together as heatmaps one after the other or as multiple lines in the same plot. See the documentation of plotHeatmap and plotProfile for examples.
- computeMatrix also now accepts multiple input BED files. Each is treated as a group within a sample and is plotted independently.
- Added new analysis tool plotPCA to visualize the results of multiBamCoverage or multiBigwigSummary using principal component analysis.
- Added new quality control tool plotCoverage to plot the coverage over base pairs for multiple samples.
- Dramatically improved the speed of bigwig related tools (multiBigwigSummary and computeMatrix) by using the new pyBigWig module.
- Added support for split reads (most commonly found in RNA-seq data).
- Added new option --MNase in bamCoverage that computes read coverage only considering two base pairs at the center of the fragment.
- Added --samFlagInclude and --samFlagExclude parameters. This is useful, for example, to only include forward reads (or only reverse reads) in an analysis.
- Plotting of correlations (from multiBamCoverage or multiBigwigSummary) is now separated from the computation of the underlying data. A new tool, plotCorrelation, was added. This tool can plot correlations as heatmaps or as scatter plots and includes options to adjust a large array of visual features.
- Added hierarchical clustering, besides k-means, to plotProfile and plotHeatmap.
- Correlation coefficients can now be computed even if the data contains NaNs.
- The documentation was migrated to http://deeptools.readthedocs.org
- deepTools modules can now be used by other python programs. The API is now part of the documentation.
- In this new release, most of the core code was rewritten to facilitate API usage and for optimization.
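The --samFlagInclude and --samFlagExclude parameters listed above can be pictured as bit masks on the SAM FLAG field. The sketch below assumes they behave like the -f and -F options of samtools view (every include bit must be set, no exclude bit may be set); the function name is hypothetical and the code is an illustration rather than the deepTools implementation.

```python
# Bit values from the SAM specification's FLAG field.
FLAG_REVERSE_STRAND = 0x10   # read is mapped to the reverse strand
FLAG_FIRST_IN_PAIR = 0x40    # read is the first mate of a pair


def passes_flag_filter(flag, include=0, exclude=0):
    """Return True if `flag` has all `include` bits and no `exclude` bits."""
    has_required = (flag & include) == include
    has_forbidden = (flag & exclude) != 0
    return has_required and not has_forbidden


def forward_only(flags):
    """Keep only flags of reads mapped to the forward strand."""
    return [f for f in flags if passes_flag_filter(f, exclude=FLAG_REVERSE_STRAND)]
```

Under this assumption, keeping only forward reads corresponds to an exclude value of 16, and keeping only first mates corresponds to an include value of 64.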
Minor changes
- --missingDataAsZero was renamed to --skipNonCoveredRegions for clarity in bamCoverage and bamCompare.
- Read extension was made optional, which removed the need to specify a default fragment length for most of the tools. The earlier extension parameters, including --fragmentLength, were replaced by the new optional parameter --extendReads.
- Renamed:
  - heatmapper to plotHeatmap
  - profiler to plotProfile
  - bamCorrelate to multiBamCoverage
  - bigwigCorrelate to multiBigwigSummary
  - bamFingerprint to plotFingerprint
- Improved plotting features for plotProfile when using the plot types 'overlapped_lines' and 'heatmap'.
- Resolved an error introduced by numpy version 1.10 in computeMatrix.
- Fixed problem with bed intervals in multiBigwigSummary and multiBamCoverage and a user specified region that returned wrongly labeled raw counts.
- computeMatrix can now read files with DOS newline characters.
- Added option --skipChromosomes to multiBigwigSummary, for example to skip all 'random' chromosomes. multiBigwigSummary now also considers chromosomes as identical when the names between samples differ only by a 'chr' prefix, e.g. chr1 vs. 1.
- For bamCoverage and bamCompare, behaviour of scaleFactor was updated such that now, if given in combination with the normalization options (normalize to 1x or normalize using RPKM), the given scaleFactor will multiply the scale factor computed for the normalization methods.
- Fixed problem with wrongly labeled proper pairs in a bam file. deepTools adds further checks to determine if a read pair is a proper pair.
- Added titles to QC plots,