fix merge conflict
fidelram committed Feb 5, 2014
2 parents b555f6c + abf83ee commit c6a4a33
Showing 18 changed files with 301 additions and 62 deletions.
1 change: 0 additions & 1 deletion README.txt

This file was deleted.

197 changes: 197 additions & 0 deletions README.txt
@@ -0,0 +1,197 @@
======================================================================
deepTools
======================================================================
### user-friendly tools for the normalization and visualization of deep-sequencing data


deepTools addresses the challenge of handling the large amounts of data that are now routinely generated by DNA sequencing centers. To do so, deepTools provides modules that process mapped reads into coverage files in the standard bedGraph and bigWig formats. This allows the creation of **normalized coverage files** as well as the comparison of two files (for example, treatment and control). Finally, using such normalized and standardized files, multiple **visualizations** can be created to identify enrichments relative to functional annotations of the genome. For a gallery of images that can be produced, see
http://f1000.com/posters/browse/summary/1094053
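
A typical workflow on the command line looks like the sketch below. The tool names (`bamCoverage`, `bamCompare`) come from this repository; the input/output flags used here (`-b`, `-b1`, `-b2`, `-o`) are assumptions for illustration, so please check each tool's `--help` for the exact options of your version.

    # create a bigWig coverage track from a single BAM file, using 50 bp bins
    $ bamCoverage -b treatment.bam -o treatment_coverage.bw --binSize 50

    # compare a treatment and a control BAM file; by default bamCompare
    # reports the log2 ratio of the read counts per bin
    $ bamCompare -b1 treatment.bam -b2 control.bam -o treatment_vs_control.bw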

For support, questions, or feature requests contact: [email protected]

![gallery](https://raw.github.com/fidelram/deepTools/master/examples/collage.png)

Our [wiki page](https://github.com/fidelram/deepTools/wiki) contains more information on **why we built deepTools**, details on the **individual tool scopes and usages**, and an introduction to our deepTools Galaxy web server. It also contains an [FAQ section](FAQ) that we update regularly. For more specific troubleshooting, feedback, and tool suggestions, contact us via [email protected].


-------------------------------------------------------------------------------------------------------------------

<a name="installation"/></a>
Installation
---------------

deepTools are available for:

* command line usage
* integration into Galaxy servers

Details on the installation routines can be found below:

[Installation from source](#linux)

[Installation on a Mac](#mac)

[Troubleshooting](#trouble)

[Galaxy installation](#galaxy)


<a name="linux"/></a>
### Installation from source (Linux, command line)

The easiest way to install deepTools is by __downloading the package and installing it with Python's pip__ or easy_install tools:

Requirements: Python 2.7, numpy, scipy installed

Commands:

$ cd ~
$ export PYTHONPATH=$PYTHONPATH:~/lib/python2.7/site-packages
$ export PATH=$PATH:~/bin:~/.local/bin

If pip is not already available, install with:

$ easy_install --prefix=~ pip

Install deepTools and dependencies with pip:

$ pip install --user deeptools
Done.
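
To verify that the executables were installed and are found on your `PATH` (as set above), you can ask any deepTools program for its help text; this is just a quick sanity check:

    $ bamCoverage --help
    $ bamCompare --help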




__Another option is to clone the repository:__

$ git clone https://github.com/fidelram/deepTools

Then go to the deepTools directory, edit the `deepTools.cfg`
file, and run the install script:

$ cd deepTools
$ vim deeptools/config/deepTools.cfg
$ python setup.py install


By default, the script will install the Python library and executables
globally, which means you need to be root or an administrator of
the machine to complete the installation. If you need a nonstandard
install prefix, or any other nonstandard options, the install script
accepts many command-line options; to list them, run:

$ python setup.py --help

To install under a specific location, use:

$ python setup.py install --prefix <target directory>
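
When installing to a custom prefix like this, Python and your shell also need to know where the package and its scripts ended up. A sketch, following the exports shown above (the `python2.7` subdirectory depends on your Python version):

    $ export PYTHONPATH=$PYTHONPATH:<target directory>/lib/python2.7/site-packages
    $ export PATH=$PATH:<target directory>/bin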

<a name="mac"></a>
### Installation on a Mac

Although the installation of deepTools itself is quite simple,
installing the required modules SciPy and NumPy demands
a bit of extra work.

The easiest way to install them is together with the
[Anaconda Scientific Python Distribution][]. After installation, open
a terminal ("Applications" --> "Terminal") and type:

$ pip install deeptools

If you prefer to install the dependencies individually, follow
these steps:

Requirement: Python 2.7 installed

Download the packages and install them using dmg images:
- http://sourceforge.net/projects/numpy/files/NumPy/
- http://sourceforge.net/projects/scipy/files/scipy/

Then install deepTools via the terminal ("Applications" --> "Terminal"):

$ cd ~
$ export PYTHONPATH=$PYTHONPATH:~/lib/python2.7/site-packages
$ export PATH=$PATH:~/bin:~/.local/bin:~/Library/Python/2.7/bin

If pip is not already available, install with:

$ easy_install --prefix=~ pip

Install deepTools and dependencies with pip:

$ pip install --user deeptools


<a name="trouble"/></a>
##### Troubleshooting
The easy_install command is provided by the python package setuptools.
You can download the package from https://pypi.python.org/pypi/setuptools

$ wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -O - | python

or, for a user-specific installation:

$ wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py
$ python ez_setup.py --user

NumPy/SciPy installation instructions:
http://www.scipy.org/install.html
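
If pip is already set up, the two dependencies can usually also be installed with it; this is only a sketch and assumes a working compiler toolchain is available for building the packages:

    $ pip install --user numpy scipy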

<a name="galaxy"/></a>
#### Galaxy Installation

deepTools can be easily integrated into [Galaxy](http://galaxyproject.org). All wrappers and dependencies are
available in the [Galaxy Tool Shed](http://toolshed.g2.bx.psu.edu/view/bgruening/deeptools).


##### Installation via Galaxy API (recommended)

First, generate an [API Key](http://wiki.galaxyproject.org/Admin/API#Generate_the_Admin_Account_API_Key) for your admin
user and run the installation script:

python ./scripts/api/install_tool_shed_repositories.py --api YOUR_API_KEY -l http://localhost:8080 --url http://toolshed.g2.bx.psu.edu/ -o bgruening -r <revision> --name deeptools --tool-deps --repository-deps --panel-section-name deepTools

The -r argument specifies the version of deepTools. You can get the latest revision number from the test tool shed or with the following command:

hg identify http://toolshed.g2.bx.psu.edu/view/bgruening/deeptools

You can watch the installation status under: Top Panel → Admin → Manage installed tool shed repositories


##### Installation via web browser

- Go to the [admin page](http://localhost:8080/admin)
- Select *Search and browse tool sheds*
- Navigate to Galaxy tool shed → Sequence Analysis → deeptools
- Install deeptools

Remember: for support, questions, or feature requests, contact [email protected]

------------------------------------
[BAM]: https://docs.google.com/document/d/1Iv9QnuRYWCtV_UCi4xoXxEfmSZYQNyYJPNsFHnvv9C0/edit?usp=sharing "binary version of a SAM file; contains all information about aligned reads"
[SAM]: https://docs.google.com/document/d/1Iv9QnuRYWCtV_UCi4xoXxEfmSZYQNyYJPNsFHnvv9C0/edit?usp=sharing "text file containing all information about aligned reads"
[bigWig]: https://docs.google.com/document/d/1Iv9QnuRYWCtV_UCi4xoXxEfmSZYQNyYJPNsFHnvv9C0/edit?usp=sharing "binary version of a bedGraph file; contains genomic intervals and corresponding scores, e.g. average read numbers per 50 bp"
[bedGraph]: https://docs.google.com/document/d/1Iv9QnuRYWCtV_UCi4xoXxEfmSZYQNyYJPNsFHnvv9C0/edit?usp=sharing "text file that contains genomic intervals and corresponding scores, e.g. average read numbers per 50 bp"
[FASTQ]: https://docs.google.com/document/d/1Iv9QnuRYWCtV_UCi4xoXxEfmSZYQNyYJPNsFHnvv9C0/edit?usp=sharing "text file of raw reads (almost straight out of the sequencer)"

[bamCorrelate]: https://github.com/fidelram/deepTools/wiki/QC#wiki-bamCorrelate
[bamFingerprint]: https://github.com/fidelram/deepTools/wiki/QC#wiki-bamFingerprint
[computeGCBias]: https://github.com/fidelram/deepTools/wiki/QC#wiki-computeGCbias
[bamCoverage]: https://github.com/fidelram/deepTools/wiki/Normalizations#wiki-bamCoverage
[bamCompare]: https://github.com/fidelram/deepTools/wiki/Normalizations#wiki-bamCompare
[computeMatrix]: https://github.com/fidelram/deepTools/wiki/Visualizations
[heatmapper]: https://github.com/fidelram/deepTools/wiki/Visualizations
[profiler]: https://github.com/fidelram/deepTools/wiki/Visualizations

[Benjamini and Speed]: http://nar.oxfordjournals.org/content/40/10/e72 "Nucleic Acids Research (2012)"
[Diaz et al.]: http://www.degruyter.com/view/j/sagmb.2012.11.issue-3/1544-6115.1750/1544-6115.1750.xml "Stat. Appl. Gen. Mol. Biol. (2012)"
[Anaconda Scientific Python Distribution]: https://store.continuum.io/cshop/anaconda/

This tool suite is developed by the [Bioinformatics Facility](http://www1.ie-freiburg.mpg.de/bioinformaticsfac) at the [Max Planck Institute for Immunobiology and Epigenetics, Freiburg](http://www1.ie-freiburg.mpg.de/).

[Wiki Start Page](https://github.com/fidelram/deepTools/wiki) | [deepTools Galaxy](http://deeptools.ie-freiburg.mpg.de) | [FAQ](https://github.com/fidelram/deepTools/wiki/FAQ)
23 changes: 11 additions & 12 deletions bin/bamCompare
@@ -25,9 +25,9 @@ def parseArguments(args=None):
parents=[parentParser, bamParser, outputParser],
formatter_class=argparse.ArgumentDefaultsHelpFormatter,
description='This tool compares two BAM files based on the number of '
'mapped reads. To compare the BAM files the genome is partitioned '
'mapped reads. To compare the BAM files, the genome is partitioned '
'into bins of equal size, then the number of reads found in each BAM '
'file are counted for such bins and finally a summarizing value is '
'file is counted for such bins and finally a summarizing value is '
'reported. This vaule can be the ratio of the number of reads per '
'bin, the log2 of the ratio or the difference. This tool can '
'normalize the number of reads on each BAM file using the SES method '
@@ -36,7 +36,7 @@ def parseArguments(args=None):
'and molecular biology, 11(3). Normalization based on read counts '
'is also available. The output is either a bedgraph or a bigwig file '
'containing the bin location and the resulting comparison values. By '
'default if reads are mated the fragment length reported in the BAM '
'default, if reads are mated, the fragment length reported in the BAM '
'file is used.')

# define the arguments
@@ -60,7 +60,7 @@ def parseArguments(args=None):

parser.add_argument('--bamIndex2', '-bai2',
help='Index for the bam file1. Default is to consider '
'a the path of the bam file adding the .bai suffix.',
'the path of the bam file adding the .bai suffix.',
metavar='bam file index')

parser.add_argument('--scaleFactorsMethod',
@@ -92,7 +92,7 @@ def parseArguments(args=None):
required=False)

parser.add_argument('--ratio',
help='The default output the log2ratio between the '
help='The default is to output the log2ratio between the '
'two samples. The reciprocal ratio returns the '
'the negative of the inverse of the ratio '
'if the ratio is less than 0. The resulting '
@@ -103,14 +103,14 @@ def parseArguments(args=None):
required=False)

parser.add_argument('--normalizeTo1x',
help='only when --ratio subtract Report normalized '
'coverage to 1x sequencing depth. Sequencing dept is '
help='(only when --ratio subtract) Report normalized '
'coverage to 1x sequencing depth. Sequencing depth is '
'defined as the total number of mapped reads*fragment '
'length / effective genome size. To use this option, '
'the effective genome size has to be given. Common '
'values are: mm9: 2150570000, hg19:2451960000, '
'dm3:121400000 and ce10:93260000. The default is '
'not to use any normalization. ',
'not to use any normalization.',
default=None,
type=int,
required=False)
@@ -120,7 +120,7 @@ def parseArguments(args=None):
'normalize the number of reads per bin. The formula '
'is: RPKM (per bin)=#reads per bin / ( # of mapped '
'reads (millions) * bin length (KB) ). This is the '
'defalt normalization method.',
'default normalization method.',
action='store_true',
required=False)

@@ -144,7 +144,7 @@ def parseArguments(args=None):
help='A list of chromosome names '
'separated by comma and limited by quotes, '
'containing those '
'chromosomes that want to be excluded '
'chromosomes that you want to be excluded '
'for computing the normalization. For example, '
' --ignoreForNormalization "chrX, chrM" ')

@@ -173,8 +173,7 @@ def main(args):
"""
The algorithm is composed of two parts.
1. Using the SES or mapped reads method.
Appropiate scaling factors are determined.
1. Using the SES or mapped reads method, appropiate scaling factors are determined.
2. The genome is transversed, scaling the BAM files, and computing
the log ratio/ratio/difference for bins of fixed width
2 changes: 1 addition & 1 deletion bin/bamCorrelate
@@ -232,7 +232,7 @@ def plotCorrelation(corr_matrix, labels, plotFileName, vmax=None,
import scipy.cluster.hierarchy as sch
M = corr_matrix.shape[0]

# set the minumun and maximun values
# set the minimum and maximum values
if vmax is None:
vmax = 1
if vmin is None:
2 changes: 1 addition & 1 deletion bin/correctGCBias
@@ -107,7 +107,7 @@ def getRequiredArgs():
output = parser.add_argument_group('Output options')
output.add_argument('--correctedFile', '-o',
help='Name of the corrected file. The ending will '
'be used to decide the ouput file format. The options '
'be used to decide the output file format. The options '
'are ".bam", ".bw" for a bigWig file, ".bg" for a '
'bedgraph file.',
metavar='FILE',
2 changes: 1 addition & 1 deletion bin/estimateScaleFactor
@@ -78,7 +78,7 @@ def parseArguments(args=None):

parser.add_argument('--numberOfProcessors', '-p',
help='Number of processors to use. The default is '
'to use half the maximun number of processors.',
'to use half the maximum number of processors.',
metavar="INT",
type=numberOfProcessors,
default="max/2",
2 changes: 1 addition & 1 deletion deeptools/SES_scaleFactor.py
@@ -89,7 +89,7 @@ def estimateScaleFactor(bamFilesList, binLength, numberOfSamples,
p = np.sort(num_reads_per_bin[0, :]).cumsum()
q = np.sort(num_reads_per_bin[1, :]).cumsum()

# p[-1] and q[-1] are the maximun values in the arrays.
# p[-1] and q[-1] are the maximum values in the arrays.
# both p and q are normalized by this value
diff = np.abs(p / p[-1] - q / q[-1])
# get the lowest rank for wich the difference is the maximum
2 changes: 1 addition & 1 deletion deeptools/heatmapper.py
@@ -422,7 +422,7 @@ def coverageFromBigWig(bigwig, chrom, zones, binSize, avgType,
"""
uses bigwig file reader from bx-python
to query a region define by chrom and zones.
The ouput is an array that contains the bigwig
The output is an array that contains the bigwig
value per base pair. The summary over bins is
done in a later step when coverageFromArray is called.
This method is more reliable than quering the bins
12 changes: 5 additions & 7 deletions deeptools/parserCommon.py
@@ -101,7 +101,7 @@ def getParentArgParse(args=None, binSize=True):

if binSize:
optional.add_argument('--binSize', '-bs',
help='Size of the bins in bp for the ouput '
help='Size of the bins in bp for the output '
'of the bigwig/bedgraph file.',
metavar="INT bp",
type=int,
@@ -119,7 +119,7 @@ def getParentArgParse(args=None, binSize=True):

optional.add_argument('--numberOfProcessors', '-p',
help='Number of processors to use. Type "max/2" to '
'use half the maximun number of processors or "max" '
'use half the maximum number of processors or "max" '
'to use all available processors.',
metavar="INT",
type=numberOfProcessors,
@@ -322,15 +322,13 @@ def computeMatrixOptArgs(case=['scale-regions', 'reference-point'][0]):
type=int,
metavar='INT bp',
help='Distance upstream of the reference-point '
'selected.',
required=True)
'selected.')
optional.add_argument('--afterRegionStartLength', '-a', '--downstream',
default=1500,
metavar='INT bp',
type=int,
help='Distance downstream of the '
'reference-point selected.',
required=True)
'reference-point selected.')
optional.add_argument('--nanAfterEnd',
action='store_true',
help='If set, any values after the region end '
@@ -420,7 +418,7 @@ def computeMatrixOptArgs(case=['scale-regions', 'reference-point'][0]):
default=1)
optional.add_argument('--numberOfProcessors', '-p',
help='Number of processors to use. Type "max/2" to '
'use half the maximun number of processors or "max" '
'use half the maximum number of processors or "max" '
'to use all available processors.',
metavar="INT",
type=numberOfProcessors,
Binary file added examples/Gal_FAQ_IGV_dataset.png
Binary file modified examples/heatmaps_kmeans_Pol_II_small.png
2 changes: 1 addition & 1 deletion galaxy/bigwigCompare.xml
@@ -62,7 +62,7 @@
<when value="yes">
<param name="binSize" type="integer" value="50" min="1"
label="Bin size in bp"
help="Size of the bins in bp for the ouput of the bigwig/bedgraph file "/>
help="Size of the bins in bp for the output of the bigwig/bedgraph file "/>

<param name="missingDataAsZero" type="boolean" truevalue="yes" falsevalue="no" checked="True"
label ="Treat missing data as zero"
