fix merge conflict
fidelram committed Feb 5, 2014
2 parents b555f6c + abf83ee commit c6a4a33
Showing 18 changed files with 301 additions and 62 deletions.
1 change: 0 additions & 1 deletion README.txt

This file was deleted.

197 changes: 197 additions & 0 deletions README.txt
@@ -0,0 +1,197 @@
======================================================================
deepTools
======================================================================
### user-friendly tools for the normalization and visualization of deep-sequencing data


deepTools addresses the challenge of handling the large amounts of data that are now routinely generated by DNA sequencing centers. To do so, deepTools provides modules that process mapped reads into coverage files in the standard bedGraph and bigWig formats. This allows the creation of **normalized coverage files** as well as the comparison of two files (for example, treatment and control). Finally, using such normalized and standardized files, multiple **visualizations** can be created to identify enrichments relative to functional annotations of the genome. For a gallery of images that can be produced, see
http://f1000.com/posters/browse/summary/1094053
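
A typical workflow on the command line looks like the sketch below. The tool names (`bamCoverage`, `bamCompare`) come from this repository; the input/output flags used here (`-b`, `-b1`, `-b2`, `-o`) are assumptions for illustration, so please check each tool's `--help` for the exact options of your version.

    # create a bigWig coverage track from a single BAM file, using 50 bp bins
    $ bamCoverage -b treatment.bam -o treatment_coverage.bw --binSize 50

    # compare a treatment and a control BAM file; by default bamCompare
    # reports the log2 ratio of the read counts per bin
    $ bamCompare -b1 treatment.bam -b2 control.bam -o treatment_vs_control.bw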

For support, questions, or feature requests contact: [email protected]

![gallery](https://raw.github.com/fidelram/deepTools/master/examples/collage.png)

Our [wiki page](https://github.com/fidelram/deepTools/wiki) contains more information on **why we built deepTools**, details on the **individual tool scopes and usages**, and an introduction to our deepTools Galaxy web server. It also contains an [FAQ section](FAQ) that we update regularly. For more specific troubleshooting, feedback, and tool suggestions, contact us via [email protected].


-------------------------------------------------------------------------------------------------------------------

<a name="installation"/></a>
Installation
---------------

deepTools are available for:

* command line usage
* integration into Galaxy servers

Details on the installation routines can be found below:

[Installation from source](#linux)

[Installation on a Mac](#mac)

[Troubleshooting](#trouble)

[Galaxy installation](#galaxy)


<a name="linux"/></a>
### Installation from source (Linux, command line)

The easiest way to install deepTools is by __downloading the package and installing it with Python's pip__ or easy_install tools:

Requirements: Python 2.7, numpy, scipy installed

Commands:

$ cd ~
$ export PYTHONPATH=$PYTHONPATH:~/lib/python2.7/site-packages
$ export PATH=$PATH:~/bin:~/.local/bin

If pip is not already available, install with:

$ easy_install --prefix=~ pip

Install deepTools and dependencies with pip:

$ pip install --user deeptools
Done.
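
To verify that the executables were installed and are found on your `PATH` (as set above), you can ask any deepTools program for its help text; this is just a quick sanity check:

    $ bamCoverage --help
    $ bamCompare --help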




__Another option is to clone the repository:__

$ git clone https://github.com/fidelram/deepTools

Then go to the deepTools directory, edit the `deepTools.cfg`
file, and run the install script:

$ cd deepTools
$ vim deeptools/config/deepTools.cfg
$ python setup.py install


By default, the script will install the Python library and executables
globally, which means you need to be root or an administrator of
the machine to complete the installation. If you need a nonstandard
install prefix, or any other nonstandard options, the install script
accepts many command-line options; to list them, run:

$ python setup.py --help

To install under a specific location, use:

$ python setup.py install --prefix <target directory>
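
When installing to a custom prefix like this, Python and your shell also need to know where the package and its scripts ended up. A sketch, following the exports shown above (the `python2.7` subdirectory depends on your Python version):

    $ export PYTHONPATH=$PYTHONPATH:<target directory>/lib/python2.7/site-packages
    $ export PATH=$PATH:<target directory>/bin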

<a name="mac"></a>
### Installation on a Mac

Although the installation of deepTools itself is quite simple,
installing the required modules SciPy and NumPy demands
a bit of extra work.

The easiest way to install them is together with the
[Anaconda Scientific Python Distribution][]. After installation, open
a terminal ("Applications" --> "Terminal") and type:

$ pip install deeptools

If you prefer to install the dependencies individually, follow
these steps:

Requirement: Python 2.7 installed

Download the packages and install them using dmg images:
- http://sourceforge.net/projects/numpy/files/NumPy/
- http://sourceforge.net/projects/scipy/files/scipy/

Then install deepTools via the terminal ("Applications" --> "Terminal"):

$ cd ~
$ export PYTHONPATH=$PYTHONPATH:~/lib/python2.7/site-packages
$ export PATH=$PATH:~/bin:~/.local/bin:~/Library/Python/2.7/bin

If pip is not already available, install with:

$ easy_install --prefix=~ pip

Install deepTools and dependencies with pip:

$ pip install --user deeptools


<a name="trouble"/></a>
##### Troubleshooting
The easy_install command is provided by the python package setuptools.
You can download the package from https://pypi.python.org/pypi/setuptools

$ wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -O - | python

or, for a user-specific installation:

$ wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py
$ python ez_setup.py --user

NumPy/SciPy installation instructions:
http://www.scipy.org/install.html
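
If pip is already set up, the two dependencies can usually also be installed with it; this is only a sketch and assumes a working compiler toolchain is available for building the packages:

    $ pip install --user numpy scipy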

<a name="galaxy"/></a>
#### Galaxy Installation

deepTools can be easily integrated into [Galaxy](http://galaxyproject.org). All wrappers and dependencies are
available in the [Galaxy Tool Shed](http://toolshed.g2.bx.psu.edu/view/bgruening/deeptools).


##### Installation via Galaxy API (recommended)

First, generate an [API Key](http://wiki.galaxyproject.org/Admin/API#Generate_the_Admin_Account_API_Key) for your admin
user and run the installation script:

python ./scripts/api/install_tool_shed_repositories.py --api YOUR_API_KEY -l http://localhost:8080 --url http://toolshed.g2.bx.psu.edu/ -o bgruening -r <revision> --name deeptools --tool-deps --repository-deps --panel-section-name deepTools

The -r argument specifies the version of deepTools. You can get the latest revision number from the test tool shed or with the following command:

hg identify http://toolshed.g2.bx.psu.edu/view/bgruening/deeptools

You can watch the installation status under: Top Panel → Admin → Manage installed tool shed repositories


##### Installation via web browser

- Go to the [admin page](http://localhost:8080/admin)
- Select *Search and browse tool sheds*
- Navigate to Galaxy tool shed → Sequence Analysis → deeptools
- Install deeptools

Remember: for support, questions, or feature requests, contact [email protected]

------------------------------------
[BAM]: https://docs.google.com/document/d/1Iv9QnuRYWCtV_UCi4xoXxEfmSZYQNyYJPNsFHnvv9C0/edit?usp=sharing "binary version of a SAM file; contains all information about aligned reads"
[SAM]: https://docs.google.com/document/d/1Iv9QnuRYWCtV_UCi4xoXxEfmSZYQNyYJPNsFHnvv9C0/edit?usp=sharing "text file containing all information about aligned reads"
[bigWig]: https://docs.google.com/document/d/1Iv9QnuRYWCtV_UCi4xoXxEfmSZYQNyYJPNsFHnvv9C0/edit?usp=sharing "binary version of a bedGraph file; contains genomic intervals and corresponding scores, e.g. average read numbers per 50 bp"
[bedGraph]: https://docs.google.com/document/d/1Iv9QnuRYWCtV_UCi4xoXxEfmSZYQNyYJPNsFHnvv9C0/edit?usp=sharing "text file that contains genomic intervals and corresponding scores, e.g. average read numbers per 50 bp"
[FASTQ]: https://docs.google.com/document/d/1Iv9QnuRYWCtV_UCi4xoXxEfmSZYQNyYJPNsFHnvv9C0/edit?usp=sharing "text file of raw reads (almost straight out of the sequencer)"

[bamCorrelate]: https://github.com/fidelram/deepTools/wiki/QC#wiki-bamCorrelate
[bamFingerprint]: https://github.com/fidelram/deepTools/wiki/QC#wiki-bamFingerprint
[computeGCBias]: https://github.com/fidelram/deepTools/wiki/QC#wiki-computeGCbias
[bamCoverage]: https://github.com/fidelram/deepTools/wiki/Normalizations#wiki-bamCoverage
[bamCompare]: https://github.com/fidelram/deepTools/wiki/Normalizations#wiki-bamCompare
[computeMatrix]: https://github.com/fidelram/deepTools/wiki/Visualizations
[heatmapper]: https://github.com/fidelram/deepTools/wiki/Visualizations
[profiler]: https://github.com/fidelram/deepTools/wiki/Visualizations

[Benjamini and Speed]: http://nar.oxfordjournals.org/content/40/10/e72 "Nucleic Acids Research (2012)"
[Diaz et al.]: http://www.degruyter.com/view/j/sagmb.2012.11.issue-3/1544-6115.1750/1544-6115.1750.xml "Stat. Appl. Gen. Mol. Biol. (2012)"
[Anaconda Scientific Python Distribution]: https://store.continuum.io/cshop/anaconda/

This tool suite is developed by the [Bioinformatics Facility](http://www1.ie-freiburg.mpg.de/bioinformaticsfac) at the [Max Planck Institute for Immunobiology and Epigenetics, Freiburg](http://www1.ie-freiburg.mpg.de/).

[Wiki Start Page](https://github.com/fidelram/deepTools/wiki) | [deepTools Galaxy](http://deeptools.ie-freiburg.mpg.de) | [FAQ](https://github.com/fidelram/deepTools/wiki/FAQ)
23 changes: 11 additions & 12 deletions bin/bamCompare
@@ -25,9 +25,9 @@ def parseArguments(args=None):
parents=[parentParser, bamParser, outputParser],
formatter_class=argparse.ArgumentDefaultsHelpFormatter,
description='This tool compares two BAM files based on the number of '
'mapped reads. To compare the BAM files the genome is partitioned '
'mapped reads. To compare the BAM files, the genome is partitioned '
'into bins of equal size, then the number of reads found in each BAM '
'file are counted for such bins and finally a summarizing value is '
'file is counted for such bins and finally a summarizing value is '
'reported. This vaule can be the ratio of the number of reads per '
'bin, the log2 of the ratio or the difference. This tool can '
'normalize the number of reads on each BAM file using the SES method '
@@ -36,7 +36,7 @@ def parseArguments(args=None):
'and molecular biology, 11(3). Normalization based on read counts '
'is also available. The output is either a bedgraph or a bigwig file '
'containing the bin location and the resulting comparison values. By '
'default if reads are mated the fragment length reported in the BAM '
'default, if reads are mated, the fragment length reported in the BAM '
'file is used.')

# define the arguments
@@ -60,7 +60,7 @@ def parseArguments(args=None):

parser.add_argument('--bamIndex2', '-bai2',
help='Index for the bam file1. Default is to consider '
'a the path of the bam file adding the .bai suffix.',
'the path of the bam file adding the .bai suffix.',
metavar='bam file index')

parser.add_argument('--scaleFactorsMethod',
@@ -92,7 +92,7 @@ def parseArguments(args=None):
required=False)

parser.add_argument('--ratio',
help='The default output the log2ratio between the '
help='The default is to output the log2ratio between the '
'two samples. The reciprocal ratio returns the '
'the negative of the inverse of the ratio '
'if the ratio is less than 0. The resulting '
@@ -103,14 +103,14 @@ def parseArguments(args=None):
required=False)

parser.add_argument('--normalizeTo1x',
help='only when --ratio subtract Report normalized '
'coverage to 1x sequencing depth. Sequencing dept is '
help='(only when --ratio subtract) Report normalized '
'coverage to 1x sequencing depth. Sequencing depth is '
'defined as the total number of mapped reads*fragment '
'length / effective genome size. To use this option, '
'the effective genome size has to be given. Common '
'values are: mm9: 2150570000, hg19:2451960000, '
'dm3:121400000 and ce10:93260000. The default is '
'not to use any normalization. ',
'not to use any normalization.',
default=None,
type=int,
required=False)
@@ -120,7 +120,7 @@ def parseArguments(args=None):
'normalize the number of reads per bin. The formula '
'is: RPKM (per bin)=#reads per bin / ( # of mapped '
'reads (millions) * bin length (KB) ). This is the '
'defalt normalization method.',
'default normalization method.',
action='store_true',
required=False)

@@ -144,7 +144,7 @@ def parseArguments(args=None):
help='A list of chromosome names '
'separated by comma and limited by quotes, '
'containing those '
'chromosomes that want to be excluded '
'chromosomes that you want to be excluded '
'for computing the normalization. For example, '
' --ignoreForNormalization "chrX, chrM" ')

@@ -173,8 +173,7 @@ def main(args):
"""
The algorithm is composed of two parts.
1. Using the SES or mapped reads method.
Appropiate scaling factors are determined.
1. Using the SES or mapped reads method, appropiate scaling factors are determined.
2. The genome is transversed, scaling the BAM files, and computing
the log ratio/ratio/difference for bins of fixed width
2 changes: 1 addition & 1 deletion bin/bamCorrelate
@@ -232,7 +232,7 @@ def plotCorrelation(corr_matrix, labels, plotFileName, vmax=None,
import scipy.cluster.hierarchy as sch
M = corr_matrix.shape[0]

# set the minumun and maximun values
# set the minimum and maximum values
if vmax is None:
vmax = 1
if vmin is None:
2 changes: 1 addition & 1 deletion bin/correctGCBias
@@ -107,7 +107,7 @@ def getRequiredArgs():
output = parser.add_argument_group('Output options')
output.add_argument('--correctedFile', '-o',
help='Name of the corrected file. The ending will '
'be used to decide the ouput file format. The options '
'be used to decide the output file format. The options '
'are ".bam", ".bw" for a bigWig file, ".bg" for a '
'bedgraph file.',
metavar='FILE',
2 changes: 1 addition & 1 deletion bin/estimateScaleFactor
@@ -78,7 +78,7 @@ def parseArguments(args=None):

parser.add_argument('--numberOfProcessors', '-p',
help='Number of processors to use. The default is '
'to use half the maximun number of processors.',
'to use half the maximum number of processors.',
metavar="INT",
type=numberOfProcessors,
default="max/2",
2 changes: 1 addition & 1 deletion deeptools/SES_scaleFactor.py
@@ -89,7 +89,7 @@ def estimateScaleFactor(bamFilesList, binLength, numberOfSamples,
p = np.sort(num_reads_per_bin[0, :]).cumsum()
q = np.sort(num_reads_per_bin[1, :]).cumsum()

# p[-1] and q[-1] are the maximun values in the arrays.
# p[-1] and q[-1] are the maximum values in the arrays.
# both p and q are normalized by this value
diff = np.abs(p / p[-1] - q / q[-1])
# get the lowest rank for wich the difference is the maximum
2 changes: 1 addition & 1 deletion deeptools/heatmapper.py
@@ -422,7 +422,7 @@ def coverageFromBigWig(bigwig, chrom, zones, binSize, avgType,
"""
uses bigwig file reader from bx-python
to query a region define by chrom and zones.
The ouput is an array that contains the bigwig
The output is an array that contains the bigwig
value per base pair. The summary over bins is
done in a later step when coverageFromArray is called.
This method is more reliable than quering the bins
12 changes: 5 additions & 7 deletions deeptools/parserCommon.py
@@ -101,7 +101,7 @@ def getParentArgParse(args=None, binSize=True):

if binSize:
optional.add_argument('--binSize', '-bs',
help='Size of the bins in bp for the ouput '
help='Size of the bins in bp for the output '
'of the bigwig/bedgraph file.',
metavar="INT bp",
type=int,
@@ -119,7 +119,7 @@ def getParentArgParse(args=None, binSize=True):

optional.add_argument('--numberOfProcessors', '-p',
help='Number of processors to use. Type "max/2" to '
'use half the maximun number of processors or "max" '
'use half the maximum number of processors or "max" '
'to use all available processors.',
metavar="INT",
type=numberOfProcessors,
@@ -322,15 +322,13 @@ def computeMatrixOptArgs(case=['scale-regions', 'reference-point'][0]):
type=int,
metavar='INT bp',
help='Distance upstream of the reference-point '
'selected.',
required=True)
'selected.')
optional.add_argument('--afterRegionStartLength', '-a', '--downstream',
default=1500,
metavar='INT bp',
type=int,
help='Distance downstream of the '
'reference-point selected.',
required=True)
'reference-point selected.')
optional.add_argument('--nanAfterEnd',
action='store_true',
help='If set, any values after the region end '
@@ -420,7 +418,7 @@ def computeMatrixOptArgs(case=['scale-regions', 'reference-point'][0]):
default=1)
optional.add_argument('--numberOfProcessors', '-p',
help='Number of processors to use. Type "max/2" to '
'use half the maximun number of processors or "max" '
'use half the maximum number of processors or "max" '
'to use all available processors.',
metavar="INT",
type=numberOfProcessors,
Binary file added examples/Gal_FAQ_IGV_dataset.png
Binary file modified examples/heatmaps_kmeans_Pol_II_small.png
2 changes: 1 addition & 1 deletion galaxy/bigwigCompare.xml
@@ -62,7 +62,7 @@
<when value="yes">
<param name="binSize" type="integer" value="50" min="1"
label="Bin size in bp"
help="Size of the bins in bp for the ouput of the bigwig/bedgraph file "/>
help="Size of the bins in bp for the output of the bigwig/bedgraph file "/>

<param name="missingDataAsZero" type="boolean" truevalue="yes" falsevalue="no" checked="True"
label ="Treat missing data as zero"
