From e2d3d8f157da1ffc215e29269d50aa8a80abfe87 Mon Sep 17 00:00:00 2001
From: haydensun
Date: Wed, 28 Mar 2018 17:25:40 +0800
Subject: [PATCH] Update README

---
 README.rst | 121 ++++++++++++++++++++++++++++-------------------------
 1 file changed, 65 insertions(+), 56 deletions(-)

diff --git a/README.rst b/README.rst
index 684e9bf..bbd303a 100644
--- a/README.rst
+++ b/README.rst
@@ -14,127 +14,136 @@ MAmotif
 Introduction
 ------------
 
-MAmotif is used to compare two ChIP-seq samples of the same protein from different cell types
-(or conditions, e.g. wild-type vs mutant) and identify transcriptional factors (TFs) associated
-with the cell type-biased binding of this protein as its co-factors, by using TF binding information obtained from
-motif analysis (or from other ChIP-seq data). MAmotif automatically combines MAnorm model to perform quantitative
-comparison on input ChIP-seq samples together with Motif-Scan toolkit to scan ChIP-seq peaks for TF binding motifs,
-and uses a systematic integrative analysis to search for TFs whose binding sites are significantly associated with
-the cell type-biased peaks between two ChIP-seq samples. When applying to ChIP-seq data of histone marks of
-regulatory elements (such as H3K4me3 for active promoters and H3K9/27ac for active promoters and enhancers),
-or DNase/ATAC-seq data, MAmotif can be used to detect cell type-specific regulators .
+**MAmotif** is used to compare two ChIP-seq samples of the same protein from different cell types or conditions
+(e.g. Mutant vs Wild-type) and **identify transcriptional factors (TFs) associated with the cell-type biased binding**
+of this protein as its **co-factors**, by using TF binding information obtained from motif analysis
+(or from other ChIP-seq data).
 
-Documentation
--------------
+MAmotif automatically combines **MAnorm** model to perform quantitative comparison on given ChIP-seq samples together
+with Motif-Scan toolkit to scan ChIP-seq peaks for **TF binding motifs**, and uses a systematic integrative analysis to
+search for TFs whose binding sites are significantly associated with the cell-type biased peaks between two ChIP-seq samples.
 
-To see the full documentation of MAmoitf, please refer to: http://bioinfo.sibs.ac.cn/shaolab/mamotif/index.php
+When applying to ChIP-seq data of histone marks of regulatory elements (such as H3K4me3 for active promoters and
+H3K9/27ac for active promoter/enhancers), or DNase/ATAC-seq data, MAmotif can be used to detect **cell-type specific regulators**.
 
 Workflow
 --------
 
 .. image:: https://github.com/shao-lab/MAmotif/blob/master/docs/source/image/MAmotif_workflow.png
 
+Documentation
+-------------
+
+To see the full documentation of MAmotif, please refer to: http://bioinfo.sibs.ac.cn/shaolab/mamotif/index.php
+
 Installation
 ------------
 
-The latest version release of MAmotif is available at
-`PyPI `__:
+The latest release of MAmotif is available at `PyPI `__:
 
 ::
 
     $ pip install mamotif
 
-MAmoitf uses `setuptools `__ for installation from source code.
-The source code of MAmoitf is hosted on GitHub: https://github.com/shao-lab/MAmotif
+Or you can install MAmotif via conda:
 
-You can clone the repo and execute the following command under source directory:
+**WIP!**
 
 ::
 
-    $ python setup.py install
+    $ conda install -c bioconda mamotif
 
-Usage
------
-
-Build genomes
-^^^^^^^^^^^^^
+MAmotif uses `setuptools `__ for installation from source code.
+The source code of MAmotif is hosted on GitHub: https://github.com/shao-lab/MAmotif
 
-Before you use MAmotif, you need to build the prerequisites for corresponding genome assembly.
+You can clone the repo and execute the following command under source directory:
 
 ::
 
-    $ genomecompile [-h] [-v] -G sequences.fa -o output_dir
+    $ python setup.py install
 
-A directory contaning compiled genome sequence and information would be generated by this command.
+Galaxy Installation
+-------------------
 
-**Note:** You only need run it once for each genome.
+**WIP!**
 
-Build motif PWM (Optional)
-^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-**Note:** MAmoitf provides some preprocessed motif PWM files under data/motif of the MotifScan package.
+Usage
+-----
+
+You need to build some prerequisites before running MAmotif:
+
+Build genomes
+^^^^^^^^^^^^^
 
-IF you have some motifs that have not be included in our pre-complied motif collection, you need to compile on your own by using the following command.
+Preprocess sequences and genome-wide nucleotide frequency for the corresponding genome assembly.
 
 ::
 
-    $ motifcompile –M motif_pwm_demo.txt –g hg19_for_motifscan
+    $ genomecompile [-h] [-v] -G hg19.fa -o hg19_genome
 
--M motif raw matrix file
+**Note:** You only need to run this command once for each genome
 
--g a pre-compiled genome directory generated by genomecompile
+Build motifs (Optional)
+^^^^^^^^^^^^^^^^^^^^^^^
 
-Motif raw matrix file should follow the format as below:
+**Note:** MAmotif provides some preprocessed motif PWM files under **data/motif** of the MotifScan package.
 
-motif id and motif name are followed by a positive weighted matrix, and columns are seperated by tabs.
+Build motif PWM/motif-score cutoff for custom motifs that are not included in our pre-compiled motif collection:
 
 ::
 
-    >MA0599.1 KLF5
-    1429 0 0 3477 0 5051 0 0 0 3915
-    2023 11900 12008 9569 13611 0 13611 13611 13135 5595
-    7572 0 0 0 0 5182 0 0 0 0
-    2587 1711 1603 565 0 3378 0 0 476 4101
+    $ motifcompile [-h] [-v] -M motif_pwm_demo.txt -g hg19_genome -o hg19_motif
 
 run MAmotif
 ^^^^^^^^^^^
 
 ::
 
-    $ mamoitf --p1 sample1_peaks.bed --p2 sample2_peaks.bed --r1 sample1_reads.bed --r2 sample2_reads.bed -g hg19_for_motifscan –m motif_pwm_demo.txt -o sample1_vs_sample2
+    $ mamotif --p1 sample1_peaks.bed --p2 sample2_peaks.bed --r1 sample1_reads.bed --r2 sample2_reads.bed -g hg19_genome
+    -m hg19_motif_p1e-4.txt -o sample1_vs_sample2
 
 **Note:** Using -h/--help for the details of all arguments.
 
-
 Output of MAmotif
 -----------------
 
 After finished running MAmotif, all output files will be written to the directory you specified with "-o" argument.
 
-The main output file will include the following fields:
+Main output
+^^^^^^^^^^^
 
 ::
 
     1.Motif Name
-    2.Target Number: Number of peaks with motif targets
-    3.Average of Target M-value
-    4.Deviation of Target M-value
-    5.Non-target Number: Number of peaks without motif targets
-    6.Average of Non-target M-value
-    7.Deviation of Non-target M-value
-    8.T-test Statistics: T-Statistics for M-values of (peaks with motif targets) against (peaks without motif targets)
-    9.T-test P-value(right-tail)
+    2.Target Number: Number of motif-present peaks
+    3.Average of Target M-value: Average M-value of motif-present peaks
+    4.Deviation of Target M-value: M-value Std of motif-present peaks
+    5.Non-target Number: Number of motif-absent peaks
+    6.Average of Non-target M-value: Average M-value of motif-absent peaks
+    7.Deviation of Non-target M-value: M-value Std of motif-absent peaks
+    8.T-test Statistics: T-Statistics for M-values of motif-present peaks against motif-absent peaks
+    9.T-test P-value: Right-tailed P-value of T-test
     10.T-test P-value By Benjamin correction
     11.RanSum-test Statistics
-    12.RankSum-test P-value(right-tail)
+    12.RankSum-test P-value
     13.RankSum-test P-value By Benjamin correction
-    14.Maximal P-value: Maximal corrected P-value of T-test and RankSum test
+    14.Maximal P-value: Maximal corrected P-value of T-test and RankSum-test
+
+MAnorm output
+^^^^^^^^^^^^^
+
+MAmotif will invoke MAnorm and output the normalization results and MA-plot for samples under comparison.
+
+Motif output
+^^^^^^^^^^^^
 
-MAmotif will also output tables to summarize the motif targets number and motif score of each peak region.
+MAmotif will also output tables to summarize the enrichment of motifs and the motif target number and motif-score
+of each peak region.
 
-If you specified "-s" with MAmotif, it will also output the genome coordinates of every motif targets.
+If you specified "-s" with MAmotif, it will also output the genome coordinates of every motif target site.
 
 License