MAmotif

Introduction

MAmotif compares two ChIP-seq samples of the same protein from different cell types (or conditions, e.g. wild-type vs mutant) and identifies transcription factors (TFs) associated with the cell type-biased binding of this protein as its co-factors, using TF binding information obtained from motif analysis (or from other ChIP-seq data). MAmotif automatically combines the MAnorm model, which performs a quantitative comparison of the input ChIP-seq samples, with the Motif-Scan toolkit, which scans ChIP-seq peaks for TF binding motifs. It then uses a systematic integrative analysis to search for TFs whose binding sites are significantly associated with the cell type-biased peaks between the two ChIP-seq samples. When applied to ChIP-seq data of histone marks of regulatory elements (such as H3K4me3 for active promoters and H3K9/27ac for active promoters and enhancers), or to DNase/ATAC-seq data, MAmotif can be used to detect cell type-specific regulators.

Documentation

For the full documentation of MAmotif, please refer to: http://bioinfo.sibs.ac.cn/shaolab/mamotif/index.php

Workflow

https://github.com/shao-lab/MAmotif/blob/master/image/MAmotif_workflow.png

Installation

The latest release of MAmotif is available on PyPI:

$ pip install mamotif

MAmotif uses setuptools for installation from source code. The source code of MAmotif is hosted on GitHub: https://github.com/shao-lab/MAmotif

You can clone the repo and execute the following command in the source directory:

$ python setup.py install

Usage

Build genomes

Before you use MAmotif, you need to build the prerequisite files for the corresponding genome assembly.

$ genomecompile [-h] [-v] -G sequences.fa -o output_dir

This command generates a directory containing the compiled genome sequence and related information.

Note: You only need to run it once for each genome.

Build motif PWM (Optional)

Note: MAmotif provides some preprocessed motif PWM files under data/motif of the MotifScan package.

If you have motifs that are not included in our pre-compiled motif collection, you need to compile them yourself with the following command.

$ motifcompile -M motif_pwm_demo.txt -g hg19_for_motifscan

-M motif raw matrix file

-g a pre-compiled genome directory generated by genomecompile

The raw motif matrix file should follow the format below:

Each motif starts with a header line giving the motif ID and motif name, followed by a position weight matrix whose columns are separated by tabs.

>MA0599.1 KLF5
1429 0 0 3477 0 5051 0 0 0 3915
2023 11900 12008 9569 13611 0 13611 13611 13135 5595
7572 0 0 0 0 5182 0 0 0 0
2587 1711 1603 565 0 3378 0 0 476 4101
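Before passing a custom matrix file to motifcompile, it can help to check that the file is well formed. The following is a minimal sketch of such a check, not part of MAmotif or MotifScan: the function name `parse_motif_matrices` is hypothetical, the parser splits on any whitespace rather than strictly on tabs, and it assumes every motif has exactly four rows of equal length, as in the KLF5 example above.

```python
def parse_motif_matrices(text):
    """Parse '>ID NAME' headers followed by matrix rows into a list of dicts.

    Assumption (taken from the example, not from MotifScan itself):
    each motif has exactly four rows with the same number of columns.
    """
    motifs = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines between motifs
        if line.startswith(">"):
            parts = line[1:].split(None, 1)
            if not parts:
                raise ValueError("header line has no motif ID")
            motif_id = parts[0]
            # Fall back to the ID when no separate name is given.
            name = parts[1].strip() if len(parts) > 1 else motif_id
            current = {"id": motif_id, "name": name, "rows": []}
            motifs.append(current)
        elif current is None:
            raise ValueError("matrix row found before any '>' header")
        else:
            current["rows"].append([float(value) for value in line.split()])

    for motif in motifs:
        rows = motif["rows"]
        if len(rows) != 4:
            raise ValueError(f"{motif['id']}: expected 4 rows, got {len(rows)}")
        if len({len(row) for row in rows}) != 1:
            raise ValueError(f"{motif['id']}: rows differ in length")
    return motifs


# The KLF5 example from this README, written with tab-separated columns.
EXAMPLE = "\n".join([
    ">MA0599.1 KLF5",
    "\t".join("1429 0 0 3477 0 5051 0 0 0 3915".split()),
    "\t".join("2023 11900 12008 9569 13611 0 13611 13611 13135 5595".split()),
    "\t".join("7572 0 0 0 0 5182 0 0 0 0".split()),
    "\t".join("2587 1711 1603 565 0 3378 0 0 476 4101".split()),
])

if __name__ == "__main__":
    for motif in parse_motif_matrices(EXAMPLE):
        print(motif["id"], motif["name"], len(motif["rows"][0]), "columns")
```

A file that passes this check is only known to match the layout shown above; motifcompile may apply further validation of its own.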

Run MAmotif

$ mamotif --p1 sample1_peaks.bed --p2 sample2_peaks.bed --r1 sample1_reads.bed --r2 sample2_reads.bed -g hg19_for_motifscan -m motif_pwm_demo.txt -o sample1_vs_sample2

Note: Use -h/--help for details of all arguments.

Output of MAmotif

After MAmotif finishes running, all output files are written to the directory you specified with the "-o" argument.

The main output file will include the following fields:

1. Motif Name
2. Target Number: Number of peaks with motif targets
3. Average of Target M-value
4. Deviation of Target M-value
5. Non-target Number: Number of peaks without motif targets
6. Average of Non-target M-value
7. Deviation of Non-target M-value
8. T-test Statistics: T-statistic for M-values of peaks with motif targets against peaks without motif targets
9. T-test P-value (right-tail)
10. T-test P-value By Benjamin correction
11. RankSum-test Statistics
12. RankSum-test P-value (right-tail)
13. RankSum-test P-value By Benjamin correction
14. Maximal P-value: Maximal corrected P-value of T-test and RankSum test
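To make the relationship between these fields concrete, here is a minimal sketch of how fields 2 to 7 and field 14 could be derived for one motif from per-peak data. It is an illustration only, not MAmotif's implementation: the function names and inputs are hypothetical, "Deviation" is assumed to mean the sample standard deviation, and the two corrected P-values (fields 10 and 13) are taken as given rather than computed.

```python
from statistics import mean, stdev


def summarize_motif(m_values, target_counts):
    """Split peaks by motif presence and summarize their M-values.

    m_values      -- M-value of each peak (hypothetical input)
    target_counts -- number of motif targets found in each peak
    """
    targets = [m for m, n in zip(m_values, target_counts) if n > 0]
    non_targets = [m for m, n in zip(m_values, target_counts) if n == 0]

    def describe(values):
        return {
            "number": len(values),
            "average": mean(values) if values else None,
            # Assumption: "Deviation" is the sample standard deviation,
            # which needs at least two values.
            "deviation": stdev(values) if len(values) > 1 else None,
        }

    return {"target": describe(targets), "non_target": describe(non_targets)}


def maximal_p_value(t_test_corrected_p, ranksum_corrected_p):
    """Field 14: the larger of the two corrected P-values."""
    return max(t_test_corrected_p, ranksum_corrected_p)


if __name__ == "__main__":
    summary = summarize_motif([1.0, 2.0, 3.0, -1.0, -2.0], [2, 1, 3, 0, 0])
    print(summary)
    print(maximal_p_value(0.01, 0.03))
```

Because field 14 takes the larger of the two corrected P-values, a motif gets a small Maximal P-value only when both the T-test and the RankSum test support it.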

MAmotif will also output tables that summarize the number of motif targets and the motif score for each peak region.

If you specify "-s", MAmotif will also output the genomic coordinates of every motif target.

License

BSD 3-Clause License