Dynamic Landscape of Human L1 Transposition Revealed with Functional Data Analysis.

This repository provides the Perl code used in the paper below:

"Dynamic Landscape of Human L1 Transposition Revealed with Functional Data Analysis". (2020) Molecular Biology and Evolution 37 (12), 3576-3600. D Chen, MA Cremona, Z Qi et al.
URL: https://pubmed.ncbi.nlm.nih.gov/32722770/

Disclaimer

The pipeline has been tailored for the Washington University HTCF computing environment (https://htcf.wustl.edu/docs/), which uses the Slurm queueing system (https://slurm.schedmd.com/tutorials.html). No guarantees are made about other systems, setups, or configurations.

Main functions

The main functions are:

1. Align the LINE1 (L1) reads to the human genome to identify the integration sites.
2. Cluster the L1 integration sites using a custom window size (default: 500 bp).
3. Perform de novo motif discovery on the L1 integration sites to identify potential binding proteins.
4. Run motif enrichment analysis for the specified transcription factor(s).
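
The clustering step can be illustrated with a minimal Python sketch. This is not the repository's Perl implementation. It assumes that sites are given as (chromosome, position) pairs and that a site joins the current cluster when it lies within the window of the cluster's first site. The function name `cluster_sites` and this grouping rule are assumptions made for illustration; the actual scripts may define the window differently.

```python
from collections import defaultdict

def cluster_sites(sites, window=500):
    """Group (chrom, pos) integration sites into clusters per chromosome.

    Assumption: a site joins the current cluster if it lies within
    `window` bp of the cluster's first site; otherwise a new cluster starts.
    Returns a list of (chrom, start, end, site_count) tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in sites:
        by_chrom[chrom].append(pos)

    clusters = []
    for chrom in sorted(by_chrom):
        current = []
        for pos in sorted(by_chrom[chrom]):
            # Close the current cluster once the site falls outside the window.
            if current and pos - current[0] > window:
                clusters.append((chrom, current[0], current[-1], len(current)))
                current = []
            current.append(pos)
        if current:
            clusters.append((chrom, current[0], current[-1], len(current)))
    return clusters

# Hypothetical example data, not taken from the paper.
example = [("chr1", 100), ("chr1", 350), ("chr1", 900), ("chr2", 50)]
print(cluster_sites(example))
# [('chr1', 100, 350, 2), ('chr1', 900, 900, 1), ('chr2', 50, 50, 1)]
```

Anchoring the window at the first site of each cluster keeps every cluster at most `window` bp wide. Chaining on the previous site instead would allow clusters to grow beyond the window.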

Instructions

These scripts are wrapped by a master Perl script. Download all the scripts into one folder and run the wrapper script 'L1_bar_wrapperV3.pl'. Note: the prefix of the output name is taken from the first hyphen ("-") delimited field of the input name.
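
The naming rule in the note can be sketched in Python as follows. This is an illustration rather than the wrapper's actual code. The function name `output_prefix`, the example file names, and the choice to ignore the directory part of the path are assumptions.

```python
import os

def output_prefix(input_path):
    """Return the text before the first hyphen in the input file name."""
    # Assumption: only the file name, not its directory, is considered.
    name = os.path.basename(input_path)
    # Split once on the first hyphen; with no hyphen the whole name is returned.
    return name.split("-", 1)[0]

# Hypothetical file names for illustration.
print(output_prefix("sampleA-rep1-R1.fq"))   # sampleA
print(output_prefix("/data/run7-L1-R2.fq"))  # run7
```

Under this reading, input files that share the same leading field would map to the same output prefix, so distinct samples should differ before the first hyphen.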

Usage: perl L1_bar_wrapperV3.pl <read1.fq> <read2.fq> <barcode> <genome_aligner> <TF_to_scan>

For <genome_aligner>: options can be combined without spaces in between (e.g. 12, 123 or 1234). The genome database is hg19.

1. bowtie2 for read2 single-end alignment
2. bowtie2 for paired-end alignment
3. novoalign for read2 single-end alignment
4. novoalign for paired-end alignment
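
As a hedged illustration of how a combined option string such as 123 could be interpreted, the Python sketch below maps each digit to one alignment mode. It assumes that digits 1 to 4 refer to the four modes in the order listed above. The names `ALIGNER_MODES` and `parse_aligner_options` are hypothetical and do not come from the wrapper script.

```python
# Assumed mapping: digits follow the order of the modes listed in the README.
ALIGNER_MODES = {
    "1": ("bowtie2", "read2 single-end"),
    "2": ("bowtie2", "paired-end"),
    "3": ("novoalign", "read2 single-end"),
    "4": ("novoalign", "paired-end"),
}

def parse_aligner_options(option_string):
    """Translate a string like '123' into a list of (aligner, mode) pairs."""
    modes = []
    for digit in option_string:
        if digit not in ALIGNER_MODES:
            raise ValueError(f"unknown aligner option: {digit!r}")
        # Skip repeated digits so each mode is run at most once.
        if ALIGNER_MODES[digit] not in modes:
            modes.append(ALIGNER_MODES[digit])
    return modes

print(parse_aligner_options("12"))
# [('bowtie2', 'read2 single-end'), ('bowtie2', 'paired-end')]
```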

For <TF_to_scan>: underscore-delimited transcription factors to scan.
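
A minimal Python sketch of splitting such an argument is shown below. The function name `parse_tf_list` and the example factor names are illustrative assumptions; how the scripts actually match names to motifs is not described here.

```python
def parse_tf_list(tf_argument):
    """Split an underscore-delimited argument into transcription factor names."""
    # Empty fields (e.g. from a trailing underscore) are dropped.
    return [tf for tf in tf_argument.split("_") if tf]

# Hypothetical argument for illustration.
print(parse_tf_list("CTCF_YY1_SP1"))  # ['CTCF', 'YY1', 'SP1']
```

One consequence of this delimiter is that a factor name containing an underscore cannot be passed as a single entry.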

Peak Visualization

Plot the genome-wide distribution of features with the custom R plotting functions in cyto_plotV2.R.
A simplified cytoband.txt (cytoband_hg19_2) is used, which has no grey regions in the chromosome plot.
genomeplot.pdf


qizongtai/LINE1_gemomic_alignment_and_clustering
