TFNet

TFNet aims to decipher interactions among transcription factors (TFs), ranging from co-binding to rejection, with a deep learning model.

TFNet integrates two core modules, 'TF and DNA interaction' and 'TFs interaction', and takes chromatin accessibility and DNA sequence information into account to learn TF binding patterns. In addition, a commonly used method, motif scanning, is applied to infer motif-dependent TFs for further analysis.

Workflow

Genomic sequence data

# Get the size of each chromosome; you may need to filter out some chromosomes from hg38.chrom.sizes.reduced
samtools faidx data/genome/genome.fa
cut -f 1,2 data/genome/genome.fa.fai > data/hg38.chrom.sizes.reduced
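The chromosome filtering mentioned in the comment above could be sketched as follows. This is a minimal example, not the repository's own code: the function name `filter_chrom_sizes` and the choice to keep only chr1 to chr22 and chrX are assumptions, so adjust the pattern to the chromosomes your analysis needs.

```python
import re

# Assumption: keep chr1-chr22 and chrX only; alt/random/unplaced contigs are dropped.
CANONICAL = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X)$")


def filter_chrom_sizes(lines, pattern=CANONICAL):
    """Return (chrom, size) pairs whose chromosome name matches `pattern`."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        chrom, size = line.split("\t")[:2]
        if pattern.match(chrom):
            kept.append((chrom, int(size)))
    return kept


# Example lines in the two-column format written by `cut -f 1,2 genome.fa.fai`.
example = [
    "chr1\t248956422",
    "chr1_KI270706v1_random\t175055",
    "chrX\t156040895",
    "chrUn_KI270302v1\t2274",
]
filtered = filter_chrom_sizes(example)
for chrom, size in filtered:
    print(f"{chrom}\t{size}")
```

To apply it to the real file, pass an open file handle for data/hg38.chrom.sizes.reduced as `lines` and write the result back in the same tab-separated format.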

Genomic sequence is provided in FASTA format. You can download these files from here. The blacklist can be downloaded from here.

Preprocessing chromatin accessibility data

# If you have replicates for the same cell type, you should first merge them with samtools
samtools merge -o merge.bam rep1.bam rep2.bam
samtools index merge.bam
# Run bamCoverage from deepTools to generate the required .bw file
bamCoverage --bam merge.bam -o merge.bw -p 4 --binSize 1 --normalizeUsing RPGC --effectiveGenomeSize 2864785220 --ignoreForNormalization chrX chrM

The value for --effectiveGenomeSize can be found here
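To illustrate why the effective genome size matters: as described in the deepTools documentation, RPGC (1x normalization) divides the signal by the average coverage, which is roughly (mapped reads × fragment length) / effective genome size. The sketch below only reproduces that arithmetic. The function name and the toy numbers are hypothetical, and bamCoverage itself handles details such as the chromosomes excluded via --ignoreForNormalization.

```python
def rpgc_scale_factor(mapped_reads, fragment_length, effective_genome_size):
    """Factor that rescales raw coverage to an average of 1x.

    average coverage = mapped_reads * fragment_length / effective_genome_size
    scale factor     = 1 / average coverage
    """
    coverage = mapped_reads * fragment_length / effective_genome_size
    return 1.0 / coverage


# Toy numbers (not real data): 1M reads of 200 bp on a 100 Mb effective genome
# give 2x average coverage, so raw signal is multiplied by 0.5.
factor = rpgc_scale_factor(1_000_000, 200, 100_000_000)
print(factor)
```

A wrong effective genome size therefore rescales every bin by the same wrong factor, which is why the value should match the genome build and read length in use.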

Preprocessing ChIP-seq data

# Requires the .bed files and a chip.txt listing the file path and name of every TF
ls data/tf_chip/*bed.gz | awk -F '\_|\.' '{OFS="\t"}{print $0,$2}' > data/tf_chip/chip.txt
python preprocess_data.py -d configure/data.yaml -m configure/tfnet.yaml
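The awk command above splits the full path on `_` and `.`, so which field holds the TF name depends on your directory and file naming. As an alternative, a small Python sketch can build the same two-column chip.txt (file path, TF name). The naming rule used here, TF name equals the base name up to the first `_` or `.`, is an assumption and not taken from the repository; the helper names are hypothetical.

```python
import os


def tf_name_from_path(path):
    """Assumed naming rule: TF name is the base name up to the first '_' or '.'."""
    base = os.path.basename(path)
    for sep in ("_", "."):
        base = base.split(sep)[0]
    return base


def chip_table(paths):
    """Return tab-separated 'path<TAB>tf_name' lines, sorted by path."""
    return [f"{p}\t{tf_name_from_path(p)}" for p in sorted(paths)]


# Hypothetical file names for illustration only.
rows = chip_table(["data/tf_chip/MAX.bed.gz", "data/tf_chip/CTCF_H1.bed.gz"])
print("\n".join(rows))
```

In practice the paths could come from `glob.glob("data/tf_chip/*bed.gz")`, with the rows written to data/tf_chip/chip.txt.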

Adjust the model config file to select the model structure and set the class weights
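One common way to choose class weights for imbalanced binding labels is the negative-to-positive ratio per TF. The sketch below shows that calculation only. How TFNet's config expects the weights to be written is not stated here, so the function name, the input layout (one 0/1 row per sample, one column per TF) and the weighting scheme itself are all assumptions.

```python
def positive_class_weights(label_rows):
    """Per-TF weight for the positive class: (#negatives / #positives).

    `label_rows` is a list of equal-length 0/1 lists, one row per sample
    and one column per TF. TFs without any positive get a neutral weight of 1.0.
    """
    n_rows = len(label_rows)
    n_tfs = len(label_rows[0])
    weights = []
    for j in range(n_tfs):
        pos = sum(row[j] for row in label_rows)
        neg = n_rows - pos
        weights.append(neg / pos if pos else 1.0)
    return weights


# Toy labels: TF 0 is bound in 2 of 4 samples, TF 1 in 1 of 4.
labels = [[1, 0], [1, 0], [0, 1], [0, 0]]
weights = positive_class_weights(labels)
print(weights)
```

Rarely bound TFs get larger weights, which counteracts the tendency of the loss to be dominated by negatives.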

TFNet training and testing

python main.py -d configure/data.yaml -m configure/tfnet.yaml --mode train -n 5
python main.py -d configure/data.yaml -m configure/tfnet.yaml --mode eval -n 5 # evaluate on test set
python main.py -d configure/data.yaml -m configure/tfnet.yaml --mode predict -n 5 # predict on independent data set
python main.py -d configure/data.yaml -m configure/tfnet.yaml --mode 5cv # 5-fold cross-validation
python main.py -d configure/data.yaml -m configure/tfnet.yaml --mode lomo # leave-one-dataset-out cross-validation

# continue training
python main.py -d configure/data.yaml -m configure/tfnet.yaml --mode train -n 5 -c

# Split a large training dataset into smaller files for training

# train_prefix in data.yaml
gunzip -c data_train.txt.gz > data_train.txt
split -l n -d -a 2 data_train.txt data_train_mini_ # replace n with the number of lines per file, e.g. 230000
gzip data_train_mini_*
# train
python main_split.py -d configure/data.yaml -m configure/tfnet.yaml --mode train -n 5
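The gunzip, split and gzip steps can also be done in one pass without writing the uncompressed file to disk. The following is a minimal sketch with a hypothetical helper `split_gzip_lines`; it mimics the numeric two-digit suffixes of `split -d -a 2` and is not part of the repository.

```python
import gzip
import itertools
import os
import tempfile


def split_gzip_lines(src, prefix, lines_per_file):
    """Write consecutive chunks of `src` to <prefix>00.gz, <prefix>01.gz, ..."""
    out_paths = []
    with gzip.open(src, "rt") as fin:
        for index in itertools.count():
            # Read at most `lines_per_file` lines; an empty chunk means end of file.
            chunk = list(itertools.islice(fin, lines_per_file))
            if not chunk:
                break
            out_path = f"{prefix}{index:02d}.gz"
            with gzip.open(out_path, "wt") as fout:
                fout.writelines(chunk)
            out_paths.append(out_path)
    return out_paths


# Small self-contained demo on a temporary 5-line file, 2 lines per chunk.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data_train.txt.gz")
    with gzip.open(src, "wt") as f:
        f.writelines(f"line{i}\n" for i in range(5))
    parts = split_gzip_lines(src, os.path.join(tmp, "data_train_mini_"), 2)
    names = [os.path.basename(p) for p in parts]
    with gzip.open(parts[-1], "rt") as f:
        last_chunk = f.read()
print(names)
```

For the real dataset, call it with the path to data_train.txt.gz, the prefix data_train_mini_ and the chosen number of lines per file. Each chunk is held in memory while it is written.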

TFNet evaluation

Use test_list and bigwig_file_list in data.yaml together with the eval_list mode to run batch evaluation on different datasets:
test_list: [ 'data/tf_chip/H1_test/data_train_mini2_an.gz', 'data/tf_chip/H1_test/data_train_mini2_ao.gz']
bigwig_file_list: [['./data/atac/grch38/H1.bigWig'], ['./data/atac/grch38/H1.bigWig']]

python main.py -d configure/data.yaml -m configure/tfnet3.yaml --mode eval_list -n 1
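Judging from the example above, each entry of test_list appears to be paired by position with one list of bigWig files in bigwig_file_list. That pairing is an inference from the example, not a documented rule. Under that assumption, a quick sanity check before launching a batch evaluation could look like this (the helper name is hypothetical):

```python
def pair_eval_inputs(test_list, bigwig_file_list):
    """Pair each test file with its bigWig list, assuming positional matching."""
    if len(test_list) != len(bigwig_file_list):
        raise ValueError(
            f"test_list has {len(test_list)} entries but "
            f"bigwig_file_list has {len(bigwig_file_list)}"
        )
    for i, bigwigs in enumerate(bigwig_file_list):
        if not isinstance(bigwigs, list) or not bigwigs:
            raise ValueError(f"bigwig_file_list[{i}] must be a non-empty list")
    return list(zip(test_list, bigwig_file_list))


# Values copied from the data.yaml example above.
test_list = [
    "data/tf_chip/H1_test/data_train_mini2_an.gz",
    "data/tf_chip/H1_test/data_train_mini2_ao.gz",
]
bigwig_file_list = [["./data/atac/grch38/H1.bigWig"], ["./data/atac/grch38/H1.bigWig"]]

pairs = pair_eval_inputs(test_list, bigwig_file_list)
for test_file, bigwigs in pairs:
    print(test_file, "->", bigwigs)
```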

cmf1997/tfnet

Folders and files

Latest commit

History

Repository files navigation

TFNet

Workflow

Genomic sequence data

for preprocessing Chromatin accessibility data

for preprocessing chip-seq data

adjust model config file to select Model structure and set classweights

for TFNet Training and Testing

for TFNet Eval

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages