Work in progress!
EpiJinn is a Python package for working with modified (methylated) nucleotides.
- Create a readable report from bedMethyl files created by modkit.
- Annotate prokaryotic DNA methylase enzyme recognition sites in a Biopython SeqRecord.
- Check whether recognition sites of prokaryotic DNA methylase enzymes overlap with a recognition site of a restriction enzyme, in a DNA sequence. Methylation of restriction site nucleotides blocks recognition/restriction and thus DNA assembly.
The software is geared towards working with plasmids. Further functionality is planned, including comparing observed methylation status with expected methylation levels and recognising methylase sites.
pip install git+https://github.com/Edinburgh-Genome-Foundry/EpiJinn.git
See the additional install instructions for the PDF Reports dependency and its WeasyPrint dependency.
import epijinn
bedmethylitemgroup = epijinn.read_sample_sheet(
    sample_sheet="sample_sheet.csv",
    genbank_dir="genbank",
    bedmethyl_dir="bedmethyl",
    parameter_sheet="param_sheet.csv",
)
bedmethylitemgroup.perform_all_analysis_in_bedmethylitemgroup()
epijinn.write_bedmethylitemgroup_report(
    bedmethylitemgroup=bedmethylitemgroup,
    pdf_file="report.pdf",
    html_file="report.html",
)
Both pdf_file and html_file are optional; specify None to exclude either of them.
An example sample sheet and an example parameter sheet are included in the examples directory.
Note that multiple methylase enzymes (separated by space) can be specified in the parameter sheet.
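For orientation, the files in bedmethyl_dir are bedMethyl tables as written by modkit. The sketch below is not part of EpiJinn; it only illustrates how one such line could be read with the standard library. The column positions (modification code in column 4, strand in column 6, valid coverage in column 10, percent modified in column 11) are an assumption based on the modkit pileup output format, the names BedMethylRecord and parse_bedmethyl_line are hypothetical, and the example line contains made-up values.

```python
from typing import NamedTuple


class BedMethylRecord(NamedTuple):
    """A few fields of one bedMethyl line (hypothetical helper, not EpiJinn API)."""
    chrom: str
    start: int
    end: int
    mod_code: str
    strand: str
    valid_coverage: int
    percent_modified: float


def parse_bedmethyl_line(line):
    # Split on any whitespace so that tab- and space-separated columns both work.
    fields = line.split()
    return BedMethylRecord(
        chrom=fields[0],
        start=int(fields[1]),
        end=int(fields[2]),
        mod_code=fields[3],
        strand=fields[5],
        valid_coverage=int(fields[9]),       # assumed column 10
        percent_modified=float(fields[10]),  # assumed column 11
    )


# Made-up example line, for illustration only.
example = "plasmid_1\t120\t121\tm\t50\t+\t120\t121\t255,0,0\t50\t90.00\t45\t5\t0\t0\t0\t0\t0"
record = parse_bedmethyl_line(example)
print(record)
```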
The module contains the Methylator class for storing a sequence, methylation enzymes and a restriction enzyme recognition site. It has a method for finding overlaps, and uses DNA Chisel to find sequence matches.
An example overlap:
...ccgcatgaagggcgcgccaggtctcaccctgaattcgcg...
                      ggtctc : BsaI restriction site
                   CCAGGTCTCACC : Match in positive strand
                   CCWGG : Dcm methylation site
                    * : methylated cytosine
                      * : methylated cytosine (on other strand)
For information on the effect of DNA methylation on each enzyme, see the Restriction Enzyme Database.
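The overlap shown above can be reproduced with a small standalone sketch. This is not how EpiJinn does it (the package uses DNA Chisel for matching); it is a minimal illustration using regular expressions, and the helper names find_sites and overlapping are hypothetical. It searches the top strand only, so the reverse complement of a non-palindromic site such as BsaI would need a second pass.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def find_sites(pattern, sequence):
    """Return 0-based (start, end) spans of all matches, overlapping ones included."""
    # The lookahead lets the regex engine report matches that overlap each other.
    regex = "(?=(" + "".join(IUPAC[base] for base in pattern.upper()) + "))"
    return [(m.start(1), m.end(1)) for m in re.finditer(regex, sequence.upper())]


def overlapping(spans_a, spans_b):
    """Return every pair of spans that share at least one position."""
    return [(a, b) for a in spans_a for b in spans_b if a[0] < b[1] and b[0] < a[1]]


sequence = "ccgcatgaagggcgcgccaggtctcaccctgaattcgcg"
bsai_sites = find_sites("GGTCTC", sequence)  # BsaI recognition site
dcm_sites = find_sites("CCWGG", sequence)    # Dcm methylation site
print(overlapping(dcm_sites, bsai_sites))
```

The Dcm span and the BsaI span share two positions, the GG at the start of ggtctc. The first of these pairs with the cytosine that Dcm methylates on the other strand.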
import epijinn
methylator = epijinn.Methylator(sequence=str(sequence), site=site_BsaI)
methylator.find_methylation_sites_in_pattern()
import epijinn
from Bio.Restriction.Restriction_Dictionary import rest_dict
# seq + EcoBI (+ BsmBI +) EcoBI + seq
sequence = 'ATGTCCCCATGCCTAC' + 'AGCAAGGC' + 'CGTCTC' + 'A' + 'GGCCCCCCCCCCCCA'
site_BsmBI = rest_dict['BsmBI']['site']
epijinn.EcoBI.sequence
# 'TGANNNNNNNNTGCT'
methylator = epijinn.Methylator(sequence, site=site_BsmBI)
methylator.find_methylation_sites_in_pattern()
print(methylator.report)
Result:
Matches against methylase enzyme sites:
EcoKDam
=======
Region: 22-32(+)
Positive strand: -
Negative strand: -
EcoKDcm
=======
Region: 21-33(+)
Positive strand: -
Negative strand: -
EcoBI
=====
Region: 13-42(+)
Positive strand: -
Match in negative strand: TACAGCAAATCCGTCTCAGGCCCCCCCCC
EcoKI
=====
Region: 14-41(+)
Positive strand: -
Negative strand: -
The same approach can be used for finding enzyme site overlaps with other epigenetic modifications. For example, in DNA phosphorothioation, an oxygen on the DNA backbone is replaced with sulfur.
thio = epijinn.Methylator(sequence, site=site_BsmBI, methylases=epijinn.DND)
thio.find_methylation_sites_in_pattern()
This returns an overlap with a putative dnd target site of Streptomyces lividans 1326 with conserved sequence GGCC:
Dnd_Sli1326
===========
Region: 21-33(+)
Match in positive strand: GGCCGTCTCAGG
Match in negative strand: GGCCGTCTCAGG
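The reports above distinguish matches on the positive and negative strands. As a rough illustration of what a negative-strand match means, the sketch below searches the top strand for the reverse complement of an IUPAC pattern. It is a simplified stand-in, not EpiJinn's or DNA Chisel's implementation, and the helper names are hypothetical. The EcoBI pattern is the one printed by epijinn.EcoBI.sequence earlier, and CGTCTC is the BsmBI site embedded in the example sequence.

```python
import re

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}
# Complement table that also covers the ambiguity codes used above.
COMPLEMENT = str.maketrans("ACGTWSRYKMN", "TGCAWSYRMKN")


def reverse_complement(pattern):
    return pattern.upper().translate(COMPLEMENT)[::-1]


def find_sites_both_strands(pattern, sequence):
    """Return (start, end, strand) for matches of the pattern on either strand.

    Coordinates are 0-based positions on the top strand. A palindromic
    pattern is reported once per strand.
    """
    hits = []
    for strand, pat in (("+", pattern.upper()), ("-", reverse_complement(pattern))):
        regex = "(?=(" + "".join(IUPAC[base] for base in pat) + "))"
        hits += [(m.start(1), m.end(1), strand)
                 for m in re.finditer(regex, sequence.upper())]
    return hits


sequence = "ATGTCCCCATGCCTAC" + "AGCAAGGC" + "CGTCTC" + "A" + "GGCCCCCCCCCCCCA"
ecobi_hits = find_sites_both_strands("TGANNNNNNNNTGCT", sequence)
bsmbi_hits = find_sites_both_strands("CGTCTC", sequence)
print(reverse_complement("TGANNNNNNNNTGCT"))
print(ecobi_hits)
print(bsmbi_hits)
```

In this sequence the EcoBI pattern matches only on the negative strand, and its span contains the BsmBI site on the positive strand.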
EpiJinn uses the semantic versioning scheme.
Copyright 2024 Edinburgh Genome Foundry, University of Edinburgh
EpiJinn was written at the Edinburgh Genome Foundry by Peter Vegh, and is released under the GPLv3 license.