A Python Workflow for the Generation and Analysis of Protein-Ligand Interaction Fingerprints from Molecular Dynamics trajectories
The IFP analysis of dissociation trajectories for three HSP90 compounds, reported in the paper
D. B. Kokh, B. Doser, S. Richter, F. Ormersbach, X. Cheng, R. C. Wade, "A Workflow for Exploring Ligand Dissociation from a Macromolecule: Efficient Random Acceleration Molecular Dynamics Simulation and Interaction Fingerprints Analysis of Ligand Trajectories", J. Chem. Phys. 153, 125102 (2020), doi: 10.1063/5.0019088; https://arxiv.org/abs/2006.11066
is implemented in IFP_generation_examples_Analysis.ipynb
https://zenodo.org/record/3981155#.XzQEUCgzaUk
- YouTube lecture/tutorial for the 2020 MolSSI School on Open Source Software in Rare Event Path Sampling Strategies: "tauRAMD workflow: fast estimation of protein-ligand residence times with insights into dissociation mechanisms": https://www.youtube.com/watch?v=kCUyQtoo4cE&feature=youtu.be
- Daria Kokh
- Fabian Ormersbach - preprocessing PDB files using Chimera (Process_pdb.py, chimera_hydrogen_mol2.py; test examples revised)
Heidelberg Institute for Theoretical Studies (HITS, www.h-its.org)
Schloss-Wolfsbrunnenweg 35
69118 Heidelberg, Germany
This open source software code was developed in part in the Human Brain Project, funded by the European Union’s Horizon 2020 Framework Programme for Research and Innovation under Specific Grant Agreement No. 785907 (Human Brain Project SGA2).
Python 3.x
Python Libraries:
- numpy; pandas; matplotlib; seaborn; sklearn; scipy;
- RDKit
- nglview - used for visualization (installing nglview can be tricky; the following may work: after setting up the Python environment, run conda install -c conda-forge nglview=2.7.1 and then jupyter-nbextension enable nglview --py --sys-prefix). If you don't need visualization, you can skip this, but the Jupyter notebooks (JN) must be edited accordingly
- MDAnalysis version 0.20.1 (important: an old module for H-bond analysis is currently used; it will be removed in MDAnalysis version 2.0)
Chimera - only for the scripts used for preprocessing PDB files (structure protonation and generation of the ligand mol2 file); not required if the protonated structure and the mol2 file have already been prepared by the user
The code was written in Python 3.x and tested on Python 3.7
To configure the environment in Anaconda, use: conda env create -f MD-IFP.yml
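After the environment is created, it can help to confirm that the listed libraries import and that the installed MDAnalysis is older than 2.0, since the old H-bond module is needed. The following is only a minimal sketch: the helper names `check_packages` and `major_version` are hypothetical and not part of this repository, and the import names are assumed from the list above.

```python
import importlib

# Import names assumed from the requirements list above.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn",
            "scipy", "rdkit", "nglview", "MDAnalysis"]


def check_packages(names):
    """Return {name: version string, or None if the import fails}."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A missing or broken install can raise more than ImportError.
            found[name] = None
            continue
        found[name] = getattr(module, "__version__", "unknown")
    return found


def major_version(version):
    """Leading integer of a version string such as '0.20.1', or None."""
    head = version.split(".")[0]
    return int(head) if head.isdigit() else None


if __name__ == "__main__":
    report = check_packages(REQUIRED)
    for name, version in report.items():
        print(f"{name:12s} {version or 'MISSING'}")
    mda = report.get("MDAnalysis")
    if mda and (major_version(mda) or 0) >= 2:
        print("Warning: MDAnalysis >= 2.0 no longer ships the old H-bond module")
```

Packages reported as MISSING can then be installed into the same conda environment before opening the notebooks.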
Trajectories.py - functions for building a trajectory object for reading and analysis of standard MD and RAMD trajectories and computation of relative residence times
IFP_generation.py - functions for generation of IFPs
Clustering.py - functions for analysis of trajectories using IFP data (still under development)
Process_pdb.py - preprocessing PDB files (splitting into ligand and protein files)
chimera_hydrogen_mol2.py - generation of ligand mol2 file
- IFP.py - generation of the IFP database for a single MD trajectory of a protein-ligand complex
- IFP_contacts_quickView.py - generation of a plot with average IFPs extracted from a trajectory
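Trajectories.py is described above as computing relative residence times from RAMD trajectories. One common way to turn a set of RAMD dissociation times into such an estimate is to bootstrap the time at which half of the trajectories have dissociated. The sketch below illustrates that idea only; the function name, the 80% resampling fraction, and the number of rounds are assumptions, and it is not claimed to reproduce what Trajectories.py does.

```python
import random
import statistics


def bootstrap_residence_time(dissociation_times, rounds=1000, fraction=0.8, seed=0):
    """Bootstrap the median dissociation time.

    Returns (mean, standard deviation) of the medians obtained from
    `rounds` resamples, each drawing `fraction` of the data with replacement.
    """
    times = list(dissociation_times)
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    size = max(1, int(round(fraction * len(times))))
    medians = []
    for _ in range(rounds):
        sample = [rng.choice(times) for _ in range(size)]
        medians.append(statistics.median(sample))
    return statistics.mean(medians), statistics.pstdev(medians)


if __name__ == "__main__":
    # Hypothetical dissociation times (ns) from ten RAMD replicas.
    example = [0.8, 1.1, 1.3, 0.9, 2.0, 1.6, 1.2, 0.7, 1.5, 1.0]
    tau, sd = bootstrap_residence_time(example)
    print(f"relative residence time: {tau:.2f} +/- {sd:.2f} ns")
```

Because RAMD applies an additional random force, times obtained this way are only meaningful relative to other ligands simulated with the same settings.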
can be downloaded from https://zenodo.org/record/3981155#.XzQEUCgzaUk
Protein-Ligand Interaction Fingerprint (IFP) computations (only functions of IFP_generation.py are used) for:
- a single structure prepared for MD simulations (HSP90; PDB ID 6EI5, dcd format)
- a trajectory (for selected frames; dcd format)
- a PDB structure
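To make the idea of an interaction fingerprint concrete, the sketch below builds a purely distance-based contact fingerprint for one frame and averages it over several frames. It is a simplified illustration: it does not use the functions of IFP_generation.py and does not distinguish interaction types such as hydrogen bonds or aromatic contacts. The function names, the 4.0 Å cutoff, and the residue labels are hypothetical.

```python
import math


def _distance(a, b):
    """Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def contact_fingerprint(ligand_atoms, residues, cutoff=4.0):
    """One bit per residue: 1 if any residue atom lies within `cutoff`
    of any ligand atom, else 0.

    ligand_atoms: list of (x, y, z) tuples
    residues: dict mapping a residue label to a list of (x, y, z) tuples
    """
    bits = {}
    for label, atoms in residues.items():
        close = any(_distance(a, b) <= cutoff
                    for a in atoms for b in ligand_atoms)
        bits[label] = int(close)
    return bits


def average_fingerprint(fingerprints):
    """Fraction of frames in which each bit is set."""
    n = len(fingerprints)
    labels = fingerprints[0].keys()
    return {label: sum(fp[label] for fp in fingerprints) / n for label in labels}


if __name__ == "__main__":
    ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    residues = {"RES_A": [(3.0, 0.0, 0.0)], "RES_B": [(10.0, 0.0, 0.0)]}
    print(contact_fingerprint(ligand, residues))
```

Applying `contact_fingerprint` to each selected frame and passing the results to `average_fingerprint` gives a per-residue contact frequency over the trajectory.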
Generation and analysis of IFPs for conventional MD simulations and for RAMD trajectories for Muscarinic Receptor M2 in a membrane. In this example, Trajectories.py is used for pre-processing trajectories and IFP_generation.py is used for computing IFPs
- Computing IFPs for a single equilibration trajectory (dcd format)
- Computing IFPs for a set of trajectories: system equilibration and ligand dissociation (RAMD) trajectories (dcd format)
Illustration of PL IFP variation in one of the dissociation trajectories of iperoxo bound to muscarinic receptor M2.
This example shows how RAMD dissociation trajectories can be analyzed using pre-generated IFP databases
This plot illustrates ligand dissociation pathways as a graph derived from clustering ligand trajectories in IFP space; the clusters are plotted according to the displacement of the ligand COM from its initial bound position.
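The graph representation can be sketched as follows, assuming each trajectory frame has already been assigned a cluster label in IFP space and a ligand COM displacement from the bound position. Edges count transitions between clusters in consecutive frames, and each node is characterized by its mean COM displacement. The function names and the input layout are hypothetical and are not taken from Clustering.py.

```python
from collections import Counter, defaultdict


def transition_graph(label_trajectories):
    """Count transitions between different clusters in consecutive frames.

    label_trajectories: list of per-trajectory lists of cluster labels.
    Returns a Counter mapping (source, target) to the number of transitions.
    """
    edges = Counter()
    for labels in label_trajectories:
        for src, dst in zip(labels, labels[1:]):
            if src != dst:
                edges[(src, dst)] += 1
    return edges


def mean_com_displacement(label_trajectories, com_trajectories):
    """Mean ligand COM displacement of the frames in each cluster."""
    sums = defaultdict(float)
    counts = Counter()
    for labels, displacements in zip(label_trajectories, com_trajectories):
        for label, value in zip(labels, displacements):
            sums[label] += value
            counts[label] += 1
    return {label: sums[label] / counts[label] for label in counts}


if __name__ == "__main__":
    # Hypothetical cluster labels and COM displacements (Å) for two trajectories.
    labels = [[0, 0, 1, 2], [0, 1, 1, 2]]
    coms = [[0.0, 1.0, 4.0, 10.0], [1.0, 5.0, 3.0, 12.0]]
    print(transition_graph(labels))
    print(mean_com_displacement(labels, coms))
```

Ordering the nodes by mean COM displacement places bound-state clusters on one side and dissociated clusters on the other, so the edges trace the dissociation pathways.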
Jupyter notebook (JN) designed for validation of the IFP script on 40 PDB complexes (used in the paper J. Chem. Phys. 2020)