The DMFF project aims to implement organic molecular force fields using a differentiable programming framework, so that derivatives with respect to atomic positions, box shape, and force field parameters can be computed easily. It contains several modules, each dealing with a different type of force field term. Currently, there are two primary modules:
- ADMP (Automatic Differentiable Multipolar Polarizable Potential) module

  ADMP mainly deals with multipolar polarizable models. Its core function is very similar to the MPID plugin in OpenMM, implementing PME calculators for multipolar polarizable electrostatic interactions and long-range dispersion interactions (of the form $c_i c_j/r^p$). It also provides a user-defined real-space pairwise interaction calculator based on a cutoff scheme.
- Classical module

  The classical module implements conventional (AMBER- and OPLS-like) force fields. For long-range interactions, it invokes the ADMP PME kernel, but wraps it in a more "classical" way. It also incorporates the classical intramolecular terms: bonds, angles, proper and improper dihedrals, etc.
All interactions involved in DMFF are briefly introduced below. Users are encouraged to read the references for more mathematical details.
The electrostatic interaction between two atoms can be described using a multipole expansion, in which the electron cloud of an atom is expanded as a series of multipole moments: charges, dipoles, quadrupoles, octupoles, etc. If only the charges (the zeroth-order moments) are considered, the expansion reduces to the point charge model used in classical force fields:
where
More complex (and supposedly more accurate) force fields can be obtained by including higher-order multipoles. Some force fields, such as MPID, go as high as octupoles. Currently in DMFF, we support up to quadrupoles:
where
The
Unlike charges, the definition of multipole moments depends on the coordinate system: the components of the moment tensor change when the coordinate system is rotated. There are three types of frames involved in DMFF, each used in a different scenario:
- Global frame: the coordinate system bound to the simulation box. It is the same for all atoms. We use this frame to calculate the charge density structure factor $S(\vec{k})$ in reciprocal space.
- Local frame: this frame is defined differently on each atom, determined by the positions of its peripheral atoms. Normally, atomic multipole moments are most stable in the local frame, so it is the most suitable frame for force field input. In the DMFF API, the local frames are defined in the same way as in the AMOEBA plugin in OpenMM. The details can be found in the following references:
  - OpenMM forcefield.py, line 4894~4933
  - J. Chem. Theory Comput. 2013, 9, 9, 4046–4063
- Quasi-internal frame, a.k.a. QI frame: this frame is defined for each pair of interaction sites, with the z-axis pointing from one site to the other. In this frame, the real-space interaction tensor ($T_{tu}^{AB}$) is greatly simplified by symmetry. We thus use this frame in the real-space calculation of PME.
DMFF supports polarizable force fields, in which the dipole moment of the atom can respond to the change of the external electric field. In practice, each atom has not only permanent multipoles
Other damping functions between multipole moments can be found in Ref 6, table I.
Note that the default atomic damping parameter is controlled by the `dmff.admp.pme.DEFAULT_THOLE_WIDTH` variable, which is set to 5.0 by default.
We solve
The last two terms are related to
where the off-diagonal term of
In the current version, we temporarily assume that the polarizability is spherically symmetric, so the three inputs (`polarizabilityXX`, `polarizabilityYY`, `polarizabilityZZ`) in the xml API are averaged internally. In the future, it is relatively simple to relax this restriction: simply change the reciprocal of the polarizability to the inverse of the matrix when calculating the diagonal terms of the
In ADMP, we assume that the following expansion is used for the long-range dispersion interaction:
where the dispersion coefficients are determined by the following combination rule:
Note that the dispersion terms should be consecutive even powers according to the perturbation theory, so the odd dispersion terms are not supported in ADMP.
In ADMP, this long-range dispersion is computed using PME (vide infra), just like the electrostatic terms.
In the classical module, dispersion is treated as a short-range interaction using a standard cutoff scheme.
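
As a point of reference for what the dispersion PME calculator approximates, a direct, non-periodic pair sum over the $c_i c_j/r^p$ form can be sketched in plain Python. The attractive sign, the choice of powers 6, 8, and 10, and the `coeffs` layout are assumptions made for illustration; this is not how ADMP evaluates the term.

```python
import math

def dispersion_energy(positions, coeffs, powers=(6, 8, 10)):
    """Direct pair sum of -c_i * c_j / r^p over unique pairs and powers p.

    positions: list of (x, y, z) tuples.
    coeffs: dict mapping each power p to a list of per-atom coefficients c_i
            (hypothetical layout, chosen for this sketch only).
    """
    energy = 0.0
    n_atoms = len(positions)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            r = math.dist(positions[i], positions[j])
            for p in powers:
                # Pairwise coefficient is the product of atomic coefficients.
                energy -= coeffs[p][i] * coeffs[p][j] / r**p
    return energy
```

Because the pair coefficient is a product of per-atom coefficients, the sum has the same structure as a charge-charge sum, which is what makes a PME treatment possible.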
The long-range potential includes electrostatic, polarization, and dispersion (in ADMP) interactions. Taking charge-charge interaction as example, the interaction decays in the form of
In PME, the interaction tensor is split into a short-range part and a long-range part, which are handled in real space and reciprocal space, respectively. For example, the Coulomb interaction is decomposed as:
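
In the standard Ewald form, with $\kappa$ the splitting parameter listed among the key PME parameters, this decomposition reads:

$$ \frac{1}{r} = \frac{\mathrm{erfc}(\kappa r)}{r} + \frac{\mathrm{erf}(\kappa r)}{r} $$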
The first term is a short-range term, which can be calculated directly in real space using a simple distance cutoff. The second term is a long-range term, which is calculated in reciprocal space using the fast Fourier transform (FFT). The total energy of the charge-charge interaction is computed as:
As for multipolar PME and dispersion PME, the users and developers are referred to Ref 2, 3, and 5 for mathematical details.
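
The effect of the splitting can be checked numerically. The sketch below assumes the standard Ewald split, in which the real-space kernel is erfc(κr)/r and the reciprocal-space kernel is erf(κr)/r; the function name is hypothetical.

```python
import math

def ewald_split(r, kappa):
    """Return the (short-range, long-range) parts of the Coulomb kernel 1/r."""
    short_range = math.erfc(kappa * r) / r  # decays quickly, handled with a cutoff
    long_range = math.erf(kappa * r) / r    # smooth, handled in reciprocal space
    return short_range, long_range

# The two parts always add up to 1/r; a larger kappa makes the
# short-range part negligible at a smaller distance.
for kappa in (1.0, 3.0):
    short, long_ = ewald_split(1.0, kappa)
    print(kappa, short, long_, short + long_)
```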
The key parameters in PME include:
- $\kappa$: controls the separation between the long-range and the short-range parts. The larger $\kappa$ is, the faster the real-space energy decays and the smaller the real-space cutoff can be, but the harder it is to converge the reciprocal-space energy and the larger the $K_{max}$ required;
- $r_{c}$: cutoff distance in real space;
- $K_{max}$: controls the maximum number of k-points in each of the three dimensions.
In DMFF, we determine these parameters in the same way as in OpenMM:
where the user needs to specify the cutoff distance
In the current version, the dispersion PME calculator uses the same parameters as in electrostatic PME.
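
The OpenMM-style parameter choice can be sketched with the expressions documented in the OpenMM user guide for PME: $\kappa = \sqrt{-\ln(2\delta)}/r_c$ and $K \geq 2\kappa d/(3\delta^{1/5})$, where $\delta$ is the error tolerance and $d$ is the box length along one dimension. The function name, the argument names, and the rounding up to the next integer are assumptions of this sketch, not the DMFF API.

```python
import math

def pme_parameters(r_cut, box_lengths, tol):
    """Estimate kappa and the per-dimension mesh size from cutoff and tolerance.

    r_cut: real-space cutoff distance.
    box_lengths: box length along each of the three dimensions (same unit as r_cut).
    tol: dimensionless error tolerance (delta in the OpenMM user guide).
    """
    kappa = math.sqrt(-math.log(2.0 * tol)) / r_cut
    # One mesh size per box dimension, rounded up to an integer.
    k_max = [math.ceil(2.0 * kappa * d / (3.0 * tol ** 0.2)) for d in box_lengths]
    return kappa, k_max

kappa, k_max = pme_parameters(r_cut=1.0, box_lengths=(3.0, 3.0, 3.0), tol=5e-4)
print(kappa, k_max)
```

A tighter tolerance increases both $\kappa$ and the mesh size, which reflects the trade-off between real-space and reciprocal-space cost.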
Short-range pair interaction refers to all interactions with the following form:
Some common short-range pair interactions include:
- The repulsive part of the Buckingham or the Lennard-Jones potential:
- Tang-Toennies damping: a damping function for short-range electrostatic and dispersion energies. $$ f_n(r, \beta) = 1 - e^{-\beta r}\sum_{k=0}^n {\frac{(\beta r)^k}{k!}} $$
In ADMP, the user can define a pairwise kernel function and use `generate_pairwise_interaction` to raise the kernel function into an energy calculator (see details in the ADMP manual).
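
To make the idea of a pairwise kernel concrete, the Tang-Toennies damping function above can be written as a plain Python function of the pair distance. This is a stand-alone sketch: the function names are hypothetical, the damped $-C_6/r^6$ kernel is only an illustrative use, and the argument convention expected by `generate_pairwise_interaction` is not shown here.

```python
import math

def tang_toennies_damping(r, beta, n):
    """f_n(r, beta) = 1 - exp(-beta*r) * sum_{k=0}^{n} (beta*r)^k / k!"""
    x = beta * r
    partial_sum = sum(x**k / math.factorial(k) for k in range(n + 1))
    return 1.0 - math.exp(-x) * partial_sum

def damped_c6_kernel(r, c6_ij, beta):
    """Illustrative pairwise kernel: dispersion term -C6/r^6 damped by f_6."""
    return -tang_toennies_damping(r, beta, 6) * c6_ij / r**6
```

The damping function goes to 0 at short range and to 1 at long range, so the damped kernel stays finite as the two sites approach each other.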
For most traditional force fields, pairwise parameters between interacting particles are determined by atomic parameters. This mathematical relationship is called the combination rule. For example, in the calculation of LJ potential, the following combination rule may be used:
In ADMP module, we do not make any assumptions about the specific mathematical forms of the combination rule and
All DMFF real-space calculations depend on a neighbor list (or "pair list", as we sometimes call it in DMFF). Its purpose is to keep a record of all the "neighbors" within a certain distance of each central atom, which avoids looping over all pairs explicitly.
In DMFF, we use external code (jax-md) to build the neighbor list. An input argument named `pairs` is required in all real-space calculators; it contains the indices of all "interacting pairs" (i.e., pairs within a certain distance). The `pairs` variable is in the ordered sparse format of jax-md. That is, a
Since the pair list only provides atom id information, it does not take part in the differentiation process, so it can be fed in as a normal numpy array (instead of a jax numpy array).
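
The role of the pair list can be illustrated with a brute-force builder in plain Python. This sketch is not jax-md: it ignores periodic boundary conditions and padding, and the layout of one `(i, j)` index pair per entry with `i < j` is an assumption made for illustration.

```python
import math

def build_pairs(positions, r_cut):
    """Return all index pairs (i, j), i < j, closer than r_cut (no periodicity)."""
    pairs = []
    n_atoms = len(positions)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            if math.dist(positions[i], positions[j]) < r_cut:
                pairs.append((i, j))
    return pairs

positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
pairs = build_pairs(positions, r_cut=2.0)
print(pairs)
```

A real-space calculator then only loops over `pairs`; since the list holds integer indices, it carries no gradient information.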
In order to avoid double-counting with the bonding terms, we often need to scale the nonbonding interactions between two atoms that are topologically connected. The scaling factor depends on the topological distance between the two atoms. We define two atoms separated by one bond as a "1-2" interaction, those separated by two bonds as a "1-3" interaction, and so on. For example, in the OPLS-AA force field, all "1-2" and "1-3" nonbonding interactions are turned off completely, while "1-4" nonbonding interactions are scaled by 50%. DMFF supports this feature, and the important variables related to topological scaling include:
- `covalent_Map`: an $N\times N$ matrix, which defines the topological spacings between atoms. If a matrix element is 0, the topological distance between the two atoms is too large (or the two atoms are not connected), so the nonbonding interaction between them is fully turned on.
- `mScales`: the list of scaling factors. The first element is the scaling factor for all 1-2 nonbonding interactions, the second element is the scaling factor for 1-3 interactions, and so on. The list can be of any length, but its last element must be 1, which represents the complete, unscaled nonbonding interaction.
- `pScales`/`dScales`: similar to `mScales`, but only related to polarizable calculations. They are scaling factors for induced-perm and induced-induced interactions, respectively.
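
How these variables work together can be sketched in plain Python. The helper names are hypothetical and the scaling values are illustrative; the sketch builds a topological-distance matrix by breadth-first search over the bond graph and then looks up the scaling factor, treating a 0 entry as "fully turned on" as described above.

```python
from collections import deque

def build_covalent_map(n_atoms, bonds, max_sep):
    """Matrix of bond separations (1 = bonded), 0 if farther than max_sep bonds."""
    neighbors = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    cov = [[0] * n_atoms for _ in range(n_atoms)]
    for start in range(n_atoms):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            if dist[u] == max_sep:
                continue  # do not search beyond the largest separation of interest
            for v in neighbors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    cov[start][v] = dist[v]
                    queue.append(v)
    return cov

def scale_factor(cov, m_scales, i, j):
    """Scaling for pair (i, j): m_scales[sep - 1], or 1.0 when the entry is 0."""
    sep = cov[i][j]
    if sep == 0 or sep > len(m_scales):
        return 1.0
    return m_scales[sep - 1]

# Linear chain of six atoms with illustrative scaling factors.
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
m_scales = [0.0, 0.0, 0.5, 1.0]
cov = build_covalent_map(6, bonds, max_sep=len(m_scales))
print([scale_factor(cov, m_scales, 0, j) for j in range(1, 6)])
```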
TODO:
Intramolecular bonding interactions refer to all interactions that depend on internal coordinates (IC), such as bonds, angles, and dihedrals, etc.
- Harmonic Bonding Terms

  The definition of the bonding term in DMFF is the same as in OpenMM. For each bond, we have: $$ E=\frac{1}{2}k(x-x_0)^2 $$ Note the prefactor $1/2$ before the force constant.
- Harmonic Angle Terms

  For each angle, we have: $$ E=\frac{1}{2} k\left(\theta-\theta_{0}\right)^{2} $$
- Dihedral Terms
  - Proper dihedral
  - Improper dihedral
- Multi IC coupling term
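
The harmonic bond and angle terms listed above can be evaluated directly from Cartesian coordinates. The following is a minimal sketch in plain Python with hypothetical function names, not the DMFF implementation; it keeps the $1/2$ prefactor of the formulas and takes the angle in radians.

```python
import math

def harmonic_bond_energy(pos_i, pos_j, k, x0):
    """E = 1/2 * k * (x - x0)^2, with x the distance between atoms i and j."""
    x = math.dist(pos_i, pos_j)
    return 0.5 * k * (x - x0) ** 2

def harmonic_angle_energy(pos_i, pos_j, pos_k, k, theta0):
    """E = 1/2 * k * (theta - theta0)^2, with theta the i-j-k angle in radians."""
    v1 = [a - b for a, b in zip(pos_i, pos_j)]  # vector from central atom j to i
    v2 = [a - b for a, b in zip(pos_k, pos_j)]  # vector from central atom j to k
    dot = sum(a * b for a, b in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp against rounding errors before taking the arccosine.
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return 0.5 * k * (theta - theta0) ** 2
```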
Before energy calculation, atomic and IC parameters (such as charge, multipole moment, dispersion coefficient, polarizability, force constant of each bond and angle, etc.) need to be assigned first.
Generally, these parameters should depend on the chemical and geometric environment of each atom and IC. However, in conventional force fields, in order to reduce the number of parameters, atoms and ICs are classified according to their topological environment, and atoms/ICs in the same class share parameters. The process of classifying each atom and IC and assigning the corresponding parameters according to their class is called typification.
In DMFF, the input parameters that need to be optimized are called force field parameters, and the parameters of each atom and IC after typification are called atomic parameters. Note that in an ideal force field, if we could directly predict atomic parameters using a machine learning model, typification would not be necessary. Therefore, in DMFF, we decouple the typification code from the computation kernels, so that the core calculators based on atomic parameters have their own API and can be invoked independently. The typification code, in combination with the xml/pdb input parsers, composes the high-level API (the `dmff.api` module).
The design of the high-level DMFF API is based on the existing framework of OpenMM. DMFF needs to keep the derivative chain uninterrupted when dispatching the force field parameters into atomic parameters. Therefore, while maintaining the basic design logic of OpenMM, we rewrite the typification part of OpenMM using Jax. Briefly speaking, OpenMM/DMFF requires the users to clearly define the type of each atom in each residue and the connection mode between atoms in residue templates. Then the residue templates are used to match the PDB file to typify the whole system. See the following documents for details.
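
At its core, typification is a lookup from per-class force field parameters to per-atom parameters. The sketch below shows only this dispatch step in plain Python, with a hypothetical function name and illustrative class labels and charges; DMFF performs the equivalent step in Jax so that gradients flow back to the force field parameters.

```python
def typify(atom_classes, class_params):
    """Dispatch per-class force field parameters into per-atom parameters."""
    return [class_params[atom_class] for atom_class in atom_classes]

# Illustrative water molecule: one oxygen class and one hydrogen class.
atom_classes = ["OW", "HW", "HW"]
class_charges = {"OW": -0.834, "HW": 0.417}
atomic_charges = typify(atom_classes, class_charges)
print(atomic_charges)
```

Both hydrogens share one force field parameter, so optimizing the `"HW"` entry changes the atomic parameter of every atom in that class.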