Merge pull request #398 from choderalab/multistate
Bring multistate samplers into openmmtools
jchodera authored Feb 3, 2019
2 parents 384c555 + 5b3a36e commit 8db070e
Showing 26 changed files with 30,956 additions and 15 deletions.
21 changes: 19 additions & 2 deletions README.md
@@ -19,10 +19,27 @@ Features include:
- enhanced sampling methods, including replica-exchange (REMD) and self-adjusted mixture sampling (SAMS)
- factories for generating [alchemically-modified](http://alchemistry.org) systems for absolute and relative free energy calculations
- a suite of test systems for benchmarking, validation, and debugging
- user-friendly storage interface layer to remove requirement that user know how to store all their data-types on disk

See the [documentation](http://openmmtools.readthedocs.io) at [ReadTheDocs](http://openmmtools.readthedocs.io).

#### License

OpenMMTools is distributed under the MIT License.
OpenMMTools is distributed under the [MIT License](https://opensource.org/licenses/MIT).

#### Contributors

A complete list of contributors can be found [here](https://github.com/choderalab/openmmtools/graphs/contributors).

Major contributors include:

* Andrea Rizzi `<[email protected]>` (WCMC)
* John D. Chodera `<[email protected]>` (MSKCC)
* Levi N. Naden `<[email protected]>` (MSKCC)
* Patrick Grinaway `<[email protected]>` (MSKCC)
* Kyle A. Beauchamp `<[email protected]>` (MSKCC)
* Josh Fass `<[email protected]>` (MSKCC)
* Bas Rustenburg `<[email protected]>` (MSKCC)
* Gregory Ross `<[email protected]>` (MSKCC)
* David W.H. Swenson `<[email protected]>`
* Hannah Bruce Macdonald `<hannah.brucemacdonald>` (MSKCC)
14 changes: 10 additions & 4 deletions devtools/conda-recipe/meta.yaml
@@ -13,19 +13,25 @@ requirements:
build:
- python
- setuptools
- openmm ==7.3
- openmm >=7.3
- cython

run:
- python
- numpy
- scipy
- six
- openmm ==7.3
- openmm >=7.3
- parmed
- mdtraj
- netcdf4
- netcdf4 >=1.4.2 # after bugfix: "always return masked array by default, even if there are no masked values"
- libnetcdf >=4.6.2 # workaround for libssl issues
- pyyaml
- cython
- sphinxcontrib-bibtex
- mpiplus
- pymbar
- pyyaml


test:
requires:
2 changes: 2 additions & 0 deletions docs/conf.py
@@ -20,6 +20,7 @@
import os
import sys
sys.path.insert(0, os.path.abspath('..'))
import sphinx_rtd_theme


# -- General configuration ------------------------------------------------
@@ -40,6 +41,7 @@
'sphinx.ext.todo',
'sphinx.ext.coverage',
'sphinx.ext.viewcode',
'sphinxcontrib.bibtex',
#'sphinx.ext.githubpages'
]

8 changes: 6 additions & 2 deletions docs/environment.yml
@@ -2,16 +2,20 @@ name: openmmtools
channels:
- conda-forge
- omnia
- omnia/label/rc
dependencies:
- python
- setuptools
- openmm >=7.3
- cython
- numpy
- scipy
- six
- parmed
- mdtraj
- numpydoc
- netCDF4
- netcdf4 >=1.4.2 # after bugfix: "always return masked array by default, even if there are no masked values"
- libnetcdf >=4.6.2 # workaround for libssl issues
- sphinxcontrib-bibtex
- mpiplus
- pymbar
- pyyaml
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -56,7 +56,7 @@ Modules
states
cache
mcmc
sampling
multistate
alchemy
forces
forcefactories
170 changes: 170 additions & 0 deletions docs/multistate.rst
@@ -0,0 +1,170 @@
.. _multistate:

Sampling multiple thermodynamic states
======================================

``openmmtools`` provides several schemes for sampling from multiple thermodynamic states within a single calculation:

* ``MultiStateSampler``: Independent simulations at distinct thermodynamic states
* ``ReplicaExchangeSampler``: Replica exchange among thermodynamic states (also called Hamiltonian exchange if only the Hamiltonian is changing)
* ``SAMSSampler``: Self-adjusted mixture sampling (also known as optimally-adjusted mixture sampling)

While the thermodynamic states sampled usually differ only in the alchemical parameters, other thermodynamic parameters (such as temperature) can be modulated as well at intermediate alchemical states.
This may be useful in, for example, experimenting with ways to reduce correlation times.

In all of these schemes, one or more **replicas** are simulated.
Each iteration includes the following phases:

* Allow replicas to switch thermodynamic states (optional)
* Allow replicas to sample a new configuration using Markov chain Monte Carlo (MCMC)
* Each replica computes the potential energy of the current configuration in multiple thermodynamic states
* Data is written to disk
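
The four phases above can be sketched with a toy model. This is only an illustration under stated assumptions, not the ``openmmtools`` implementation: the thermodynamic states are hypothetical one-dimensional harmonic wells that differ in stiffness, the MCMC move is a plain Metropolis random walk, and "writing to disk" is replaced by appending to an in-memory list. All function names here are made up for the example.

```python
import math
import random


def reduced_potential(stiffness, x):
    """Hypothetical reduced potential: a harmonic well with state-dependent stiffness."""
    return 0.5 * stiffness * x * x


def mcmc_move(x, stiffness, rng, n_steps=10, step_size=0.5):
    """Propagate one replica with Metropolis random-walk steps at a fixed state."""
    for _ in range(n_steps):
        trial = x + rng.uniform(-step_size, step_size)
        delta = reduced_potential(stiffness, trial) - reduced_potential(stiffness, x)
        if delta <= 0 or rng.random() < math.exp(-delta):
            x = trial
    return x


def run(stiffnesses, n_iterations, seed=0):
    """Run the iteration loop; returns final state assignment, positions, and energy log."""
    rng = random.Random(seed)
    n = len(stiffnesses)
    replica_states = list(range(n))  # replica k starts in state k
    positions = [0.0] * n
    energy_log = []  # stands in for the on-disk storage
    for _ in range(n_iterations):
        # Phase 1 (optional): switch states. Left as the identity here, which
        # corresponds to independent simulations at fixed states.
        # Phase 2: sample a new configuration for each replica.
        for k in range(n):
            positions[k] = mcmc_move(positions[k], stiffnesses[replica_states[k]], rng)
        # Phase 3: evaluate every configuration in every thermodynamic state.
        energies = [[reduced_potential(s, x) for s in stiffnesses] for x in positions]
        # Phase 4: record the data.
        energy_log.append(energies)
    return replica_states, positions, energy_log
```

Replacing the no-op in phase 1 with a state-update rule is what distinguishes the replica exchange and SAMS schemes from independent sampling.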

Below, we describe some of the aspects of these samplers.

``MultiStateSampler``: Independent simulations at multiple thermodynamic states
-------------------------------------------------------------------------------

The ``MultiStateSampler`` allows independent simulations from multiple thermodynamic states to be sampled.
In this case, the MCMC scheme is used to propagate each replica by sampling from a fixed thermodynamic state.

.. math::

    s_{k,n+1} = s_{k, n} \\
    x_{k,n+1} \sim p(x | s_{k, n+1})

An inclusive "neighborhood" of thermodynamic states around this specified state can be used to define which thermodynamic states the reduced potential should be computed for after each iteration.
If all thermodynamic states are included in this neighborhood (the default), the MBAR scheme :cite:`Shirts2008statistically` can be used to optimally estimate free energies and uncertainties.
If a restricted neighborhood is used (in order to reduce the amount of time spent in the energy evaluation stage), a variant of the L-WHAM (local weighted histogram analysis method) :cite:`kumar1992weighted` is used to extract an estimate from all available information.

.. currentmodule:: openmmtools.multistate
.. autosummary::
    :nosignatures:
    :toctree: api/generated/

    MultiStateSampler
    MultiStateSamplerAnalyzer

``ReplicaExchangeSampler``: Replica exchange among thermodynamic states
-----------------------------------------------------------------------

The ``ReplicaExchangeSampler`` implements a Hamiltonian replica exchange scheme with Gibbs sampling :cite:`Chodera2011` to sample multiple thermodynamic states in a manner that improves mixing of the overall Markov chain.
By allowing replicas to execute a random walk in thermodynamic state space, correlation times may be reduced when sampling certain thermodynamic states (such as those with alchemically-softened potentials or elevated temperatures).

In the basic version of this scheme, a swap of configurations between two alchemical states, *i* and *j*, is proposed by evaluating the energy of each configuration in each state, and the swap is accepted according to the Metropolis criterion

.. math::

    P_{\text{accept}}(i, x_i, j, x_j) &= \min\left\{1, \frac{e^{-\left[u_i(x_j) + u_j(x_i)\right]}}{e^{-\left[u_i(x_i) + u_j(x_j)\right]}}\right\} \\
    &= \min\left\{1, \exp\left[\Delta u_{ji}(x_i) + \Delta u_{ij}(x_j)\right]\right\}

where :math:`x_i` and :math:`x_j` are the configurations of the replicas currently in states :math:`i` and :math:`j`, :math:`u_i(x)` is the reduced potential energy of configuration :math:`x` evaluated in state :math:`i`, and :math:`\Delta u_{ab}(x) \equiv u_b(x) - u_a(x)`.
While this scheme is typically carried out on neighboring states only, we also implement a much more efficient form of Gibbs sampling in which many swaps are attempted to generate an approximately uncorrelated sample of the state permutation over all :math:`K` states :cite:`Chodera2011`.
This speeds up mixing and reduces the total number of samples needed to produce uncorrelated samples.
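
As a rough sketch of the acceptance rule and of the many-swap mixing step, the following pure-Python example may help. It assumes an energy matrix ``u[r][s]`` holding the reduced potential of replica ``r``'s configuration evaluated in state ``s``; the function names and the ``n_attempts`` parameter are hypothetical and are not part of the ``openmmtools`` API.

```python
import math
import random


def swap_acceptance(u_ii, u_jj, u_ij, u_ji):
    """Metropolis acceptance probability for swapping states i and j.

    u_ab denotes u_a(x_b): the reduced potential of state a evaluated on
    the configuration currently held in state b.
    """
    log_ratio = (u_ii + u_jj) - (u_ij + u_ji)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


def mix_replicas(u, replica_states, n_attempts, rng):
    """Attempt many pairwise swaps between randomly chosen replicas.

    Returns the new state assignment, which is always a permutation of the input.
    """
    states = list(replica_states)
    n = len(states)
    for _ in range(n_attempts):
        a, b = rng.sample(range(n), 2)  # two distinct replicas
        i, j = states[a], states[b]     # replica a holds x_i, replica b holds x_j
        p = swap_acceptance(u_ii=u[a][i], u_jj=u[b][j], u_ij=u[b][i], u_ji=u[a][j])
        if rng.random() < p:
            states[a], states[b] = j, i
    return states
```

Because the energy matrix is computed once per iteration, each additional swap attempt only costs a few table lookups, which is why attempting many swaps is cheap relative to propagation.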

.. currentmodule:: openmmtools.multistate
.. autosummary::
    :nosignatures:
    :toctree: api/generated/

    ReplicaExchangeSampler
    ReplicaExchangeAnalyzer

``SAMSSampler``: Self-adjusted mixture sampling
-----------------------------------------------

The ``SAMSSampler`` implements self-adjusted mixture sampling (SAMS; also known as optimally adjusted mixture sampling) :cite:`Tan2017:SAMS`.
This combines one or more replicas that sample from an expanded ensemble with an asymptotically optimal Wang-Landau-like weight update scheme.

.. math::

    s_{k,n+1} \sim p(s | x_{k,n}) \\
    x_{k,n+1} \sim p(x | s_{k, n+1})

SAMS state update schemes
^^^^^^^^^^^^^^^^^^^^^^^^^

Several state update schemes are available:

* ``global-jump`` (default): The sampler can jump to any thermodynamic state (RECOMMENDED)
* ``restricted-range-jump``: The sampler can jump to any thermodynamic state within the specified local neighborhood (EXPERIMENTAL; DISABLED)
* ``local-jump``: Only proposals within the specified neighborhood are considered, but rejection rates may be high (EXPERIMENTAL; DISABLED)
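
A ``global-jump`` style update can be sketched as drawing the new state directly from the conditional distribution over all states. The example below assumes the usual expanded-ensemble form :math:`p(s | x) \propto \exp[g_s - u_s(x)]`, where :math:`g_s` is the current log weight of state :math:`s`; this form, and the function names, are assumptions made for illustration rather than a description of the library internals.

```python
import math
import random


def state_probabilities(log_weights, reduced_potentials):
    """Normalized p(s | x), assumed proportional to exp(g_s - u_s(x))."""
    logits = [g - u for g, u in zip(log_weights, reduced_potentials)]
    shift = max(logits)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(value - shift) for value in logits]
    total = sum(unnormalized)
    return [p / total for p in unnormalized]


def global_jump(log_weights, reduced_potentials, rng):
    """Draw a new state index from p(s | x) over all thermodynamic states."""
    probabilities = state_probabilities(log_weights, reduced_potentials)
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]
```

Since the draw is made from the exact conditional, there is no rejection step, in contrast to a neighborhood proposal that must be accepted or rejected.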

SAMS Locality
^^^^^^^^^^^^^

The local neighborhood is specified by the ``locality`` parameter.
If this is a positive integer, the neighborhood will be defined by state indices ``[k - locality, k + locality]``.
Reducing locality will restrict the range of states for which reduced potentials are evaluated, which can speed up the energy evaluation stage of each iteration at the cost of restricting the amount of information available for free energy estimation.
By default, the ``locality`` is global, such that energies at all thermodynamic states are computed; this allows the use of MBAR in data analysis.
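
The index range ``[k - locality, k + locality]`` can be illustrated with a small helper. This is a sketch only: the function name is hypothetical, and clipping the range at the first and last state index is an assumption about how the boundaries are handled.

```python
def neighborhood(k, n_states, locality=None):
    """Return the state indices whose reduced potentials would be evaluated.

    ``locality=None`` stands for the global default (all states).
    Clipping to [0, n_states - 1] at the ends is an assumption of this sketch.
    """
    if locality is None:
        return list(range(n_states))
    first = max(0, k - locality)
    last = min(n_states - 1, k + locality)
    return list(range(first, last + 1))
```

For example, with ten states and ``locality=2``, a replica in state 5 would evaluate energies at states 3 through 7, which is half the work of a global evaluation.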

SAMS weight adaptation algorithm
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

SAMS provides two ways of accumulating log weights each iteration:

* ``optimal`` accumulates weight only in the currently visited state ``s``
* ``rao-blackwellized`` accumulates fractional weight in all states within the energy evaluation neighborhood
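
The difference between the two accumulation modes can be sketched as follows. The example assumes the update form of the SAMS paper cited above: a running free energy estimate ``zeta`` is incremented by ``gamma / pi[s]`` in the visited state (``optimal``), or by ``gamma * p(s | x) / pi[s]`` in every state of the neighborhood (``rao-blackwellized``), and is then shifted so that the first state stays at zero. The function names, the argument layout, and the exact relation between ``zeta`` and the log weights stored by the sampler are assumptions of this sketch.

```python
def optimal_update(zeta, current_state, gamma, pi):
    """Accumulate weight only in the currently visited state."""
    updated = list(zeta)
    updated[current_state] += gamma / pi[current_state]
    # Shift so that the first state is the zero of free energy.
    return [z - updated[0] for z in updated]


def rao_blackwellized_update(zeta, state_probs, gamma, pi, states):
    """Accumulate fractional weight in every state listed in ``states``.

    ``state_probs[s]`` is the conditional probability p(s | x) of state s
    given the current configuration.
    """
    updated = list(zeta)
    for s in states:
        updated[s] += gamma * state_probs[s] / pi[s]
    return [z - updated[0] for z in updated]
```

Here ``pi`` holds the target probability of each state and ``gamma`` is the current adaptation rate. The Rao-Blackwellized form uses the same information as the binary update but averages over the state label, which typically reduces the variance of each update.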

SAMS initial weight adaptation stage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Because the asymptotically-optimal weight adaptation scheme works best only when the log weights are close to optimal, a heuristic initial stage is used to more rapidly adapt the log weights before the asymptotically optimal scheme is used.
The behavior of this first stage can be controlled by setting two parameters:

* ``gamma0`` controls the initial rate of weight adaptation. By default, this is 1.0, but can be set larger (e.g., 10.0) if the free energy differences between states are much larger.
* ``flatness_threshold`` controls the number of (fractional) visits to each thermodynamic state that must be accumulated before the asymptotically optimal weight adaptation scheme is used.

.. currentmodule:: openmmtools.multistate
.. autosummary::
    :nosignatures:
    :toctree: api/generated/

    SAMSSampler
    SAMSAnalyzer

Parallel tempering
------------------

.. currentmodule:: openmmtools.multistate
.. autosummary::
    :nosignatures:
    :toctree: api/generated/

    ParallelTemperingSampler
    ParallelTemperingAnalyzer

Multistate Reporters
--------------------

.. currentmodule:: openmmtools.multistate
.. autosummary::
    :nosignatures:
    :toctree: api/generated/

    MultiStateReporter

Analysis of multiple thermodynamic transformations
--------------------------------------------------

.. currentmodule:: openmmtools.multistate
.. autosummary::
    :nosignatures:
    :toctree: api/generated/

    MultiPhaseAnalyzer

Miscellaneous support classes
-----------------------------

.. currentmodule:: openmmtools.multistate.multistateanalyzer
.. autosummary::
    :nosignatures:
    :toctree: api/generated/

    ObservablesRegistry
    CachedProperty
    InsufficientData
    PhaseAnalyzer
45 changes: 45 additions & 0 deletions docs/references.bib
@@ -0,0 +1,45 @@
@article{Chodera2011,
author = {Chodera, John D. and Shirts, Michael R.},
title = {Replica exchange and expanded ensemble simulations as Gibbs sampling: Simple improvements for enhanced mixing},
journal = {The Journal of Chemical Physics},
year = {2011},
volume = {135},
number = {19},
eid = {194110},
url = {http://scitation.aip.org/content/aip/journal/jcp/135/19/10.1063/1.3660669},
doi = {10.1063/1.3660669},
}

@article{Tan2017:SAMS,
title={Optimally adjusted mixture sampling and locally weighted histogram analysis},
author={Tan, Zhiqiang},
journal={Journal of Computational and Graphical Statistics},
volume={26},
number={1},
pages={54--65},
year={2017},
publisher={Taylor \& Francis}
}


@article{Shirts2008statistically,
title={Statistically optimal analysis of samples from multiple equilibrium states},
author={Shirts, Michael R and Chodera, John D},
journal={The Journal of chemical physics},
volume={129},
number={12},
pages={124105},
year={2008},
publisher={AIP}
}

@article{kumar1992weighted,
title={The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method},
author={Kumar, Shankar and Rosenberg, John M and Bouzida, Djamal and Swendsen, Robert H and Kollman, Peter A},
journal={Journal of computational chemistry},
volume={13},
number={8},
pages={1011--1021},
year={1992},
publisher={Wiley Online Library}
}
64 changes: 64 additions & 0 deletions docs/references.rst
@@ -0,0 +1,64 @@
.. _references:

**********
References
**********

Here is a list of references for the various components and algorithms used in ``openmmtools``.

OpenMM GPU-accelerated molecular mechanics library
""""""""""""""""""""""""""""""""""""""""""""""""""

Friedrichs MS, Eastman P, Vaidyanathan V, Houston M, LeGrand S, Beberg AL, Ensign DL, Bruns CM, and Pande VS. Accelerating molecular dynamic simulations on graphics processing units.
J. Comput. Chem. 30:864, 2009.
http://dx.doi.org/10.1002/jcc.21209

Eastman P and Pande VS. OpenMM: A hardware-independent framework for molecular simulations.
Comput. Sci. Eng. 12:34, 2010.
http://dx.doi.org/10.1109/MCSE.2010.27

Eastman P and Pande VS. Efficient nonbonded interactions for molecular dynamics on a graphics processing unit.
J. Comput. Chem. 31:1268, 2010.
http://dx.doi.org/10.1002/jcc.21413

Eastman P and Pande VS. Constant constraint matrix approximation: A robust, parallelizable constraint method for molecular simulations.
J. Chem. Theor. Comput. 6:434, 2010.
http://dx.doi.org/10.1021/ct900463w

Eastman P, Friedrichs M, Chodera JD, Radmer RJ, Bruns CM, Ku JP, Beauchamp KA, Lane TJ, Wang LP, Shukla D, Tye T, Houston M, Stich T, Klein C, Shirts M, and Pande VS. OpenMM 4: A Reusable, Extensible,
Hardware Independent Library for High Performance Molecular Simulation. J. Chem. Theor. Comput. 2012.
http://dx.doi.org/10.1021/ct300857j

Replica-exchange with Gibbs sampling
""""""""""""""""""""""""""""""""""""

Chodera JD and Shirts MR. Replica exchange and expanded ensemble simulations as Gibbs sampling: Simple improvements for enhanced mixing.
J. Chem. Phys. 135:194110, 2011.
http://dx.doi.org/10.1063/1.3660669

MBAR for estimation of free energies from simulation data
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""

Shirts MR and Chodera JD. Statistically optimal analysis of samples from multiple equilibrium states.
J. Chem. Phys. 129:124105, 2008.
http://dx.doi.org/10.1063/1.2978177

Long-range dispersion corrections for explicit solvent free energy calculations
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

Shirts MR, Mobley DL, Chodera JD, and Pande VS. Accurate and efficient corrections for missing dispersion interactions in molecular simulations.
J. Phys. Chem. B 111:13052, 2007.
http://dx.doi.org/10.1021/jp0735987


Bibliography
############

.. The :all: directive searches subfolders for uses of :cite: for correct reference
   However, this has the effect of dropping all citations in the .bib file in here and
   the compiler complains about unused citations.
   As such, unused articles in the .bib file are simply commented so as not to delete them if needed in the future.

.. bibliography:: references.bib
    :style: unsrt
    :all: