diff --git a/CHANGES b/CHANGES index c64a912b..ebea68a6 100644 --- a/CHANGES +++ b/CHANGES @@ -20,8 +20,9 @@ The rules for this file: * 0.?.? Enhancements - - Allow the overlap matrix of the MBAR estimator to be plotted. (PR #107) - - Allow the dhdl of the TI estimator to be plotted. (PR #110) + - Allow the overlap matrix of the MBAR estimator to be plotted. (issue #73, PR #107) + - Allow the dhdl of the TI estimator to be plotted. (issue #73, PR #110) + - Allow the dF states to be plotted. (issue #73, PR #112) Deprecations diff --git a/docs/images/dF_states.png b/docs/images/dF_states.png new file mode 100644 index 00000000..da608194 Binary files /dev/null and b/docs/images/dF_states.png differ diff --git a/docs/visualisation.rst b/docs/visualisation.rst index 8d67d3b3..549631c0 100644 --- a/docs/visualisation.rst +++ b/docs/visualisation.rst @@ -11,6 +11,7 @@ visualisation tools to help user to judge the estimate. plot_mbar_overlap_matrix plot_ti_dhdl + plot_dF_state .. _plot_overlap_matrix: @@ -40,7 +41,12 @@ the degree of overlap. It is recommended that there should be at least Will give a plot looks like this -.. image:: images/O_MBAR.png +.. figure:: images/O_MBAR.png + + Overlap between the distributions of potential energy differences is + essential for accurate free energy calculations and can be quantified by + computing the overlap matrix 𝐎. Its elements 𝑂𝑖𝑗 are the probabilities of + observing a sample from state i (𝑖 th row) in state j (𝑗 th column). .. _plot_TI_dhdl: @@ -72,7 +78,59 @@ together as well. :: Will give a plot looks like this -.. image:: images/dhdl_TI.png +.. figure:: images/dhdl_TI.png + + A plot of ⟨∂𝑈/∂𝜆⟩ versus 𝜆 for thermodynamic integration, with filled areas + indicating free energy estimates from the trapezoid rule. Different 𝛥𝐺 + components are shown in distinct colors: in red is the electrostatic 𝛥𝐺 + component (𝜆 indices 0–4), while in green is the van der Waals 𝛥𝐺 component + (𝜆 indices 5–19). Color intensity alternates with increasing 𝜆 index. + +.. _plot_dF_states: + +dF States Plots between Different estimators +-------------------------------------------- +Another way of assessing the quality of free energy estimate would be comparing +the free energy difference between adjacent lambda states (dF) using different +estimators [Klimovich2015]_. The function :func:`~alchemlyb.visualisation.plot_dF_state` can +be used, for example, to compare the dF of both Coulombic and VDW +transformations using :class:`~alchemlyb.estimators.TI`, +:class:`~alchemlyb.estimators.BAR` and :class:`~alchemlyb.estimators.MBAR` +estimators. :: + + >>> from alchemtest.gmx import load_benzene + >>> from alchemlyb.parsing.gmx import extract_u_nk, extract_dHdl + >>> from alchemlyb.estimators import MBAR, TI, BAR + >>> import matplotlib.pyplot as plt + >>> import pandas as pd + >>> from alchemlyb.visualisation.dF_state import plot_dF_state + >>> bz = load_benzene().data + >>> u_nk_coul = pd.concat([extract_u_nk(xvg, T=300) for xvg in bz['Coulomb']]) + >>> dHdl_coul = pd.concat([extract_dHdl(xvg, T=300) for xvg in bz['Coulomb']]) + >>> u_nk_vdw = pd.concat([extract_u_nk(xvg, T=300) for xvg in bz['VDW']]) + >>> dHdl_vdw = pd.concat([extract_dHdl(xvg, T=300) for xvg in bz['VDW']]) + >>> ti_coul = TI().fit(dHdl_coul) + >>> ti_vdw = TI().fit(dHdl_vdw) + >>> bar_coul = BAR().fit(u_nk_coul) + >>> bar_vdw = BAR().fit(u_nk_vdw) + >>> mbar_coul = MBAR().fit(u_nk_coul) + >>> mbar_vdw = MBAR().fit(u_nk_vdw) + + >>> estimators = [(ti_coul, ti_vdw), + (bar_coul, bar_vdw), + (mbar_coul, mbar_vdw),] + + >>> fig = plot_dF_state(estimators, orientation='portrait') + >>> fig.savefig('dF_state.pdf', bbox_inches='tight') + + +Will give a plot looks like this + +.. figure:: images/dF_states.png + + A bar plot of the free energy differences evaluated between pairs of adjacent + states via several methods, with corresponding error estimates for each method. + .. [Klimovich2015] Klimovich, P.V., Shirts, M.R. & Mobley, D.L. Guidelines for the analysis of free energy calculations. J Comput Aided Mol Des 29, 397–411 diff --git a/docs/visualisation/alchemlyb.visualisation.plot_dF_state.rst b/docs/visualisation/alchemlyb.visualisation.plot_dF_state.rst new file mode 100644 index 00000000..d13c1e32 --- /dev/null +++ b/docs/visualisation/alchemlyb.visualisation.plot_dF_state.rst @@ -0,0 +1,35 @@ +.. _visualisation_plot_dF_state: + +Plot dF states from multiple estimators +======================================= + +The function :func:`~alchemlyb.visualisation.plot_dF_state` allows the user to +plot and compare the free energy difference between states ("dF") from various +kinds of :class:`~alchemlyb.estimators`. + +To compare the dF states of a single alchemical transformation among various +:class:`~alchemlyb.estimators`, the user can pass a list of `estimators`. (e.g. +`estimators` = [:class:`~alchemlyb.estimators.TI`, +:class:`~alchemlyb.estimators.BAR`, :class:`~alchemlyb.estimators.MBAR`]) + +To compare the dF states of a multiple alchemical transformations, results from +the same :class:`~alchemlyb.estimators` can be concatenated into a list, which +is then bundled to to another list of different :class:`~alchemlyb.estimators`. +(e.g. `estimators` = [(:class:`~alchemlyb.estimators.TI`, +:class:`~alchemlyb.estimators.TI`), (:class:`~alchemlyb.estimators.BAR`, +:class:`~alchemlyb.estimators.BAR`), (:class:`~alchemlyb.estimators.MBAR`, +:class:`~alchemlyb.estimators.MBAR`)]) + +The figure could be plotted in *portrait* or *landscape* mode by setting the +`orientation`. `nb` is used to control the number of dF states in one row. +The user could pass a list of strings to `labels` to name the +:class:`~alchemlyb.estimators` or a list of strings to `colors` to color +the estimators differently. The unit in the y axis could be labelled to other +units by setting `units`, which by default is kcal/mol. + +Please check :ref:`How to plot dF states ` for a complete +example. + +API Reference +------------- +.. autofunction:: alchemlyb.visualisation.plot_dF_state diff --git a/src/alchemlyb/tests/test_visualisation.py b/src/alchemlyb/tests/test_visualisation.py index 06808ea6..94d09ad4 100644 --- a/src/alchemlyb/tests/test_visualisation.py +++ b/src/alchemlyb/tests/test_visualisation.py @@ -2,12 +2,14 @@ import matplotlib.pyplot as plt import pandas as pd import numpy as np +import pytest from alchemtest.gmx import load_benzene from alchemlyb.parsing.gmx import extract_u_nk, extract_dHdl -from alchemlyb.estimators import MBAR, TI +from alchemlyb.estimators import MBAR, TI, BAR from alchemlyb.visualisation.mbar_matrix import plot_mbar_overlap_matrix from alchemlyb.visualisation.ti_dhdl import plot_ti_dhdl +from alchemlyb.visualisation.dF_state import plot_dF_state def test_plot_mbar_omatrix(): '''Just test if the plot runs''' @@ -54,3 +56,43 @@ def test_plot_ti_dhdl(): assert isinstance(plot_ti_dhdl(ti_coul), matplotlib.axes.Axes) +def test_plot_dF_state(): + '''Just test if the plot runs''' + bz = load_benzene().data + u_nk_coul = pd.concat([extract_u_nk(xvg, T=300) for xvg in bz['Coulomb']]) + dHdl_coul = pd.concat([extract_dHdl(xvg, T=300) for xvg in bz['Coulomb']]) + u_nk_vdw = pd.concat([extract_u_nk(xvg, T=300) for xvg in bz['VDW']]) + dHdl_vdw = pd.concat([extract_dHdl(xvg, T=300) for xvg in bz['VDW']]) + + ti_coul = TI().fit(dHdl_coul) + ti_vdw = TI().fit(dHdl_vdw) + bar_coul = BAR().fit(u_nk_coul) + bar_vdw = BAR().fit(u_nk_vdw) + mbar_coul = MBAR().fit(u_nk_coul) + mbar_vdw = MBAR().fit(u_nk_vdw) + + dhdl_data = [(ti_coul, ti_vdw), + (bar_coul, bar_vdw), + (mbar_coul, mbar_vdw), ] + fig = plot_dF_state(dhdl_data, orientation='portrait') + assert isinstance(fig, matplotlib.figure.Figure) + fig = plot_dF_state(dhdl_data, orientation='landscape') + assert isinstance(fig, matplotlib.figure.Figure) + fig = plot_dF_state(dhdl_data, labels=['MBAR', 'TI', 'BAR']) + assert isinstance(fig, matplotlib.figure.Figure) + with pytest.raises(ValueError): + fig = plot_dF_state(dhdl_data, labels=['MBAR', 'TI',]) + fig = plot_dF_state(dhdl_data, colors=['#C45AEC', '#33CC33', '#F87431']) + assert isinstance(fig, matplotlib.figure.Figure) + with pytest.raises(ValueError): + fig = plot_dF_state(dhdl_data, colors=['#C45AEC', '#33CC33']) + with pytest.raises(NameError): + fig = plot_dF_state(dhdl_data, orientation='xxx') + fig = plot_dF_state(ti_coul, orientation='landscape') + assert isinstance(fig, matplotlib.figure.Figure) + fig = plot_dF_state(ti_coul, orientation='portrait') + assert isinstance(fig, matplotlib.figure.Figure) + fig = plot_dF_state([ti_coul, bar_coul]) + assert isinstance(fig, matplotlib.figure.Figure) + fig = plot_dF_state([(ti_coul, ti_vdw)]) + assert isinstance(fig, matplotlib.figure.Figure) \ No newline at end of file diff --git a/src/alchemlyb/visualisation/__init__.py b/src/alchemlyb/visualisation/__init__.py index 5cf2c3f9..b7cf63cc 100644 --- a/src/alchemlyb/visualisation/__init__.py +++ b/src/alchemlyb/visualisation/__init__.py @@ -1,2 +1,3 @@ from .mbar_matrix import plot_mbar_overlap_matrix -from .ti_dhdl import plot_ti_dhdl \ No newline at end of file +from .ti_dhdl import plot_ti_dhdl +from .dF_state import plot_dF_state \ No newline at end of file diff --git a/src/alchemlyb/visualisation/dF_state.py b/src/alchemlyb/visualisation/dF_state.py new file mode 100644 index 00000000..569df9b7 --- /dev/null +++ b/src/alchemlyb/visualisation/dF_state.py @@ -0,0 +1,201 @@ +"""Functions for Plotting the dF states. + +To assess the quality of the free energy estimation, The dF between adjacent +lambda states can be ploted to assess the quality of the estimation. + +The code for producing the dF states plot is modified based on +: `Alchemical Analysis `_. + +""" + +import matplotlib.pyplot as plt +from matplotlib.font_manager import FontProperties as FP +import numpy as np + +from ..estimators import TI, BAR, MBAR + +def plot_dF_state(estimators, labels=None, colors=None, units='kcal/mol', + orientation='portrait', nb=10): + '''Plot the dhdl of TI. + + Parameters + ---------- + estimators : :class:`~alchemlyb.estimators` or list + One or more :class:`~alchemlyb.estimators`, where the + dhdl value will be taken from. For more than one estimators + with more than one alchemical transformation, a list of list format + is used. + labels : List + list of labels for labelling different estimators. + colors : List + list of colors for plotting different estimators. + units : str + The unit of the estimate. Default: 'kcal/mol' + orientation : string + The orientation of the figure. Can be `portrait` or `landscape` + nb : int + Maximum number of dF states in one row in the `portrait` mode + + Returns + ------- + matplotlib.figure.Figure + An Figure with the dF states drawn. + + Notes + ----- + The code is taken and modified from + : `Alchemical Analysis `_ + + ''' + try: + len(estimators) + except TypeError: + estimators = [estimators, ] + + formatted_data = [] + for dhdl in estimators: + try: + len(dhdl) + formatted_data.append(dhdl) + except TypeError: + formatted_data.append([dhdl, ]) + estimators = formatted_data + + # Get the dF + dF_list = [] + error_list = [] + max_length = 0 + for dhdl_list in estimators: + len_dF = sum([len(dhdl.delta_f_) - 1 for dhdl in dhdl_list]) + if len_dF > max_length: + max_length = len_dF + dF = [] + error = [] + for dhdl in dhdl_list: + for i in range(len(dhdl.delta_f_) - 1): + dF.append(dhdl.delta_f_.iloc[i, i+1]) + error.append(dhdl.d_delta_f_.iloc[i, i+1]) + dF_list.append(dF) + error_list.append(error) + + # Get the determine orientation + if orientation == 'landscape': + if max_length < 8: + fig, ax = plt.subplots(figsize=(8, 6)) + else: + fig, ax = plt.subplots(figsize=(max_length, 6)) + axs = [ax, ] + xs = [np.arange(max_length), ] + elif orientation == 'portrait': + if max_length < nb: + xs = [np.arange(max_length), ] + fig, ax = plt.subplots(figsize=(8, 6)) + axs = [ax, ] + else: + xs = np.array_split(np.arange(max_length), max_length / nb + 1) + fig, axs = plt.subplots(nrows=len(xs), figsize=(8, 6)) + mnb = max([len(i) for i in xs]) + else: + raise NameError("Not recognising {}, only supports 'landscape' or 'portrait'.".format(orientation)) + + # Sort out the colors + if colors is None: + colors_dict = {'TI': '#C45AEC', 'TI-CUBIC': '#33CC33', + 'DEXP': '#F87431', 'IEXP': '#FF3030', 'GINS': '#EAC117', + 'GDEL': '#347235', 'BAR': '#6698FF', 'UBAR': '#817339', + 'RBAR': '#C11B17', 'MBAR': '#F9B7FF'} + colors = [] + for dhdl in estimators: + dhdl = dhdl[0] + if isinstance(dhdl, TI): + colors.append(colors_dict['TI']) + elif isinstance(dhdl, BAR): + colors.append(colors_dict['BAR']) + elif isinstance(dhdl, MBAR): + colors.append(colors_dict['MBAR']) + else: + if len(colors) >= len(estimators): + pass + else: + raise ValueError( + 'Number of colors ({}) should be larger than the number of data ({})'.format( + len(colors), len(estimators))) + + # Sort out the labels + if labels is None: + labels = [] + for dhdl in estimators: + dhdl = dhdl[0] + if isinstance(dhdl, TI): + labels.append('TI') + elif isinstance(dhdl, BAR): + labels.append('BAR') + elif isinstance(dhdl, MBAR): + labels.append('MBAR') + else: + if len(labels) == len(estimators): + pass + else: + raise ValueError( + 'Length of labels ({}) should be the same as the number of data ({})'.format( + len(labels), len(estimators))) + + # Plot the figure + width = 1. / (len(estimators) + 1) + elw = 30 * width + ndx = 1 + for x, ax in zip(xs, axs): + lines = [] + for i, (dF, error) in enumerate(zip(dF_list, error_list)): + y = [dF[j] for j in x] + ye = [error[j] for j in x] + if orientation == 'landscape': + lw = 0.1 * elw + elif orientation == 'portrait': + lw = 0.05 * elw + line = ax.bar(x + len(lines) * width, y, width, + color=colors[i], yerr=ye, lw=lw, + error_kw=dict(elinewidth=elw, ecolor='black', + capsize=0.5 * elw)) + lines += (line[0],) + for dir in ['left', 'right', 'top', 'bottom']: + if dir == 'left': + ax.yaxis.set_ticks_position(dir) + else: + ax.spines[dir].set_color('none') + + if orientation == 'landscape': + plt.yticks(fontsize=8) + ax.set_xlim(x[0]-width, x[-1] + len(lines) * width) + plt.xticks(x + 0.5 * width * len(estimators), + tuple(['%d--%d' % (i, i + 1) for i in x]), fontsize=8) + elif orientation == 'portrait': + plt.yticks(fontsize=10) + ax.xaxis.set_ticks([]) + for i in x + 0.5 * width * len(estimators): + ax.annotate('$\mathrm{%d-%d}$' % (i, i + 1), xy=(i, 0), + xycoords=('data', 'axes fraction'), xytext=(0, -2), + size=10, textcoords='offset points', va='top', + ha='center') + ax.set_xlim(x[0]-width, x[-1]+len(lines)*width + (mnb - len(x))) + ndx += 1 + x = np.arange(max_length) + + ax = plt.gca() + + for tick in ax.get_xticklines(): + tick.set_visible(False) + if orientation == 'landscape': + leg = plt.legend(lines, labels, loc=3, ncol=2, prop=FP(size=10), + fancybox=True) + plt.title('The free energy change breakdown', fontsize=12) + plt.xlabel('States', fontsize=12, color='#151B54') + plt.ylabel('$\Delta G$ ' + units, fontsize=12, color='#151B54') + elif orientation == 'portrait': + leg = ax.legend(lines, labels, loc=0, ncol=2, + prop=FP(size=8), + title='$\mathrm{\Delta G\/%s\/}\mathit{vs.}\/\mathrm{lambda\/pair}$' % units, + fancybox=True) + + leg.get_frame().set_alpha(0.5) + return fig