O_Matrix from TI simulation of AMBER with the MBAR #121

geraili-hosein · 2020-02-21T10:08:12Z

Hi all,
I am mutating a residue of a protein using TI in AMBER, and writing out the MBAR data at the same time. I then analyze the results with alchemical_analysis, but I have a problem. I know that for TI the dhdl curve is usually the criterion. In my simulations the curves are good for some parts (charging and decharging), and for vdW there are kinks, but the O_Matrix always has some empty squares or low values. It seems that it never works. My question is: how reliable is the O_Matrix as a validation in this case? The standard deviation across three identical simulations is close to 1 kcal/mol. What criterion would be best for saying that the free energy values are reliable?

Best
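
One way to make "kinks" in the dhdl curve concrete is to integrate the per-lambda averages with the trapezoid rule and look at how abruptly the slope changes between neighboring windows. The sketch below is only an illustration: the function names `ti_trapezoid` and `largest_slope_jump` are hypothetical, the data are synthetic, and a plain trapezoid rule stands in for whichever TI variant the analysis tool actually applies.

```python
import numpy as np

def ti_trapezoid(lambdas, mean_dhdl):
    """Trapezoid-rule estimate of the integral of <dH/dlambda> over lambda."""
    lam = np.asarray(lambdas, dtype=float)
    y = np.asarray(mean_dhdl, dtype=float)
    return float(np.sum(0.5 * (y[:-1] + y[1:]) * np.diff(lam)))

def largest_slope_jump(lambdas, mean_dhdl):
    """Largest change in slope between consecutive lambda intervals."""
    lam = np.asarray(lambdas, dtype=float)
    y = np.asarray(mean_dhdl, dtype=float)
    slopes = np.diff(y) / np.diff(lam)
    return float(np.max(np.abs(np.diff(slopes))))

# Synthetic example (not data from this thread): a smooth linear integrand
# and the same curve with one window displaced to mimic a kink.
lambdas = np.linspace(0.0, 1.0, 11)
smooth = 2.0 * lambdas
kinked = smooth.copy()
kinked[5] += 3.0

dg_smooth = ti_trapezoid(lambdas, smooth)
jump_smooth = largest_slope_jump(lambdas, smooth)
jump_kinked = largest_slope_jump(lambdas, kinked)
```

A large slope jump concentrated at one or two windows is a hint that extra lambda values are needed there; what counts as "large" depends on the system and is not defined by this sketch.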

davidlmobley · 2020-02-23T03:53:35Z

The overlap matrix applies to MBAR, not TI; for TI, one needs a smooth integrand. So those are very different issues.

That said, often if things are well-behaved your TI, BAR and MBAR results will agree with one another; if they disagree it's a warning sign.
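
For the MBAR side, a quick way to see where an overlap matrix is weak is to scan the entries that connect neighboring states. The sketch below assumes a square matrix whose rows sum to 1; the function name `weak_neighbor_overlaps`, the example matrix, and the 0.03 cutoff are assumptions chosen for illustration (the cutoff is a commonly quoted rule of thumb, not something stated in this thread), and the numbers are made up.

```python
import numpy as np

def weak_neighbor_overlaps(overlap, threshold=0.03):
    """Return (i, i+1) state pairs whose mutual overlap is below threshold."""
    o = np.asarray(overlap, dtype=float)
    weak = []
    for i in range(o.shape[0] - 1):
        # Use the smaller of the two off-diagonal entries for the pair.
        if min(o[i, i + 1], o[i + 1, i]) < threshold:
            weak.append((i, i + 1))
    return weak

# Made-up 4-state overlap matrix: the last two states barely overlap.
example = np.array([
    [0.70, 0.30, 0.00, 0.00],
    [0.30, 0.60, 0.10, 0.00],
    [0.00, 0.10, 0.89, 0.01],
    [0.00, 0.00, 0.01, 0.99],
])
weak_pairs = weak_neighbor_overlaps(example)
```

Pairs flagged this way are candidates for additional intermediate lambda values between the two states.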

geraili91 · 2021-05-03T08:57:41Z

The issue is that the curves are usually smooth, and BAR and TI-3 match very well, but most of the time MBAR differs by around 1.5 to 2 kcal/mol, sometimes even up to 3. Increasing the number of lambdas and the simulation time never solves the problem.

This is with UNCORR_THRESHOLD left as it is. The result changes when I use either no uncorrelated data (i=1000000, where the O_matrix is perfect but the BAR-MBAR difference is larger) or only uncorrelated data (i=1, where the BAR-MBAR difference is smaller). Any suggestions?

mrshirts · 2021-05-03T15:21:29Z

You would need at least ~50 uncorrelated samples to get converged results. You should be able to force the number of uncorrelated samples to be higher, and see if the results are consistent - they might have artificially low uncertainties, but they should all be consistent - which they seem to be.

It SOUNDS like for whatever reason BAR and MBAR are using slightly different data sets. If they are only using 1-3 data points for each lambda, and they are using different 1-3 data points, then of course the answers will be different.

If your number of uncorrelated data samples is 1.53, then that would indicate there is something odd with the data, such as a dU/dl that is decreasing throughout the entire simulation, or that the timeseries is in some other way nonequilibrated. This will also lead to incorrect answers. You should visualize your data set in various ways to understand why it has so few uncorrelated data points.
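
To see why a drifting time series yields so few uncorrelated samples, one can estimate the statistical inefficiency g from the autocorrelation function and take N/g as the effective sample count. The following is a plain NumPy sketch under simple assumptions (sum the normalized autocorrelation until it first drops to zero or below); the function name and the synthetic series are hypothetical, and this is not the routine the analysis script itself uses.

```python
import numpy as np

def statistical_inefficiency(x):
    """Estimate g = 1 + 2 * sum_t (1 - t/N) * C(t), truncated at first C(t) <= 0."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dx = x - x.mean()
    var = np.dot(dx, dx) / n
    if var == 0.0:
        return 1.0  # constant series: nothing to decorrelate
    g = 1.0
    for t in range(1, n - 1):
        c = np.dot(dx[:-t], dx[t:]) / ((n - t) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return max(g, 1.0)

rng = np.random.default_rng(0)
n_samples = 2000

# Synthetic series: uncorrelated noise, and a steady drift with small noise.
white = rng.normal(size=n_samples)
drift = np.linspace(0.0, 1.0, n_samples) + 0.01 * rng.normal(size=n_samples)

n_eff_white = n_samples / statistical_inefficiency(white)
n_eff_drift = n_samples / statistical_inefficiency(drift)
```

In this toy case the noise series keeps most of its samples, while the drifting series collapses to only a handful of effective samples, which is the kind of signature worth looking for when plotting dU/dl against time.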

geraili91 · 2021-05-03T15:35:44Z

I am sorry, I made a mistake when stating the number of uncorrelated data points. For most states it is more than 50, but for a few states it is as low as 20, in which case alchemical_analysis uses the correlated data.
What do you mean by 1-3 data points?
Given the BAR and MBAR difference you described, does it make sense to consider the free energy correct when BAR and TI-3 match and MBAR is somewhat different because some states use correlated data?

This difference happens even when the number of uncorrelated data is very high.

mrshirts · 2021-05-03T16:06:43Z

> With what you described BAR and MBAR difference, does it make sense to consider the free energy correct when BAR and TI-3 match and MBAR is somewhat different due to the fact that some states use correlated data?

If they don't all agree, it's usually a sign that something is suspicious with the data. MBAR handles correlated data better than BAR.

> What do you mean by 1-3 data points?

Because that's what I understood you said. You said:

> Something that I just noticed is that N/N_K(number of uncorrelated) is very very low in the output such as 1.53 out of 1894.

> But for some few states, it is as low as 20 where alchemical_analysis uses the correlated data.

That's getting to be a low enough number that statistical errors may cause differences between methods.


geraili91 · 2021-05-03T18:24:27Z

I usually see this disagreement even in cases where I have enough uncorrelated data, as in the results below.

How much of this disagreement is tolerable, or normal, for a big 100 A system: charged Asn to uncharged Asn attached to a protein in explicit water with a soft-core potential?

| States | TI (kcal/mol) | TI-CUBIC (kcal/mol) | BAR (kcal/mol) | MBAR (kcal/mol) |
|---|---|---|---|---|
| TOTAL: | 59.83754 +- 0.05658 | 59.76601 +- 0.05768 | 59.73252 +- 0.03717 | 56.98357 +- 0.07534 |

mrshirts · 2021-05-03T18:26:52Z

Really impossible to say without having an example to run. Something suspicious is going on, but there are too many potential suspects and we don't have enough information about the underlying data.

geraili91 · 2021-05-03T18:41:47Z

run.log

The outputs of my file also have "*****" for many of the MBAR energies at lambda 1.

mrshirts · 2021-05-03T19:13:34Z

> run.log <https://github.com/MobleyLab/alchemical-analysis/files/6416927/run.log>

So in this case, BAR = -135.39 +- 0.09, MBAR = -134.76 +- 0.18, and TI = -135.91 +- 0.14. This spread is definitely a little bit high but not drastically so (0.63/0.20 ≈ 3 sigma between BAR and MBAR). It might be explained by the lack of large numbers of uncorrelated samples at some data points. I'd look at the time series of dH/dL to see if anything weird is going on.

> The outputs of my file also have "*****" for many of the MBAR energies at lambda 1.

I'm not sure what file you are referring to. If AMBER, I don't really know how AMBER handles the output formatting.
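
The "3 sigma" figure above comes from dividing the difference between two estimates by their uncertainties combined in quadrature. A minimal sketch of that arithmetic follows, applied to the run.log numbers and to the BAR and MBAR totals posted earlier in the thread; the helper name `sigma_difference` is hypothetical, and treating the two uncertainties as independent is an assumption.

```python
import math

def sigma_difference(a, da, b, db):
    """Difference between two estimates in units of their combined uncertainty."""
    return abs(a - b) / math.sqrt(da ** 2 + db ** 2)

# BAR vs MBAR from run.log (kcal/mol), as quoted in this thread.
z_run_log = sigma_difference(-135.39, 0.09, -134.76, 0.18)

# BAR vs MBAR totals from the table posted earlier in the thread (kcal/mol).
z_table = sigma_difference(59.73252, 0.03717, 56.98357, 0.07534)
```

The run.log pair differs by roughly 3 combined standard deviations, whereas the earlier table differs by more than 30, far beyond what statistical noise alone would explain if the reported uncertainties are realistic.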

