O_Matrix from TI simulation of AMBER with the MBAR #121
Comments
The overlap matrix applies to MBAR, not TI; for TI, one needs a smooth integrand. So those are very different issues. That said, often if things are well-behaved your TI, BAR and MBAR results will agree with one another; if they disagree it's a warning sign. |
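For readers who want to turn "empty squares or low values" in the overlap matrix into a concrete check, here is a minimal sketch. The function name, the example matrix, and the 0.03 cutoff are all assumptions for illustration: the cutoff is a commonly quoted rule of thumb for neighbouring states, not something stated in this thread, and the matrix is made up rather than taken from any run.

```python
def weak_neighbor_overlaps(overlap, threshold=0.03):
    """Return (i, i+1, value) for adjacent states whose overlap is below threshold.

    `overlap` is a square overlap matrix given as a list of rows.
    `threshold` is an assumed rule-of-thumb cutoff, not a hard limit.
    """
    weak = []
    for i in range(len(overlap) - 1):
        # Use the smaller of the two off-diagonal entries linking states i and i+1.
        value = min(overlap[i][i + 1], overlap[i + 1][i])
        if value < threshold:
            weak.append((i, i + 1, value))
    return weak


# Hypothetical 4-state overlap matrix (each row sums to 1).
example = [
    [0.70, 0.28, 0.02, 0.00],
    [0.28, 0.50, 0.21, 0.01],
    [0.02, 0.21, 0.75, 0.02],
    [0.00, 0.01, 0.02, 0.97],
]

print(weak_neighbor_overlaps(example))  # [(2, 3, 0.02)]
```

A weak pair flagged this way says that MBAR has little data connecting those two states. It says nothing about whether the TI integrand is smooth, which has to be checked separately.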
The issue is that the curves are usually smooth, and BAR and TI-3 match very well, but most of the time MBAR differs by around 1.5 to 2 kcal/mol, sometimes by as much as 3. Increasing the number of lambdas and the simulation time never solves the problem. This is with UNCORR_THRESHOLD left as it is. The result changes when I use either no uncorrelated data (i=1000000, where the O_matrix is perfect but the BAR-MBAR difference is larger) or only uncorrelated data (i=1, where the BAR-MBAR difference is smaller). Any suggestions? |
You would need at least ~50 uncorrelated samples to get converged results. You should be able to force the number of uncorrelated samples to be higher, and see if the results are consistent - they might have artificially low uncertainties, but they should all be consistent - which they seem to be. It SOUNDS like for whatever reason BAR and MBAR are using slightly different data sets. If they are only using 1-3 data points for each lambda, and they are using different 1-3 data points, then of course the answers will be different. If your number of uncorrelated data samples is 1.53, then that would indicate there is something odd with the data, such as a dU/dl that is decreasing throughout the entire simulation, or that the timeseries is in some other way nonequilibrated. This will also lead to incorrect answers. You should visualize your data set in various ways to understand why it has so few uncorrelated data points. |
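To illustrate why a drifting dU/dl time series ends up with only a handful of uncorrelated samples, here is a rough sketch in plain Python. It uses a simplified statistical-inefficiency estimator (integrating the autocorrelation function up to its first non-positive value). It is not the code alchemical_analysis or pymbar actually runs, and the function names and synthetic data are assumptions for illustration only.

```python
import random


def statistical_inefficiency(series):
    """Estimate g = 1 + 2 * sum_t (1 - t/N) * C(t), stopping at the first C(t) <= 0."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    if var == 0.0:
        return 1.0  # constant series: no fluctuations to correlate
    g = 1.0
    for t in range(1, n - 1):
        # Normalized autocorrelation at lag t.
        c = sum(dev[i] * dev[i + t] for i in range(n - t)) / ((n - t) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return max(g, 1.0)


def n_uncorrelated(series):
    """Effective number of uncorrelated samples, N / g."""
    return len(series) / statistical_inefficiency(series)


# Synthetic data (hypothetical): pure noise versus noise on top of a steady drift.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
drift = [-0.01 * i + 0.1 * e for i, e in enumerate(noise)]

print("stationary noise:", round(n_uncorrelated(noise)))
print("steady drift:    ", round(n_uncorrelated(drift)))
```

The stationary series keeps most of its 1000 samples, while the drifting one collapses to a few effective samples. That is the same pattern as a very low N/N_k in the output, and plotting the raw time series usually makes the cause obvious.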
I am sorry, I made a mistake in stating the number of uncorrelated data points. For most states it is more than 50, but for a few states it is as low as 20, in which case alchemical_analysis uses the correlated data. The difference appears even when the number of uncorrelated data points is very high. |
With what you described BAR and MBAR difference, does it make sense to
consider the free energy correct when BAR and TI-3 match and MBAR is
somewhat different due to the fact that some states use correlated data?
If they don't all agree, it's usually a sign that something is suspicious
with the data. MBAR handles correlated data better than BAR.
What do you mean by 1-3 data points?
Because that's what I understood you said. You said:
Something that I just noticed is that N/N_K(number of uncorrelated) is
very very low in the output such as 1.53 out of 1894.
But for some few states, it is as low as 20 where alchemical_analysis
uses the correlated data.
That's getting to be a low enough number that statistical errors may cause
differences between methods.
…On Mon, May 3, 2021 at 9:36 AM geraili91 ***@***.***> wrote:
I am sorry, I made a mistake in saying the number of uncorrelated data
points. It is usually for most of the states more than 50, but for some few
states, it is as low as 20 where alchemical_analysis uses the correlated
data.
What do you mean by 1-3 data points?
With what you described BAR and MBAR difference, does it make sense to
consider the free energy correct when BAR and TI-3 match and MBAR is
somewhat different due to the fact that some states use correlated data?
|
I usually see this disagreement even in cases where I have enough uncorrelated data, as in the results below. How much disagreement is tolerable, or normal, for a big 100 A system (charged Asn to uncharged Asn, attached to a protein, in explicit water, with a soft-core potential)? States TI (kcal/mol) TI-CUBIC (kcal/mol) BAR (kcal/mol) MBAR (kcal/mol) |
Really impossible to say without having an example to run. Something suspicious is going on, but there are too many potential suspects and we don't have enough information about the underlying data. |
The outputs of my file also have "*****" for many of the MBAR energies at lambda 1. |
run.log
<https://github.com/MobleyLab/alchemical-analysis/files/6416927/run.log>
So in this case,
BAR = -135.39 +- 0.09
MBAR = -134.76 +- 0.18
TI = -135.91 +- 0.14
This spread is definitely a little bit high but not drastically so
(0.63/0.20 ≈ 3 sigma between BAR and MBAR). It might be explained by the
lack of large numbers of uncorrelated samples at some data points. I'd look
at the time series of dH/dL to see if anything weird is going on.
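The sigma figure quoted here can be reproduced with a few lines of arithmetic. This is a sketch that assumes the reported uncertainties can be combined in quadrature as if the estimates were independent. That is not strictly true, since BAR, MBAR and TI are computed from the same simulations, so treat the numbers as a rough guide. The function name is made up for illustration.

```python
import math


def sigma_separation(value_a, err_a, value_b, err_b):
    """Difference between two estimates in units of their combined uncertainty.

    Assumes independent errors (combined in quadrature); this is an approximation.
    """
    combined = math.hypot(err_a, err_b)
    return abs(value_a - value_b) / combined


# Values quoted from run.log in this thread (kcal/mol).
estimates = {
    "BAR": (-135.39, 0.09),
    "MBAR": (-134.76, 0.18),
    "TI": (-135.91, 0.14),
}

names = list(estimates)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        z = sigma_separation(*estimates[a], *estimates[b])
        print(f"{a} vs {b}: {z:.1f} sigma")
```

Under the same assumption, BAR vs TI also comes out at roughly 3 sigma and MBAR vs TI at about 5 sigma, so MBAR is the outlier of the three in this run.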
The outputs of my file also have "*****" for many of the MBAR energies at
lambda 1.
I'm not sure what file you are referring to. If AMBER, I don't really know
how AMBER handles the output formatting.
…On Mon, May 3, 2021 at 12:42 PM geraili91 ***@***.***> wrote:
run.log
<https://github.com/MobleyLab/alchemical-analysis/files/6416927/run.log>
The outputs of my file also have "*****" for many of the MBAR energies at
lambda 1.
|
Hi all,
I am mutating a residue of a protein using TI in AMBER, and outputting the MBAR data at the same time. In the end, I am using alchemical_analysis, but I have a problem. I know that for TI we usually use the dhdl curve as the criterion. In my simulations the curves are good for some parts (charging and decharging), and for vdW there are kinks, but the O_Matrix always has some empty squares or low values. It seems that it never works. My question is: how reliable is the O_matrix as a validation in this case? The standard deviation across three identical simulations is close to 1 kcal/mol. I just wanted to know which criteria are best for judging that the free energy values are reliable.
Best