basis set computation doesn't seem to work right #8

franknoe · 2015-07-30T13:55:37Z

This code:

import variational
from variational.basissets.ramachandran import RamachandranBasis
alabasis = RamachandranBasis('A', radians=False)
import numpy as np
atraj = np.array([[-120, 60],[120, 120]])
alabasis.map(atraj)

leads to this output:

array([[ 1.        ,  0.2007158 , -0.79413052],
       [-0.        ,  0.        , -0.        ]])

which can't be right. The last row shouldn't be zero. At the very least, the first column must always be 1.0.

fvitalini · 2015-07-30T14:27:49Z

Hi Frank,

No, the output does make sense.
The basis functions of the capped amino acids are evaluated on a 36x36 grid, but not all of the microstates are actually populated.

ac_a_nhme_rev_0
https://cloud.githubusercontent.com/assets/13469315/8985400/7e792fd4-36d7-11e5-99e5-8ab21dd7fb85.jpg

The microstate corresponding to the phi/psi combination [120, 120] is simply never visited.

What I have tested were the functions within ramachandran.py.
I checked that, given an np-array containing the phi/psi time series of all residues, the function constructs a matrix containing the trajectory projected onto the basis functions, and that it produces the same results as my old code.

Francesca
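
To illustrate the explanation above, here is a minimal sketch of a grid-based lookup. It does not reproduce the actual implementation in ramachandran.py. The 10-degree bin width is an assumption that follows from a 36x36 grid over [-180, 180); the helper names `angle_to_bin` and `lookup_basis` and the toy table are hypothetical and only serve the illustration.

```python
import numpy as np

N_BINS = 36                 # 36 x 36 grid, as described above
BIN_WIDTH = 360 // N_BINS   # 10 degrees per bin (assumes angles in [-180, 180))


def angle_to_bin(angle_deg):
    """Map angles in degrees to bin indices 0..35 (hypothetical helper)."""
    shifted = (np.asarray(angle_deg, dtype=float) + 180.0) % 360.0
    return np.floor(shifted / BIN_WIDTH).astype(int)


def lookup_basis(table, phi_psi_deg):
    """Evaluate tabulated basis functions along a phi/psi trajectory.

    table has shape (36, 36, n_functions); unvisited microstates hold zeros.
    """
    phi_psi_deg = np.asarray(phi_psi_deg)
    i = angle_to_bin(phi_psi_deg[:, 0])
    j = angle_to_bin(phi_psi_deg[:, 1])
    return table[i, j, :]


# Toy table: only the microstate containing (-120, 60) is "populated".
table = np.zeros((N_BINS, N_BINS, 3))
table[angle_to_bin(-120), angle_to_bin(60), :] = [1.0, 0.2, -0.8]

traj = np.array([[-120, 60], [120, 120]])
values = lookup_basis(table, traj)
print(values)
```

With such a table, any frame that falls into an unvisited microstate returns an all-zero row, including the first column.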

fnueske · 2015-07-30T14:42:54Z

This is a non-trivial point, isn't it? The single amino-acid eigenvectors
are undefined for unpopulated states, but in principle these states
might show up in simulations of more complicated systems. Francesca,
have you encountered this before?
We can at least modify the first eigenvector to be equal to one everywhere.
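
One way to read this suggestion, as a sketch: treat the first basis function as the constant function and overwrite its column after the lookup. The function name `force_constant_first_column` is hypothetical, and the input is simply the output array reported in the issue description, not a call into variational.

```python
import numpy as np


def force_constant_first_column(basis_values):
    """Return a copy in which column 0 (the first eigenvector) is 1 in every row."""
    fixed = np.array(basis_values, dtype=float, copy=True)
    fixed[:, 0] = 1.0
    return fixed


# Output reported at the top of this issue.
mapped = np.array([[1.0, 0.2007158, -0.79413052],
                   [-0.0, 0.0, -0.0]])
fixed = force_constant_first_column(mapped)
print(fixed)
```

This only repairs the first column; the remaining columns of an unpopulated row stay zero.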

franknoe · 2015-07-30T14:53:51Z

I agree

Prof. Dr. Frank Noe
Head of Computational Molecular Biology group
Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354
Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

franknoe · 2015-07-30T14:56:44Z

Please still provide an example (a trajectory chunk) and demonstrate the use of:

Single Ramachandran Basis
Product Basis
Estimating the correlation matrix from the result
Solving the generalized eigenproblem

Each of these should exclusively use code from variational, and each should be just a few lines of code.

I guess that's to both Francesca and Feliks
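
As a rough illustration of the last three steps, independent of the variational API (whose function names are not shown in this thread), the following NumPy-only sketch builds a product basis from two per-residue basis evaluations, estimates instantaneous and time-lagged correlation matrices, and solves the generalized eigenproblem. The synthetic input data, the lag time, and all function names are assumptions made for illustration.

```python
import numpy as np


def product_basis(chi_a, chi_b):
    """All pairwise products of two basis evaluations: (T, m), (T, n) -> (T, m*n)."""
    n_frames = chi_a.shape[0]
    return (chi_a[:, :, None] * chi_b[:, None, :]).reshape(n_frames, -1)


def correlation_matrices(chi, lag):
    """Instantaneous (c0) and symmetrized time-lagged (ct) correlation matrices."""
    x, y = chi[:-lag], chi[lag:]
    n = x.shape[0]
    c0 = (x.T @ x + y.T @ y) / (2.0 * n)
    ct = (x.T @ y + y.T @ x) / (2.0 * n)
    return c0, ct


def generalized_eigenproblem(c0, ct):
    """Solve ct v = lambda c0 v for symmetric ct and positive definite c0."""
    chol = np.linalg.cholesky(c0)
    chol_inv = np.linalg.inv(chol)
    a = chol_inv @ ct @ chol_inv.T          # equivalent symmetric standard problem
    evals, w = np.linalg.eigh((a + a.T) / 2.0)
    order = np.argsort(evals)[::-1]         # largest eigenvalue first
    return evals[order], chol_inv.T @ w[:, order]


# Synthetic, time-correlated stand-in for basis functions evaluated on a trajectory.
rng = np.random.default_rng(0)
n_frames = 2000
noise = rng.standard_normal((n_frames, 3))
signal = np.zeros((n_frames, 3))
for t in range(1, n_frames):
    signal[t] = 0.9 * signal[t - 1] + noise[t]

ones = np.ones((n_frames, 1))               # constant first basis function
chi_a = np.hstack([ones, signal[:, :2]])    # "residue A": 3 functions
chi_b = np.hstack([ones, signal[:, 2:]])    # "residue B": 2 functions

chi = product_basis(chi_a, chi_b)
c0, ct = correlation_matrices(chi, lag=5)
evals, evecs = generalized_eigenproblem(c0, ct)
print(evals)
```

Because the product basis contains the constant function, the leading eigenvalue of this symmetrized estimate is 1. If the first column were zero for some frames, the constant function would no longer be in the basis and that property would be lost.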

fnueske · 2015-07-30T15:01:27Z

Ok, but today I don't have the time. I'll try tomorrow, ok?

fvitalini · 2015-07-30T15:01:49Z

Hi,

The microstates where the first eigenvector is zero are states that are not part of the largest connected set in the MSM of the amino acid.
In theory, the same amino acid in a different sequence might indeed have a "slightly" different distribution.
However, the hypothesis underlying this basis set definition is that the differences in the dynamics of X between Ac-X-NHMe and Y-X-Z should be irrelevant.
The basis functions I used for the paper have zeros for those microstates that are not visited by the trajectory.

I have already encountered a case where there was an obvious difference between the capped amino acid and the amino acid in the sequence.
For example, alanine's distribution in Ac-AP-NHMe is very different from that in Ac-A-NHMe. We ended up defining a new basis function in that case.
I haven't checked whether any of the other amino acids populate states that are not populated in the corresponding residue-based functions, but this has not been an issue for me so far.

I will provide an example of how to use the functions "Single Ramachandran Basis" and "Product Basis". Is it OK if I add a folder, e.g. EXAMPLE, and inside provide scripts and files to try the functions?

Francesca
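
To make the notion of the largest connected set concrete, here is a small NumPy sketch. It is not the code used to build the basis functions: `largest_connected_set` is a hypothetical helper, the count matrix is toy data, and the transitive-closure approach is only practical for small numbers of states.

```python
import numpy as np


def largest_connected_set(counts):
    """Indices of the largest strongly connected set of a transition count matrix."""
    counts = np.asarray(counts)
    n = counts.shape[0]
    reach = (counts > 0) | np.eye(n, dtype=bool)
    while True:
        # Concatenate known paths until no new reachable pairs appear.
        step = (reach.astype(int) @ reach.astype(int)) > 0
        new = reach | step
        if np.array_equal(new, reach):
            break
        reach = new
    mutual = reach & reach.T            # i reaches j and j reaches i
    sizes = mutual.sum(axis=1)
    return np.flatnonzero(mutual[np.argmax(sizes)])


counts = np.array([[5, 2, 0, 0],
                   [3, 4, 0, 0],
                   [0, 0, 0, 0],    # never visited
                   [1, 0, 0, 0]])   # left once, never re-entered
lcs = largest_connected_set(counts)
print(lcs)
```

States outside the returned set (here states 2 and 3) are the ones for which the tabulated eigenvectors would be zero.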

franknoe · 2015-07-30T15:03:00Z

sure

franknoe · 2015-07-30T15:07:44Z

On 30/07/15 at 17:01, fvitalini wrote:

> The microstates where the first eigenvector is zero are states that
> are not part of the largest connected set in the MSM of the amino acid.
> Theoretically it is true that the same amino acid in a different
> sequence might have a “slightly” different distribution.
> However, the hypothesis at the basis of such basis set definition is
> that the differences in the dynamics of X between Ac-X-NHMe and Y-X-Z
> should be irrelevant.

If you encounter a new system that visits points that were not
visited in your parametrization, you still need to do something
reasonable with them. At the very least the first column must be 1, otherwise
subsequent algorithms such as Feliks' will simply break down.
For the other columns, too, I think we have to do some reasonable
interpolation.

I'm sure that in large peptides or proteins you will not only have
slight differences, but you can lock amino acids in phi/psi values that
are practically forbidden for separate amino acids. So this is an issue.
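
One possible form of such an interpolation, sketched here under assumptions: fill each unpopulated grid cell with the values of the nearest populated cell, measuring distance periodically in phi and psi, and set the first column to 1 everywhere. Nearest-neighbour filling is a choice made for this illustration, not something agreed on in this thread; `fill_unpopulated` and the toy table are hypothetical.

```python
import numpy as np


def fill_unpopulated(table, populated):
    """Copy values from the nearest populated cell into unpopulated cells.

    table: (n, n, k) tabulated basis functions; populated: (n, n) boolean mask
    with at least one True entry.
    """
    n = table.shape[0]
    filled = table.copy()
    pop_idx = np.argwhere(populated)
    for i, j in np.argwhere(~populated):
        d = np.abs(pop_idx - np.array([i, j]))
        d = np.minimum(d, n - d)                 # periodic distance per axis
        nearest = pop_idx[np.argmin((d ** 2).sum(axis=1))]
        filled[i, j] = table[nearest[0], nearest[1]]
    filled[:, :, 0] = 1.0                        # constant first basis function
    return filled


n = 36
table = np.zeros((n, n, 3))
populated = np.zeros((n, n), dtype=bool)
populated[6, 24] = populated[10, 14] = True
table[6, 24] = [1.0, 0.2, -0.8]
table[10, 14] = [1.0, -0.5, 0.3]

filled = fill_unpopulated(table, populated)
print(filled[30, 30])
```

Whether copying from the nearest populated cell is physically sensible for strongly forbidden regions is an open question; a smoother scheme could replace the inner loop without changing the interface.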

> I will provide an example on how to use the functions "Single
> Ramachandran Basis" and "Product Basis". Is it ok if I add a folder,
> e.g. EXAMPLE, and inside provide scripts and files to try the functions?

OK, add such a folder, named examples, at the top level of the repository.
If you add data, again make sure to use binary data, and ideally compressed.
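
For instance, NumPy's `np.savez_compressed` writes arrays to a compressed binary `.npz` archive. The file name, the array key, and the toy data below are placeholders, not names used in the repository.

```python
import os
import tempfile

import numpy as np

# Placeholder phi/psi trajectory in degrees, shape (n_frames, 2)
phi_psi = np.array([[-120.0, 60.0], [120.0, 120.0]])

# Write to a temporary directory so the sketch leaves no files behind in the repo.
path = os.path.join(tempfile.mkdtemp(), "example_traj.npz")
np.savez_compressed(path, phi_psi=phi_psi)

# Load the archive again and read the array back by its key.
with np.load(path) as data:
    restored = data["phi_psi"]
print(restored)
```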
