Precomputed lambda max #15

Open
duchesnay opened this issue Feb 14, 2017 · 8 comments

Comments

@duchesnay
Collaborator

To avoid re-computing lambda max while searching over a grid of parameters, we would like to provide pre-computed values of lambda max for both the data (X) and the linear operator (A).

First solution: provide lambda_max in the constructors of Estimators and Functions. The drawback is that this requires large modifications to the API.

Second solution: provide helper functions, i.e. precompute(X) and precompute(A), that compute lambda max (and possibly other quantities) and store them on the numpy objects X and A. This way, lambda max will be propagated inside the loss and TV functions. There is no modification of the API, just a small addition of code in the constructors of losses (e.g. RidgeLogisticRegression) and penalties (e.g. TotalVariation) to check for the existence of such a lambda max and store it. Potentially, the precompute() function could add a method such as lambda_max() to the X and A numpy objects instead of a simple additional attribute.
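A minimal sketch of this second solution; the names `precompute` and `lambda_max` are illustrative, not the actual pylearn-parsimony API. One wrinkle: plain ndarrays reject new attributes, so the helper has to return a trivial view subclass that can carry the cached value:

```python
import numpy as np

class _ArrayWithCache(np.ndarray):
    """Plain ndarrays reject new attributes, so the hypothetical
    precompute() helper returns a trivial view subclass instead."""
    pass

def precompute(X):
    # Cache the largest eigenvalue of X'X, i.e. the squared largest
    # singular value of X.
    Xc = X.view(_ArrayWithCache)
    Xc.lambda_max = np.linalg.norm(X, ord=2) ** 2
    return Xc

X = precompute(np.arange(12, dtype=float).reshape(4, 3))
# A loss/penalty constructor could then check:
#     lmax = getattr(X, "lambda_max", None)
# and fall back to computing it when the attribute is absent.
```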

@tomlof
Collaborator

tomlof commented Feb 14, 2017

This is a good idea!

Perhaps an alternative to the second option is a third option: create an extended matrix class under parsimony.utils.linalgs that inherits from numpy.ndarray and mirrors all its methods and fields (with appropriate use of the get-/setattr and -attribute methods, passing everything on to the parent class), and that can expose the singular values of a matrix through e.g. a method get_singular_values(int), where the argument is the index of the singular value. E.g., get_singular_values(1) would return the largest singular value, get_singular_values(-1) the smallest, and get_singular_values() all singular values of the matrix. Changes to the matrix would of course have to invalidate the computed singular value(s).

Since lambda max is really a property of A'*A, and not of A, it would make more sense to use the largest singular value and then square it manually when needed, in order to obtain the lambda max.

This way, everything else in pylearn-parsimony would remain completely unchanged with respect to the extended matrix class, and only minimal changes would be needed where we use it: we check whether a given matrix is of the extended type and, if so, use the precomputed values. If it is not of the extended matrix type, we would compute lambda max as usual.

Further, should we need any other properties of matrices in the future, we can simply add them to the extended matrix class.
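A rough sketch of what such an extended matrix class could look like. The class and method names are assumptions, and a real implementation would have to intercept many more mutation paths than `__setitem__`:

```python
import numpy as np

class ExtendedMatrix(np.ndarray):
    """Caches the singular values and recomputes them lazily when
    the contents change (only element assignment is caught here)."""

    def __new__(cls, input_array):
        obj = np.asarray(input_array).view(cls)
        obj._singular_values = None
        return obj

    def __array_finalize__(self, obj):
        self._singular_values = getattr(obj, "_singular_values", None)

    def __setitem__(self, key, value):
        self._singular_values = None  # invalidate the cache on writes
        super().__setitem__(key, value)

    def get_singular_values(self, i=None):
        if self._singular_values is None:
            self._singular_values = np.linalg.svd(np.asarray(self),
                                                  compute_uv=False)
        if i is None:
            return self._singular_values
        # i = 1 is the largest, i = -1 the smallest (SVD is descending).
        return self._singular_values[i - 1 if i > 0 else i]
```

Per the remark below about A'*A, a caller wanting lambda max would use `get_singular_values(1) ** 2`.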

@duchesnay
Collaborator Author

+1 for your proposition.

If I get your point correctly, you suggest using a "decorator" pattern rather than inheritance. This decorator would wrap any array using the get-/setattr mechanism, and we would then define the new get_singular_values(*) methods within it.

I agree with this idea, since the decorator pattern will work for both sparse and non-sparse arrays. What should such a decorator be called? Is there a general mechanism for extending any array? I don't like a name as generic as WrapArray. Or should we highlight the fact that we use it to store precomputed quantities? It could be StoreArray (which is still general). Any ideas?
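For comparison with the inheritance approach, here is a decorator-pattern sketch (the class name is just a placeholder). Composition means the same wrapper works for dense and scipy.sparse arrays alike:

```python
import numpy as np

class PrecomputedArray(object):
    """Wraps any array-like object, forwards every attribute access
    to it, and adds get_singular_values() on top."""

    def __init__(self, array):
        self._array = array
        self._singular_values = None

    def __getattr__(self, name):
        # Only called when `name` is not found on the wrapper itself,
        # so shape, dot, toarray, ... all pass through to the array.
        return getattr(self._array, name)

    def get_singular_values(self, i=None):
        if self._singular_values is None:
            A = self._array
            # Densify sparse matrices for this simple sketch.
            A = A.toarray() if hasattr(A, "toarray") else np.asarray(A)
            self._singular_values = np.linalg.svd(A, compute_uv=False)
        if i is None:
            return self._singular_values
        return self._singular_values[i - 1 if i > 0 else i]
```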

@tomlof
Collaborator

tomlof commented Mar 6, 2017

I'm not sure I understand how to use decorators for this. Could you please explain how we could do it?

I thought we'd just inherit from numpy.ndarray, add the methods we need, and catch changes to the matrix contents. Adding a new custom type is perhaps not the best solution, however, so other ways to do it would be very useful! Another option would be to have some function in utils (something like create_extended_matrix) that takes as input an array (of any numpy array type) and adds the things we need. Then we'd just check whether the get_singular_values method is available when we need it: use it if it is, and compute the lambdas as we do today if it isn't. The benefit of this is that the matrix is still of a numpy type.

We need a way to know that the contents of the matrix have changed, however, in order to invalidate the computed lambda(s). That's why I proposed wrapping the attribute getters and setters. This may be tedious, though, since there are potentially many ways to change the contents of a numpy array such that the singular values change. Perhaps there are better ways to detect it? I don't know the internals of numpy arrays well enough; perhaps there is already a way to know whether an array has changed? Do the internal buffer/memoryview have such a feature, i.e. a way to see whether the buffer has been written to? We would still need to know whether the shape has changed, though, and there are possibly other ways to invalidate the matrix that do not involve actually changing the elements' values.
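One pragmatic way to detect content changes without intercepting every setter (a sketch; numpy does not provide this out of the box) is to fingerprint the buffer together with the shape, and recompute the singular values whenever the digest differs:

```python
import hashlib
import numpy as np

def fingerprint(X):
    """Hash the raw buffer plus the shape. An O(n) pass per check,
    but far cheaper than recomputing an SVD; a cached (digest,
    singular_values) pair is invalidated when the digest changes."""
    h = hashlib.sha1()
    h.update(str(X.shape).encode("ascii"))
    h.update(np.ascontiguousarray(X).tobytes())
    return h.hexdigest()
```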

@duchesnay
Collaborator Author

duchesnay commented Apr 11, 2017

The problem can be decomposed into two parts: (1) the API for retrieving pre-computed values within the core of the library, and (2) the mechanism for storing those values.

  1. This is the most important part, since it impacts the core of the library. However, it is quite simple if we choose your proposition: within the parsimony code, given A (the linear operator) or X (the data), we check whether a call to the method get_singular_values(1) fails. If it does not, we use the returned value.

  2. The exact implementation of the storage mechanism could be modified later without touching the internals of parsimony, as long as we respect item 1. However, the object X or A is not always an array: in the Nesterov case, A is a list, and for that case we already have a LinearOperator class (in utils.linalgs) that inherits from list. Therefore I will simply add a get_singular_values() method to LinearOperator, which I will rename LinearOperatorNesterov to make it clearer. In the GraphNet case, A is also a linear operator, but one that is a single large sparse matrix.
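A self-contained sketch of the LinearOperatorNesterov design described above, simplified: it uses a dense SVD for clarity, whereas the library would use a sparse or power-iteration method on the stacked operator:

```python
import numpy as np
import scipy.sparse as sparse

class LinearOperatorNesterov(list):
    """A list of sparse blocks that carries its own (lazily
    computed) singular values."""

    def __init__(self, blocks):
        super().__init__(blocks)
        self._singular_values = None

    def get_singular_values(self, nb=None):
        if self._singular_values is None:
            # Singular values of the vertically stacked operator A.
            A = sparse.vstack(self).toarray()
            self._singular_values = np.linalg.svd(A, compute_uv=False)
        if nb is None:
            return self._singular_values
        return self._singular_values[nb]
```

Because it is still a list, every piece of code that iterates over the blocks of A keeps working unchanged; only code that wants the precomputed values needs to probe for get_singular_values().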

@tomlof
Collaborator

tomlof commented Apr 24, 2017

This sounds like a great solution!

@duchesnay
Collaborator Author

parsimony.utils.linalgs.LinearOperatorNesterov now implements get_singular_values(). Note that a LinearOperatorNesterov can be serialized to disk together with its singular values.

To get get_singular_values() for GraphNet, I simply stack the Nesterov operator used for TV, then attach a get_singular_values() method to the stacked matrix:

```python
import functools

import scipy.sparse as sparse
import parsimony.functions.nesterov.tv as nesterov_tv

Atv = nesterov_tv.linear_operator_from_mesh(cor, tri, mask, calc_lambda_max=True)
Agn = sparse.vstack(Atv)
Agn.singular_values = Atv.get_singular_values()

def get_singular_values(self, nb=None):
    return self.singular_values[nb] if nb is not None else self.singular_values

Agn.get_singular_values = functools.partial(get_singular_values, Agn)
```

This works fine. However, we need to integrate these few lines of code into the library. I see two possible designs:

  1. Add a to_graphnet() method to LinearOperatorNesterov that transforms the Nesterov linear operator into the one required for the GraphNet penalty.
    The idea is to concentrate all the helper functions, such as linear_operator_from_mesh or linear_operator_from_mask, in a single place, nesterov.tv; the linear operators for the other penalties are just trivial transformations of the one for TV.

  2. A better alternative would be to concentrate all the helper functions in utils.linalgs, with an optional argument penalty in {"tv", "gn"}.

The second choice requires a bit of refactoring, but it provides a clearer API.
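The second design could look roughly like this; the 1-D first-difference construction below is a toy stand-in for the real linear_operator_from_* helpers, and the function name is an assumption:

```python
import numpy as np
import scipy.sparse as sparse

def _tv_blocks_1d(p):
    # Toy stand-in for the existing TV construction: first differences
    # of a 1-D signal of length p.
    main = -np.ones(p - 1)
    upper = np.ones(p - 1)
    D = sparse.diags([main, upper], [0, 1], shape=(p - 1, p))
    return [sparse.csr_matrix(D)]

def linear_operator(p, penalty="tv"):
    """Hypothetical unified helper: a single entry point with an
    optional `penalty` argument in {"tv", "gn"}."""
    Atv = _tv_blocks_1d(p)
    if penalty == "tv":
        return Atv                  # Nesterov TV: a list of blocks
    elif penalty == "gn":
        return sparse.vstack(Atv)   # GraphNet: one large sparse matrix
    raise ValueError("Unknown penalty: %r" % (penalty,))
```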

@tomlof
Collaborator

tomlof commented Jun 1, 2017

This sounds great!

Perhaps a third option is to inherit from LinearOperatorNesterov to obtain LinearOperatorTV, which is just a copy of its parent, and a second child class LinearOperatorGN, that performs the necessary steps to obtain the GraphNet operator?
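A compact sketch of this third option, with a minimal stand-in for the parent class (the real LinearOperatorNesterov lives in parsimony.utils.linalgs and caches its singular values):

```python
import numpy as np
import scipy.sparse as sparse

class LinearOperatorNesterov(list):
    # Minimal stand-in for the existing parent class.
    def get_singular_values(self, nb=None):
        s = np.linalg.svd(sparse.vstack(self).toarray(), compute_uv=False)
        return s if nb is None else s[nb]

class LinearOperatorTV(LinearOperatorNesterov):
    """TV is just the parent under a clearer name."""
    pass

class LinearOperatorGN(LinearOperatorNesterov):
    """GraphNet: built from the TV blocks by stacking them into a
    single sparse matrix."""
    def __init__(self, tv_operator):
        super().__init__([sparse.vstack(tv_operator)])
```

Stacking preserves A'A, so the two operators share the same singular values and the GraphNet child can be derived from the TV one for free.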

@duchesnay
Collaborator Author

+1
