Precomputed lambda max #15
Comments
This is a good idea! Perhaps an alternative to the second option is a third option: create an extended matrix class under parsimony.utils.linalgs that inherits from numpy.ndarray, mirrors all of its methods and fields (with appropriate use of `__getattr__`/`__setattr__` and `__getattribute__`, which just pass everything on to the parent class), and exposes the singular values of a matrix through e.g. a method `get_singular_values(int)`, where the argument is the index of the singular value. E.g., `get_singular_values(1)` would return the largest singular value, `get_singular_values(-1)` would return the smallest, and `get_singular_values()` would return all singular values of the matrix. Changes to the matrix would of course have to invalidate the computed singular value(s).

Since lambda max is really a property of A'*A, and not of A, it would make more sense to cache the largest singular value and square it manually when needed in order to obtain the lambda max. This way, the use of the extended matrix class would be completely unchanged with respect to everything else in pylearn-parsimony, and would result in minimal changes when we need it. When we do, we can check whether a given matrix is of the extended type and, if so, use the precomputed values. If it is not, we would compute the lambda max as usual. Further, should we need any other properties of matrices in the future, we could simply add them to the extended matrix class.
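The extended-matrix idea above can be sketched as follows. The class name and caching details are illustrative, not part of the library; only element assignment is intercepted here, so other mutation paths (in-place operators, resizing, views) would still need handling, which is exactly the difficulty raised in this thread:

```python
import numpy as np

class CachedSVDArray(np.ndarray):
    """Illustrative ndarray subclass that lazily computes and caches
    the singular values of the matrix (names are hypothetical)."""

    def __new__(cls, input_array):
        return np.asarray(input_array).view(cls)

    def __array_finalize__(self, obj):
        self._singular_values = None  # new instances/views start uncached

    def __setitem__(self, key, value):
        self._singular_values = None  # any element write invalidates cache
        super(CachedSVDArray, self).__setitem__(key, value)

    def get_singular_values(self, index=None):
        """index=1 -> largest, index=-1 -> smallest, None -> all of them."""
        if self._singular_values is None:
            self._singular_values = np.linalg.svd(np.asarray(self),
                                                  compute_uv=False)
        s = self._singular_values
        if index is None:
            return s
        return s[index - 1] if index >= 1 else s[index]
```

Lambda max would then be obtained as `A.get_singular_values(1) ** 2`, following the squaring convention described above.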
+1 for your proposition. If I understand your point correctly, you suggest using a "decorator" pattern rather than inheritance. This decorator would wrap any array using the `__getattr__`/`__setattr__` mechanism, and we would define the new `get_singular_values(*)` methods within this decorator. I agree with this idea, since the decorator pattern will work for both sparse and non-sparse arrays. What could the name of such a decorator be? Is there a general mechanism to extend any array? I don't like a too-generic name such as WrapArray. Or should we highlight the fact that we use it to store precomputed quantities? It could be StoreArray (which is still general). Any ideas?
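A minimal sketch of this wrapper idea, under the assumption that delegation via `__getattr__` is enough for the library's read-only uses of the array (the class name StoreArray is one of the candidates floated above; the sparse branch uses `scipy.sparse.linalg.svds`, which can only compute k < min(shape) singular values):

```python
import numpy as np
from scipy.sparse import issparse
from scipy.sparse.linalg import svds

class StoreArray(object):
    """Hypothetical wrapper ("decorator" in the design-pattern sense):
    forwards everything to the wrapped array and keeps a cache of
    precomputed quantities.  Works for dense and sparse arrays alike."""

    def __init__(self, array):
        self._array = array
        self._cache = {}

    def __getattr__(self, name):
        # Only called for names not found on StoreArray itself, so shape,
        # dtype, dot, ... are transparently taken from the wrapped array.
        return getattr(self._array, name)

    def get_singular_values(self, index=None):
        if "sv" not in self._cache:
            if issparse(self._array):
                # svds requires k < min(shape); good enough for a sketch.
                k = min(self._array.shape) - 1
                s = svds(self._array, k=k, return_singular_vectors=False)
                s = np.sort(s)[::-1]  # descending, like np.linalg.svd
            else:
                s = np.linalg.svd(self._array, compute_uv=False)
            self._cache["sv"] = s
        s = self._cache["sv"]
        if index is None:
            return s
        return s[index - 1] if index >= 1 else s[index]
```

Note that this sketch does not yet invalidate the cache on writes to the wrapped array; that is the part `__setattr__`-style interception would have to cover.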
I'm not sure I understand how to use decorators for this. Could you please explain how we could do it? I thought we'd just inherit from numpy.ndarray.

We need a way to know that the contents of the matrix have changed, however, in order to invalidate the computed lambda(s). That's why I proposed to wrap the attribute getters and setters. This may be tedious, however, since there are potentially many ways to change the contents of a numpy array such that the singular values change. Perhaps there are better ways to detect it? I don't know the internals of numpy arrays well enough; perhaps there is already a way to know if an array has changed? Do the internal buffer/memoryview have such features, i.e. a way to see if the buffer has been written to? Though we would still need to know whether the shape has changed, and there are possibly other ways to invalidate the matrix that do not involve actually changing the elements' values.
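On the question of whether numpy already tracks modifications: as far as I know it does not record writes, but it does offer a flag to forbid them, which sidesteps cache invalidation entirely in the common case where the data matrix is never mutated after loading:

```python
import numpy as np

X = np.arange(6, dtype=float).reshape(2, 3)
X.flags.writeable = False            # freeze the underlying buffer

# Now it is safe to compute and cache quantities derived from X:
sigma_max = np.linalg.svd(X, compute_uv=False)[0]
lambda_max = sigma_max ** 2          # largest eigenvalue of X'X

try:
    X[0, 0] = 1.0                    # any write is rejected ...
except ValueError:
    pass                             # ... so the cached value stays valid
```

Views taken of a read-only array inherit the flag and cannot be made writable again while the base is frozen, so this covers the aliasing concern as well; reshaping, however, produces a new array and would still need a fresh cache.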
The problem can be decomposed into two parts: (1) the API to get back precomputed values within the core of the library, and (2) the mechanism to store those values.
This sounds like a great solution!
To get the lambda max, only a few lines of code are needed, and this works fine. However, we need to integrate these few lines of code into the library. I see two possible designs. The second choice requires a bit of refactoring, but it provides a clearer API.
This sounds great! Perhaps a third option is to inherit from numpy.ndarray.
+1
To avoid the re-computation of lambda max while searching a grid of parameters, we would like to provide pre-computed values of lambda max for both the data (X) and the linear operator (A).

First solution: provide lambda_max in the constructors of Estimators and Functions. The drawback is the large modification of the API.

Second solution: provide helper functions, i.e. `precompute(X)` and `precompute(A)`, that compute the lambda max and eventually other quantities, and store them on the numpy objects `X` and `A`. This way, the lambda max will be propagated inside the loss and TV functions. There is no modification of the API, just a simple addition of code in the constructors of losses (e.g. RidgeLogisticRegression, etc.) and penalties (e.g. TotalVariation) to check the existence of such a lambda max and to use it. Potentially, the `precompute()` function could add a decorator such as `lambda_max()` to the `X` and `A` numpy objects instead of a simple additional attribute.
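A minimal sketch of the second solution, with one caveat that affects the design: plain numpy arrays reject new attributes, so storing `lambda_max` on `X` requires view-casting to a trivial subclass (the helper names follow the discussion; `get_lambda_max` stands in for the check a loss/penalty constructor would perform):

```python
import numpy as np

class _Precomputed(np.ndarray):
    """Trivial subclass: unlike a plain ndarray, its instances accept
    arbitrary new attributes such as lambda_max."""
    pass

def precompute(X):
    """Attach lambda max (the squared largest singular value of X, i.e.
    the largest eigenvalue of X'X) to the array and return it."""
    Xp = np.asarray(X).view(_Precomputed)
    Xp.lambda_max = np.linalg.norm(X, 2) ** 2  # spectral norm, squared
    return Xp

def get_lambda_max(X):
    """What a loss/penalty constructor would do: reuse the precomputed
    value when present, fall back to computing it otherwise."""
    lam = getattr(X, "lambda_max", None)
    return lam if lam is not None else np.linalg.norm(X, 2) ** 2
```

Note that the attribute does not automatically propagate to views or copies of `Xp` (that would require an `__array_finalize__`), which is a reason to prefer the richer extended-matrix class discussed earlier in the thread.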