
Block Matrix Linear Operator #67

Open · wants to merge 26 commits into main
Conversation

corwinjoy
Contributor

Idea

Represent [TN, TM] tensors by T x T blocks of N x M lazy tensors. While block matrices are currently supported, an efficient representation exists only when the structure over the T dimensions is diagonal.

Pitch

Add a block linear operator class that keeps track of the [T, T] block structure, represented as T^2 lazy tensors of the same shape, and implement matrix multiplication between block matrices via the appropriate linear operations on the blocks.
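The block product itself can be sketched as follows. This is a minimal NumPy illustration of the idea only: `block_matmul` is a hypothetical name, and the actual proposal would apply the same recursion to lazy LinearOperators rather than dense arrays.

```python
import numpy as np

def block_matmul(A_blocks, B_blocks):
    """Multiply two T x T block matrices block by block:
    C[i][j] = sum_k A[i][k] @ B[k][j].
    Blocks are dense arrays here; the proposal applies the same
    recursion to lazy linear operators."""
    T = len(A_blocks)
    return [[sum(A_blocks[i][k] @ B_blocks[k][j] for k in range(T))
             for j in range(T)]
            for i in range(T)]
```

Assembling the blocks with `np.block` and comparing against the dense product is one way to check that the block recursion is correct.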

Previous Discussion

Issue #54

Additional Considerations

In pursuing this, it became clear that the base test class checks many operations beyond what is required to create a LinearOperator. I propose refactoring the test class into required/core operations and optional operations. For now, I have created a new core test class, CoreLinearOperatorTestCase, and have marked what was excluded by commenting out the relevant code. This split could also use a review for accuracy.

@corwinjoy
Contributor Author

@Balandat After more discussion with @hughsalimbeni and @dannyfriar here is our initial proposal to expand the library as per the previous discussion.
We think it represents a good way to generalize the operator structure to capture nested operations.
We also look forward to any suggestions you may have.
Also tagging: @gpleiss, @jacobrgardner

@corwinjoy
Contributor Author

Hi, I just wanted to check in and see if you had time to take a look at our proposal and if you had any thoughts? Thanks again for the feedback so far!
@Balandat @gpleiss

@gpleiss
Member

gpleiss commented Jul 15, 2023

Sorry for the delay @corwinjoy - this PR is on my todo list for Monday.

@corwinjoy
Contributor Author

@gpleiss Thanks for all the great feedback! I believe I have addressed all these and would appreciate a second look when you have time.

@gpleiss
Member

gpleiss commented Jul 27, 2023

@corwinjoy after playing around with this PR some more, and dealing with the typeguard errors, I made 2 commits that change some of the internals. The first is hopefully not too controversial; the second one might be :)

  1. Rearranging the logic of _matmul. The purpose of the _matmul function is to define a matrix-vector product that can be used by iterative algorithms (e.g. CG), so, upon further reflection, I think that _matmul should only accept a Tensor rhs, and it should output a Tensor as well.

I rearranged some of the logic, in what I hope won't affect anything that you've already written.

  • The previous base case of _matmul now lives in a private function called _matmul_two_block_matrix_linear_operators, which takes in two BlockMatrixLOs and outputs a BlockMatrixLO
  • The public matmul function now has logic to call this new private function if other is an appropriate BlockMatrixLO. Otherwise it calls the _matmul function.
  2. Changing the constructor (slightly more controversial). There were some typing issues with the constructor, since it was taking in a list of lists of linear operators. The arguments to all other LinearOperators are expected to be linear operators or Tensors. (I noticed that you had to overwrite the representation function to accommodate this.) There are unfortunately a lot of (necessary) hacks throughout LinearOperator that do expect the arguments to be LinearOperators and Tensors, and I'm a bit nervous that there could be some unintended bugs from not adhering to this pattern. (Historically, these bugs emerge as gradient issues or type-casting issues.)

My proposed solution: the user passes in a flattened list of T^2 linear operators into the constructor. It's a little more unintuitive for the user (passing in a list of lists is more natural, I agree), but it actually made some of the code quite a bit simpler.
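Put together, points 1 and 2 might look roughly like the toy sketch below. This is plain NumPy on dense blocks: `BlockMatrixLO`, the flat row-major storage, and the method names are illustrative assumptions for the discussion, not the PR's actual code.

```python
import numpy as np

class BlockMatrixLO:
    """Toy model of the proposed operator: a flat, row-major list of
    T*T equally-shaped blocks (dense arrays here, lazy LOs in the PR)."""

    def __init__(self, blocks, T):
        assert len(blocks) == T * T
        self.blocks, self.T = list(blocks), T

    def _matmul(self, rhs):
        # Tensor in, Tensor out: split rhs into T row-chunks, form the
        # block-row dot products, and stack the results back together.
        T = self.T
        chunks = np.split(rhs, T, axis=0)
        rows = [sum(self.blocks[i * T + k] @ chunks[k] for k in range(T))
                for i in range(T)]
        return np.concatenate(rows, axis=0)

    def _matmul_two_block_matrix_linear_operators(self, other):
        # Block-matrix product: out[i][j] = sum_k A[i][k] @ B[k][j],
        # stored flat in row-major order.
        T = self.T
        out = [sum(self.blocks[i * T + k] @ other.blocks[k * T + j]
                   for k in range(T))
               for i in range(T) for j in range(T)]
        return BlockMatrixLO(out, T)

    def matmul(self, other):
        # Dispatch: use the structured product for a compatible block
        # operator, otherwise fall back to the Tensor-only _matmul.
        if isinstance(other, BlockMatrixLO) and other.T == self.T:
            return self._matmul_two_block_matrix_linear_operators(other)
        return self._matmul(other)
```

With the flat storage, block (i, j) is simply `blocks[i * T + j]`, which is what makes the indexing in both products a one-liner.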

Anyways, both commits are now part of this branch. Let me know your thoughts, and we can always revert and figure out different solutions.

@corwinjoy
Contributor Author

corwinjoy commented Jul 28, 2023

@gpleiss Thanks for the detailed review. It all sounds good except for possibly the constructor, which I need to think about. I am traveling right now but should be able to take a detailed look by Wednesday. @dannyfriar any thoughts? @hughsalimbeni

@dannyfriar
Contributor

> @gpleiss Thanks for the detailed review. It all sounds good except for possibly the constructor, which I need to think about. I am traveling right now but should be able to take a detailed look by Wednesday. @dannyfriar any thoughts? @hughsalimbeni

I also think that passing in a list of lists is more natural. One thing we'd like to do with this is rotate the blocks; this is simpler with the list of lists, but I'm sure it can be made to work with a flat list too. @gpleiss, out of curiosity, what are the hacks you mentioned that require LOs/Tensors?
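For what it's worth, index arithmetic on a flat row-major list stays manageable. The thread doesn't pin down what "rotate the blocks" means, so the sketch below assumes a cyclic shift of block rows purely for illustration; `rotate_blocks` is a hypothetical helper.

```python
def rotate_blocks(blocks, T):
    """Cyclically shift the block rows of a flat, row-major T*T block
    list: row i of the result is row (i + 1) % T of the input."""
    return [blocks[((i + 1) % T) * T + j]
            for i in range(T) for j in range(T)]
```

The same `i * T + j` addressing used elsewhere makes any row/column permutation of the blocks a single list comprehension.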

@corwinjoy
Contributor Author

@gpleiss Thanks for the code changes! Looking at these, they make sense. I like the improved matmul function. The flattened list of operators for the constructor is not my first choice, but if that is what needs to happen for compatibility, I can live with it. We already have a from_tensor helper constructor, and we can add a nested-list helper later if we find we need it. Like @dannyfriar, I am a bit curious about the hacks that require this format, if you have time to explain. At any rate, I am happy with the changes as they are.
