Bugfix in matrix multiplication involving adj/trans #360
Codecov Report

All modified and coverable lines are covered by tests ✅

Coverage diff, master vs #360:

| | master | #360 | +/- |
|---|---|---|---|
| Coverage | 99.90% | 99.90% | |
| Files | 8 | 8 | |
| Lines | 1043 | 1057 | +14 |
| Hits | 1042 | 1056 | +14 |
| Misses | 1 | 1 | |
end

for T in (:Adjoint, :Transpose)
These methods are moved down to line 296, so that methods for both the orderings are together.
mul!(colC, A, colB, alpha, beta)
end
C
_mulfill!(C, A, B, alpha, beta)
Use a matrix-matrix multiplication instead of a sequence of matrix-vector multiplications.
end
mul!(_firstcol(C), A, view(B, :, 1), alpha, beta)
This is correct only for `beta = 0`, and the other branch is new in this PR.
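To make the `beta = 0` caveat concrete, here is a minimal sketch that uses only the `LinearAlgebra` standard library. It is not the FillArrays implementation: the helper names `mul_copy_first_col!` and `mul_broadcast_col!` are hypothetical, and a dense `fill(2.0, 2, 3)` stands in for a `Fill` matrix. The first helper copies the first column over the others, and the second allocates one column of `A*B*alpha` and broadcasts it over `C`.

```julia
using LinearAlgebra

# Hypothetical helper (not from FillArrays): compute the first column with the
# 5-argument mul!, then copy it over the remaining columns.
function mul_copy_first_col!(C, A, b, alpha, beta)
    # b is one column of the constant-valued matrix B
    mul!(view(C, :, 1), A, b, alpha, beta)   # column 1 = A*b*alpha + C[:, 1]*beta
    for j in 2:size(C, 2)
        C[:, j] .= view(C, :, 1)             # overwrites column j; its old values are lost
    end
    return C
end

# Hypothetical helper: allocate one column of A*B*alpha, then broadcast it over C.
function mul_broadcast_col!(C, A, b, alpha, beta)
    col = A * b * alpha                      # one temporary vector
    C .= col .+ C .* beta                    # every column gets col added to its own scaled values
    return C
end

A = [1.0 2.0; 3.0 4.0]
B = fill(2.0, 2, 3)                          # dense stand-in for a Fill matrix (assumption)
C0 = [1.0 10.0 100.0; 2.0 20.0 200.0]        # distinct columns, so overwriting is visible
alpha, beta = 2.0, 3.0

expected = A * B * alpha + C0 * beta         # dense reference result
fixed = mul_broadcast_col!(copy(C0), A, B[:, 1], alpha, beta)
buggy = mul_copy_first_col!(copy(C0), A, B[:, 1], alpha, beta)

@assert fixed ≈ expected                     # broadcast version matches the reference
@assert buggy != expected                    # column copy is wrong for non-zero beta
@assert mul_copy_first_col!(copy(C0), A, B[:, 1], alpha, 0.0) ≈ A * B * alpha
```

The last assertion shows why the shortcut is still usable in the 3-term case: with `beta = 0` the old contents of `C` do not contribute, so all columns of the result really are identical.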
The current implementation evaluates the matrix multiplication by computing the first column and then copying it over to the other columns. This is incorrect for the 5-term case with a non-zero `beta`, in which the destination is overwritten instead of being added to. This PR fixes this. To evaluate the 5-term multiplication efficiently, we now allocate one row/column of `A*B*alpha` and broadcast this over `C`. This leads to `O(N)` allocations in `mul!`, but reduces the time complexity from `O(N^3)` to `O(N^2)`. To me, this compromise seems reasonable. No allocation is necessary in the 3-term multiplication case, so `A*B` is not impacted by this.

We now also have matrix-multiplication implementations for both orderings, `mul!(C, F::Fill, B, alpha, beta)` and `mul!(C, A, F::Fill, alpha, beta)`. We implement this by specializing the internal function `_mulfill!`. This way, we no longer need to compute `mul!(C, F, B', alpha, beta)` as `mul!(C', B, F', alpha, beta)`. The latter may be incorrect for non-commuting numbers such as quaternions. The implementation in this PR is more general and makes no assumptions, so this should be correct for all element types.

Performance: