Sparse versions of repeat for matrices and vectors #532
The col_length function argument was pretty confusing, as it expected not the number of nonzero elements in the column, but rather one less than that. This moves the offset by one into the for-loop that actually needs it, which clarifies the meaning and also simplifies call sites.
Instead of taking the whole input matrix, only take the actually required parts (row indices and nonzero values). That allows using it in more contexts.
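To make the two changes above concrete, here is a minimal sketch of a column-copy helper. The name `copy_column!` and its argument list are hypothetical and are not the package's `stuffcol!`; the sketch only assumes that `col_length` is the true number of stored entries in the column and that the helper receives row indices and nonzero values rather than the whole matrix.

```julia
using SparseArrays

# Hypothetical helper (not the package's `stuffcol!`): copy `col_length` stored
# entries, starting at `ptr_src` in the source arrays, into the result arrays
# starting at `ptr_res`, shifting row indices by `row_offset`.
# Returns the next free insert position in the result arrays.
function copy_column!(rowval_res, nzval_res, ptr_res,
                      rowval_src, nzval_src, ptr_src,
                      col_length, row_offset)
    for k in 0:(col_length - 1)   # the "minus one" lives in the loop, not at the call site
        rowval_res[ptr_res + k] = rowval_src[ptr_src + k] + row_offset
        nzval_res[ptr_res + k] = nzval_src[ptr_src + k]
    end
    return ptr_res + col_length
end

A = sparse([1, 3, 2], [1, 1, 2], [10.0, 30.0, 20.0], 3, 2)
colptr = SparseArrays.getcolptr(A)
c = 1
col_length = colptr[c + 1] - colptr[c]   # number of stored entries in column c
rowval_res = zeros(Int, col_length)
nzval_res = zeros(Float64, col_length)
next_ptr = copy_column!(rowval_res, nzval_res, 1,
                        rowvals(A), nonzeros(A), colptr[c],
                        col_length, 0)
```

Because the helper returns the next insert position, a caller that stacks several inputs could simply reassign its running pointer from the return value.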
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #532      +/-   ##
==========================================
+ Coverage   76.42%   84.07%   +7.64%
==========================================
  Files          12       12
  Lines        8969     9068      +99
==========================================
+ Hits         6855     7624     +769
+ Misses       2114     1444     -670
was
Are you referring to
@SobhanMP Can you add your review to this PR?
src/sparsematrix.jl
Outdated
@@ -3908,27 +3908,24 @@ function vcat(X::AbstractSparseMatrixCSC...)
    ptr_res = colptr[c]
    for i = 1 : num
    colptrXi = getcolptr(X[i])
    col_length = (colptrXi[c + 1] - 1) - colptrXi[c]
    rowvalXi = rowvals(X[i])
I think at some point we agreed not to create temporary variables for this. Use rowvals(X[i]) directly, as it gets optimized away by the compiler.
Addressed by b961a7f. I kept the getcolptr stuff in vcat as it was before, to avoid touching lines unnecessarily. Let me know if you disagree and would like it changed by this PR too.
Since it's just style, let's keep the pull request focused on just repeat.
Just use the accessor functions rowvals, nonzeros, getcolptr whenever needed instead. Especially in the sparse vector case avoid using findnz, which creates a copy of the data. Use nonzeroinds and nonzeros instead.
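The difference between `findnz` and the accessors can be sketched as follows. This is an illustration only, not code from the PR; it assumes the unexported accessor is reachable as `SparseArrays.nonzeroinds`, and the vector `v` is made up for the example.

```julia
using SparseArrays

v = sparsevec([2, 5], [1.5, -2.0], 6)

inds_copy, vals_copy = findnz(v)      # freshly allocated copies of indices and values
inds = SparseArrays.nonzeroinds(v)    # the vector's own index storage, no copy
vals = nonzeros(v)                    # the vector's own value storage, no copy

vals_copy[1] = 99.0   # only changes the copy; v is unaffected
vals[2] = 7.0         # writes through to v
```

Reading through the accessors avoids the two allocations, which matters when the data is only traversed once to build a result.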
Simplifies updating the insert position at the call site.
Thanks for the review! Anything left that I can do to help move this forward?
@SobhanMP Good to merge?
yes, sorry I was behind a deadline 😅 @mjacobse, thanks for the code. |
Calling repeat for sparse matrices or vectors currently calls the Base function, which ends up building the result rather inefficiently with indexing brackets.
This would add sparse versions that use the info that inputs are in CSC format to provide more efficient implementations. To keep it simple it is limited to the straightforward cases of outer repetition along the first two dimensions. Row-wise repetition is intentionally kept close to the implementation for vcat, also using its helper function stuffcol! (with slight modifications). Column-wise repetition is kept simple by deferring to Base.repeat on the colptr, rowval, and nzval components of the CSC format.