ENH: Update MLIR backend to LLVM 20.dev #799
Conversation
CodSpeed Performance Report: Merging #799 will degrade performance by 24.56%.
Benchmarks breakdown
Force-pushed e3204d0 to a20ae89
Force-pushed ed208b8 to 49cfb20
We still need more wheels for
Thanks for the test changes. Overall the essence of this change LGTM: it's mlir -> Finch_mlir and figuring out the COO format better. Thanks!
```python
singleton_counter += 1
fields.append(
    (
        f"indices_{compressed_counter}_coords_{singleton_counter}",
        get_nd_memref_descr(1, idx_dtype),
    )
)
else:
    fields.append((f"indices_{compressed_counter}", get_nd_memref_descr(1, idx_dtype)))
```
```python
if LevelFormat.Singleton == level.format:
    singleton_counter += 1
    fields.append(
        (f"indices_{compressed_counter}_coords_{singleton_counter}", get_nd_memref_descr(1, idx_dtype))
    )
```
This should probably handle SOA and without SOA separately.
In my opinion we should only support the SoA singleton format:
- Non-SoA singleton looks to be buggy for basic operations link
- Mixed singleton levels aren't allowed: https://github.com/llvm/llvm-project/blob/8d38fbf2f027c72332c8ba03ff0ff0f83b4dcf02/mlir/lib/Dialect/SparseTensor/IR/SparseTensorDialect.cpp#L811
What would be the benefit of supporting non-SoA singleton levels separately?
We'd be able to support the current COO format only for non-SoA.
What do you mean by that? Can you give some example?
With this PR we can support the COO format (including input objects of type scipy.sparse.coo_array), where the MLIR-backend implementation uses the SoA representation.
The existing Numba COO format uses the non-SoA layout, and we pretty much have to support this to be backwards compatible. It doesn't have to be in this PR, though.
I think I'm still missing the point here. Numba is a separate backend that supports only 1D and 2D COO arrays; the MLIR backend supports >=2D COO arrays.
What do you mean by backward compatibility here? Can you give an example where backward compatibility breaks? If a user passes a scipy.sparse.coo_array object to the sparse.asarray function, it will work for any backend regardless of the internal representation (SoA or non-SoA).
So, sparse.COO has a constructor that takes coords and a .coords attribute. The attribute is a 2D NumPy array.
Ideally, I'd like to keep it a 2D np.ndarray, as otherwise I'm not 100% sure how much will break.
We could do this with np.stack, but that would incur a performance penalty.
Also, the current Numba backend supports ND, not just 2D. SciPy supports only 2D, however.
I'm thinking of a future where all of sparse is powered by Finch-MLIR, ideally.
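The np.stack option mentioned above can be sketched as follows. The array names are illustrative: it assumes the SoA backend exposes one 1-D index array per dimension, which np.stack copies into the single 2D coords array that sparse.COO exposes.

```python
import numpy as np

# Hypothetical SoA representation: one 1-D coordinate array per dimension.
rows = np.array([0, 1, 2])
cols = np.array([2, 0, 1])

# Reconstruct the 2D `.coords` layout (shape (ndim, nnz)).
# This allocation and copy is the performance penalty mentioned above.
coords = np.stack([rows, cols])
print(coords.shape)  # → (2, 3)
```

In the non-SoA (current Numba COO) layout, `coords` already exists as one contiguous 2D array, so no copy is needed to expose the attribute.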
I think for accessing coords, np.stack could be an option. My caveat for non-SoA in the MLIR backend is that it already has some issues with basic operations: https://discourse.llvm.org/t/passmanager-fails-on-simple-coo-addition-example/81247
Force-pushed 1dca588 to 57ca082
Force-pushed 57ca082 to d511c5c
Just one comment plus CI/Windows fixes, and this should be good to go.
This PR replaces #787 and fixes #797.
This PR updates the MLIR backend to the current LLVM 20.dev (so the main branch): tensor.empty call link. As one can see, it fixes a bunch of skips in the test suite: sparse/mlir_backend/tests/test_simple.py
It's thanks to changes already present in the main branch, added after 19.x branched, and:
- soa property for sparse_tensor Python bindings: llvm/llvm-project#109135
- encoding argument for the tensor.empty Python function: llvm/llvm-project#110656
- test_output.py test: llvm/llvm-project#110882