sdm.dot_product_mkl ends with segmentation fault for sparse-sparse multiplication #30
Comments
I can replicate this 100% of the time with the code and files provided, and the segfault happens internal to [...]. I can fix this 100% of the time by copying the loaded objects once before passing them into the multiplication. Does this problem always happen after deserializing data from files?
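The copy workaround described above can be sketched roughly as follows. This is only an illustration, not the maintainer's fix: `multiply_with_copies` is a hypothetical helper, the random matrices are stand-ins for the loaded `A` and `B` (assumed to be CSR here), and plain SciPy multiplication stands in for `sdm.dot_product_mkl` so that the snippet runs without MKL installed.

```python
import numpy as np
import scipy.sparse as sp


def multiply_with_copies(a, b, multiply):
    """Copy both operands, then hand the copies to `multiply`.

    `multiply` is any callable taking two sparse matrices; in the setting of
    this issue it would be sdm.dot_product_mkl (hypothetical usage).
    """
    # .copy() allocates new data, indices and indptr arrays, so the product
    # is computed on buffers that are not shared with the loaded objects.
    return multiply(a.copy(), b.copy())


# Random stand-ins for the matrices loaded from A.npz and B.npz.
a = sp.random(50, 40, density=0.1, format="csr", random_state=0)
b = sp.random(40, 30, density=0.1, format="csr", random_state=1)

# Plain SciPy multiplication is used here as a stand-in for the MKL call.
result = multiply_with_copies(a, b, lambda x, y: x @ y)

# Check that a copy really owns its own index buffer.
a_copy = a.copy()
copy_is_independent = not np.shares_memory(a.indices, a_copy.indices)
```

Passing the multiplication routine in as a callable keeps the workaround in one place, so it can be removed easily once the underlying cause is found.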
Thanks for your quick response.
If I modify the matrix and save it again, there is no segfault for A_new.T * B.T. In our use case we don't save the matrices to files, and the segfault still happens (so we dumped the matrices to see what was going on).
I'll see what I can do. Unfortunately, even though I can replicate it 100% of the time, it's not occurring when I run it with valgrind. Copying the indices ( |
Thanks a lot. For now we will use the copy trick to work around this issue. We hope that one day you can find the cause. Thanks.
Sadly in our use case the segfault still happens for other matrices even |
With gdb I can see that it's segfaulting in the same place in [...]. Helgrind suggests that there's a race condition in [...].
I suspect that converting the CSC to a CSR in Python ahead of time would fix this.
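A sketch of that suggestion, assuming the operands are SciPy CSR matrices: in SciPy, transposing a CSR matrix yields a CSC matrix, so `A.T` and `B.T` would reach the multiplication as CSC. Calling `.tocsr()` beforehand does the conversion on the Python side. Whether this actually avoids the segfault is the suspicion stated above, not something this snippet verifies. The matrices are random stand-ins, and plain SciPy multiplication replaces the MKL call.

```python
import scipy.sparse as sp

# Random stand-ins for the issue's A and B (assumed to be stored as CSR).
a = sp.random(50, 40, density=0.1, format="csr", random_state=0)
b = sp.random(60, 50, density=0.1, format="csr", random_state=1)

# Transposing a CSR matrix gives a CSC matrix.
a_t = a.T
b_t = b.T
formats_before = (a_t.format, b_t.format)

# Convert ahead of time so the multiplication routine receives CSR inputs.
a_t_csr = a_t.tocsr()
b_t_csr = b_t.tocsr()
formats_after = (a_t_csr.format, b_t_csr.format)

# In the issue's setting these would be passed to sdm.dot_product_mkl;
# SciPy's own product is used here so the sketch runs without MKL.
product = a_t_csr @ b_t_csr
```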
Hi, thanks for your package providing a Python interface to MKL. It's quite useful.
However, we encounter unexpected behavior when performing sparse-sparse matrix multiplication.
It sometimes leads to a segmentation fault.
A minimal code snippet to reproduce the bug:
Please first download the following two matrices (the bug only appears for certain matrices):
A.npz: https://drive.google.com/file/d/1NRT8SchOS3XefZokbFOpqJw6CIygTEQ-
B.npz: https://drive.google.com/file/d/1aFDa2BbNQRGmmlAceIjK4JoogVQfKJY_/
If the first line is called, it will cause a segmentation fault.
However, if the second line is called, the segmentation fault does not happen.
We also tried first converting the A matrix to COO format, or calling print(A) before the matrix multiplication, and the segmentation fault does not happen in either case.
However, this is unsatisfying because we have not found the exact cause.
So we are asking for your help with this. Thank you in advance.
(We tried this on multiple machines, and for this example it always happens.)
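For reference, the COO conversion mentioned above could look roughly like this. The thread does not say whether `sdm.dot_product_mkl` accepts COO input directly, so this sketch assumes a round trip back to CSR; the matrix is a random stand-in for `A`. The round trip rebuilds the matrix in freshly allocated arrays, which the last two lines check.

```python
import numpy as np
import scipy.sparse as sp

# Random stand-in for the issue's A matrix (assumed to be stored as CSR).
a = sp.random(50, 40, density=0.1, format="csr", random_state=0)

# Round trip through COO; both conversions allocate new arrays.
a_rebuilt = a.tocoo().tocsr()

# The rebuilt matrix holds the same values but does not share its buffers.
same_values = abs(a - a_rebuilt).sum() == 0
shares_buffers = np.shares_memory(a.data, a_rebuilt.data)
```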