Add Numba implementation of Blockwise #1015
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1015 +/- ##
==========================================
+ Coverage 82.09% 82.10% +0.01%
==========================================
Files 183 185 +2
Lines 48010 48089 +79
Branches 8653 8659 +6
==========================================
+ Hits 39412 39485 +73
- Misses 6435 6439 +4
- Partials 2163 2165 +2
Getting a failure in the CI that I don't reproduce locally: Any idea, @jessegrabowski @aseyboldt @lucianopaz? (sorry for tag spam)
Maybe locally you end up using a different multiprocessing context? It could be that the tests run in fork mode in CI and in forkserver mode locally for some reason?
Maybe also a numba/scipy version difference.
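One way to check both suggestions is to print the same environment summary in CI and locally and compare the two. The following is only a minimal sketch using the standard library; the helper name `describe_environment` is hypothetical and not part of this PR or of PyTensor.

```python
import multiprocessing
import sys
from importlib import metadata


def describe_environment(packages=("numba", "scipy")):
    """Collect details that commonly differ between CI and a local machine."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        # With allow_none=True this returns None when no start method has
        # been set yet, instead of fixing the platform default as a side effect.
        "start_method": multiprocessing.get_start_method(allow_none=True),
        "available_start_methods": multiprocessing.get_all_start_methods(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = None  # package is not installed in this environment
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this as a script in both environments and diffing the output would show whether the start method or a package version differs.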
Oh, the failure is on the Python 3.10 CI; it passes on Python 3.12. The numba and scipy versions are equivalent.
The problem seems to be the
looks great
This can only be done when the output of infer_shape of the core_op depends only on the input shapes, and not their values.
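To illustrate the condition, here is a rough pure-Python sketch of batched shape inference that works from input shapes alone. It is not the PyTensor implementation: `broadcast_batch_shapes`, `blockwise_infer_shape`, and `matmul_core_shape` are hypothetical names, and the core shape function is assumed to receive only the core input shapes.

```python
from itertools import zip_longest


def broadcast_batch_shapes(*batch_shapes):
    """Broadcast batch shapes NumPy-style, aligning dimensions from the right."""
    result = []
    reversed_shapes = (reversed(shape) for shape in batch_shapes)
    for dims in zip_longest(*reversed_shapes, fillvalue=1):
        non_one = {d for d in dims if d != 1}
        if len(non_one) > 1:
            raise ValueError(f"Incompatible batch dimensions: {dims}")
        result.append(non_one.pop() if non_one else 1)
    return tuple(reversed(result))


def blockwise_infer_shape(core_infer_shape, input_shapes, core_ndims):
    """Infer the batched output shape without looking at any input values.

    core_infer_shape: maps a list of core input shapes to a core output shape.
    core_ndims: number of trailing core dimensions for each input.
    """
    batch_shapes = []
    core_shapes = []
    for shape, ndim in zip(input_shapes, core_ndims):
        split = len(shape) - ndim
        batch_shapes.append(tuple(shape[:split]))
        core_shapes.append(tuple(shape[split:]))
    batch = broadcast_batch_shapes(*batch_shapes)
    return batch + tuple(core_infer_shape(core_shapes))


def matmul_core_shape(core_shapes):
    """Core shape rule for (m, k) @ (k, n) -> (m, n); needs shapes only."""
    (m, _k), (_k2, n) = core_shapes
    return (m, n)


print(blockwise_infer_shape(matmul_core_shape, [(5, 1, 3, 4), (7, 4, 2)], [2, 2]))
# prints (5, 7, 3, 2)
```

A core op whose output length is determined by an input value (for example, an `arange`-like op) could not provide such a shape-only function, which is why the restriction above applies.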
It uses the machinery developed for RVs and Elemwise. The hard part has to do with handling multiple inputs and numba fussiness.
It also improves Blockwise shape inference based on the infer_shape of the core ops.
The small Cholesky benchmark test I added here runs 10x faster after this PR on my local machine.
Related Issue