The compatibility between Numba and Python vectorization #951
Replies: 5 comments
-
This is a great question. Let us investigate this. Pinging @anissa111 @rajeeja
-
In general, if you vectorize with numpy, most of the work already runs in the C layer, so numba doesn't give you much of a raw computation speedup. However, numba removes a lot of temporary memory copies and reduces intermediate memory allocations, which usually brings a large speedup. Since you've already chosen numba, I would optimize for readability and easy development first, and then use numba on the hot paths.
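A minimal sketch of the temporary-allocation point, assuming a simple vector-norm computation. The function names are hypothetical and not taken from uxarray, and the fallback decorator only exists so the example still runs where numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python when numba is unavailable.
    def njit(func):
        return func


def norm_vectorized(x, y, z):
    # Vectorized NumPy: every operator below allocates a temporary array
    # (x*x, y*y, z*z, the two partial sums) before sqrt produces the result.
    return np.sqrt(x * x + y * y + z * z)


@njit
def norm_loop(x, y, z):
    # Loop style: a single pass over the data with one output allocation
    # and no intermediate arrays.
    out = np.empty_like(x)
    for i in range(x.shape[0]):
        out[i] = np.sqrt(x[i] * x[i] + y[i] * y[i] + z[i] * z[i])
    return out
```

Both functions return the same values. Whether the loop version is actually faster depends on array size and hardware, so it is worth timing on the real hot path before committing to either style.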
-
@dcherian I mostly agree with your comments and would like to keep numba as I have seen good improvements in the integrate functionality. There were a few versioning and compatibility issues with numba that caused a bit of pain. @hongyuchen1030 do you think we can match the optimization that numba provides with the changes you propose?
-
Since I haven't used Python much and haven't done much data analysis either (I usually work in C/C++ on algorithm implementations), I am not very sure about the details of numba and "numpy-style vectorization". But according to my observations and the numba documentation, numba is ideal for loop-based data analysis: basically, if we want to make iterative linear algebra function calls, numba is a good tool. The downside is that numba is built for numerical calculation, so it is not compatible with some complicated data structures (like a nested array), and we can only use the numpy functions that numba supports (a limited selection). From my knowledge and experience, numba doesn't support " My current algorithm implementation is all loop-based (good news for numba), but it also uses a lot of functionality that numba might not support. Although it's possible to vectorize some of it, there are still two things we need to be careful about:
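To illustrate the nested-array limitation mentioned in this comment, here is a hedged sketch of one common workaround: pad the ragged data into a rectangular array in plain Python, then hand that array to a loop-based kernel. The fill value `-1`, the function names, and the sample data are all assumptions made for illustration, not uxarray code.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python when numba is unavailable.
    def njit(func):
        return func

FILL = -1  # hypothetical padding marker for missing entries


def pad_ragged(rows, fill=FILL):
    """Plain Python: turn a list of variable-length lists into a 2D int array."""
    width = max(len(r) for r in rows)
    out = np.full((len(rows), width), fill, dtype=np.int64)
    for i, r in enumerate(rows):
        out[i, :len(r)] = r
    return out


@njit
def count_valid(padded, fill):
    """Loop-based kernel: count the non-fill entries in each row."""
    n_rows, width = padded.shape
    counts = np.zeros(n_rows, dtype=np.int64)
    for i in range(n_rows):
        for j in range(width):
            if padded[i, j] != fill:
                counts[i] += 1
    return counts
```

For example, `count_valid(pad_ragged([[0, 1, 2], [3, 4, 5, 6], [7, 8]]), FILL)` counts 3, 4 and 2 valid entries per row. Padding costs extra memory when row lengths vary widely, so this is a trade-off rather than a universal fix.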
-
We have removed a lot of the Numba usage; the only places it remains are in grid/ geometry, neighbors and connectivity, all mostly non-nested functions.
-
I noticed that for the helper functions in uxarray/helpers.py we are using Numba, which, according to its documentation, prefers code written in a "non-Python" style (like loops) and lacks support for some features (like nested arrays):
"Numba generates optimized machine code from pure Python code using the [LLVM compiler infrastructure](http://llvm.org/). With a few simple annotations, array-oriented and math-heavy Python code can be just-in-time optimized to performance similar as C, C++ and Fortran, without having to switch languages or Python interpreters."
However, at our upper levels, like uxarray/grid.py, we are using Python vectorization to boost performance: map functions, ndarray manipulation, and so on, which seems like the opposite direction from Numba. I wonder whether these two methods are compatible with each other. In other words, will mixing these two styles slow down performance?
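As a rough sketch of how the two styles can sit side by side: a jitted helper takes and returns ordinary NumPy arrays, so a vectorized caller can use it like any other function. The names below are hypothetical and not taken from uxarray/helpers.py or uxarray/grid.py, and the fallback decorator only exists so the example runs without numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python when numba is unavailable.
    def njit(func):
        return func


@njit
def cross_norms(a, b):
    """Loop-style helper: norm of the cross product of each row pair."""
    n = a.shape[0]
    out = np.empty(n, dtype=np.float64)
    for i in range(n):
        cx = a[i, 1] * b[i, 2] - a[i, 2] * b[i, 1]
        cy = a[i, 2] * b[i, 0] - a[i, 0] * b[i, 2]
        cz = a[i, 0] * b[i, 1] - a[i, 1] * b[i, 0]
        out[i] = np.sqrt(cx * cx + cy * cy + cz * cz)
    return out


def select_non_parallel(a, b, tol=1e-12):
    """Vectorized caller: indices of row pairs that are not parallel."""
    norms = cross_norms(a, b)           # helper returns a plain ndarray
    return np.flatnonzero(norms > tol)  # NumPy-style post-processing
```

The boundary between the two styles is a normal function call on arrays. Calling a jitted helper once on a whole array is usually cheap, while calling it many times on tiny inputs from Python (for example inside `map`) pays the call overhead each time, so the granularity of the calls is the thing to measure.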