Closes #113
The GIL prevents multiple threads from executing Python bytecode in parallel: however many threads are created, only one of them can execute Python bytecode at any given time.
Release the GIL
To temporarily release the GIL for the CPU-heavy index creation step, pyo3's `allow_threads` seems to be enough to let other Python threads run, giving a roughly 2x or better improvement in the multithreaded case when the number of threads equals the number of physical cores.
Compare the current situation on the main branch without `allow_threads`, tested on my local machine with 6 threads (6 physical cores):
To these benchmarks with `allow_threads` on index creation and different numbers of threads:
Of course, the number of threads still affects the results because of the GIL's overall overhead; as expected, fewer threads give better results.
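For anyone who wants to see the same effect without building the extension, here is a minimal Python sketch. It is not the benchmark used in this PR. `pure_python_work`, `gil_releasing_work` and `time_in_threads` are hypothetical helpers. `hashlib` is used only as a stand-in for the index creation step under `allow_threads`, on the assumption that it behaves similarly because CPython's `hashlib` releases the GIL while hashing large buffers.

```python
import hashlib
import threading
import time


def pure_python_work(n: int = 200_000) -> int:
    """CPU-bound loop in pure Python: the GIL is held while it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def gil_releasing_work(size: int = 1 << 20, rounds: int = 20) -> str:
    """Stand-in (assumption) for index creation under allow_threads:
    hashlib releases the GIL while hashing large buffers."""
    data = b"\x00" * size
    digest = hashlib.sha256()
    for _ in range(rounds):
        digest.update(data)
    return digest.hexdigest()


def time_in_threads(worker, n_threads: int) -> float:
    """Run `worker` once in each of `n_threads` threads; return wall time in seconds."""
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Thread counts are illustrative; 6 matches the machine described above.
    for n_threads in (1, 2, 6):
        held = time_in_threads(pure_python_work, n_threads)
        released = time_in_threads(gil_releasing_work, n_threads)
        print(f"{n_threads} threads: GIL held {held:.3f}s, GIL released {released:.3f}s")
```

Each thread does a fixed amount of work. If the GIL is really released, the wall time for the second worker should grow much more slowly with the thread count than for the first. Actual numbers depend on the machine.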
Also, I separately confirmed equal thread usage with `top`:
The GIL's switch interval matters too
CPU-bound tasks are especially limited by the GIL, since they require continuous execution of Python bytecode and the default GIL switch interval is 5 ms. To show this effect, I added a benchmark with the interval increased to 5 seconds (same 6 threads):
These results are much closer to the single-threaded case than the ones with the default switch interval.
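The switch interval can be changed from Python with `sys.setswitchinterval`. Below is a minimal sketch of how it could be raised for the duration of a benchmark and restored afterwards. The `switch_interval` context manager is a hypothetical helper for illustration, not necessarily how the benchmark in this PR sets it.

```python
import sys
from contextlib import contextmanager


@contextmanager
def switch_interval(seconds: float):
    """Temporarily set the interpreter's thread switch interval."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        yield
    finally:
        # Always restore the previous value, even if the body raises.
        sys.setswitchinterval(previous)


if __name__ == "__main__":
    print("default interval:", sys.getswitchinterval())
    with switch_interval(5.0):
        # Run the multithreaded benchmark here (placeholder).
        print("inside block:", sys.getswitchinterval())
    print("restored:", sys.getswitchinterval())
```

Restoring the value in `finally` stops the 5-second setting from leaking into benchmarks that run later in the same process.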