-
-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: increase efficiency by using coroutines #322
base: dev
Are you sure you want to change the base?
Conversation
So everything is still in order and single threaded, except when each patch is resolving a fingerprint it's that single fingerprint resolving that runs in parallel? This should prevent all concurrency issues and the patches code won't need any changes or need to care about writing proper multi-threaded code. What is the resource patch data race issue? |
I'm trying this locally, but it looks like the patches are running in parallel? I'm seeing the patches are not finishing in alphabetical order. If the patching is done in parallel then patching errors can occur non-deterministically (ie: at random), because multiple patches can modify the same method in an interleaving fashion. Where one patch is getting an insert index, and then another patch adds code that changes that insertion index, and the first patch then uses the wrong index which is now pointing to the wrong instruction. I'm only seeing a minor speedup with this change. Without this change, it takes 70 seconds total patching time from start to finish (including installing the patched app). 5 seconds of improvement is less than a 10% improvement and seems too little to notice. Slower devices would see a larger speedup, but it still would not be substantial. |
An unrelated idea to speed up patching, is to use KMP for the opcode pattern matching. KMP is ideal since it changes searching to O(N) instead of the current worse case O(N^2). But it really shines when the search target has a limited set of characters and many repeating patterns, such as the repeating and common opcode patterns generated by the compiler. That alone would most likely give even better performance than trying to parallelize patching. |
The patcher has a coroutine-aware lock to prevent execution of other bytecode patches until the current one has finished. Resource patches are still being executed on multiple threads, but that will be changed later. On my PC, patch execution time goes from 9,5 seconds to about 5,6. Phones will probably benefit more from this, although I have not tested it yet. |
This PR contains ideas I got while looking at discussions and code related to the
perf/parallel-patch-execution
branch.Changes
TODOs