
perf: Execute patches in parallel #321

Closed
wants to merge 6 commits

Conversation

@oSumAtrIX (Member)

No description provided.

@oSumAtrIX (Member Author)

@LisoUseInAIKyrios I am seeing ConcurrentModificationExceptions in some places. I have tried to synchronize by locking them, but it seems something else needs to be done. Perhaps you know what? The YouTube patches execute in around 2 seconds in parallel, compared to 17 seconds in sequence.

@LisoUseInAIKyrios (Contributor)

Where is the concurrent modification being thrown?

@oSumAtrIX (Member Author)

Mostly when modifying some lists, such as the classes list. The list is being modified while it is iterated; for example, extensions add their own classes to it while fingerprints iterate through it. I am not entirely sure what the cause is, but the classes list is my first bet.

@LisoUseInAIKyrios (Contributor)

Patching in parallel means patches can be applied in a different order each time patching is done.

This may introduce bugs that previously did not show up, due to the dumb luck of alphabetical ordering always applying certain patches before/after other patches that change the same code.

To help find such bugs (if any exist), maybe the patch order can be randomized under certain developer conditions (maybe if the force version flag is on?). Then the ordering can be forced to be very out of order, and any race conditions in patching will be easier to find.
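
A minimal sketch of what that could look like (a hypothetical helper, not an existing patcher flag; dependency ordering would still have to be applied after shuffling):

```kotlin
import kotlin.random.Random

// Hypothetical sketch, not an existing patcher option: shuffle the patch order when a
// developer flag is set, so hidden ordering assumptions between patches surface early.
// Dependencies would still need to be reordered before their dependants afterwards.
fun <T> fuzzOrder(patches: List<T>, fuzz: Boolean): List<T> {
    if (!fuzz) return patches
    val seed = System.nanoTime()
    println("Fuzzing patch order with seed $seed") // log the seed so a failing order can be replayed
    return patches.shuffled(Random(seed))
}
```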

@LisoUseInAIKyrios (Contributor)

Using a CopyOnWriteArrayList should fix all concurrency issues with writing while iterating, but it has the cost of copying everything on every write. If classes are not added often it should be ok, but fixing the synchronization would probably be better.
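
A minimal sketch of the trade-off, with plain strings standing in for class definitions:

```kotlin
import java.util.concurrent.CopyOnWriteArrayList

// Iteration works on a stable snapshot and never throws ConcurrentModificationException,
// but every add() copies the whole backing array.
fun main() {
    val classes = CopyOnWriteArrayList(listOf("La;", "Lb;"))
    val writer = Thread { repeat(1_000) { classes.add("Lc$it;") } }
    writer.start()
    for (name in classes) println(name) // safe: iterates the snapshot taken when the loop started
    writer.join()
}
```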

@oSumAtrIX (Member Author)

Patching in parallel means patches can be applied in a different order each time patching is done.

No order guarantees were made except that dependencies are executed first. From a patch developer's perspective, a patch can be executed at any time. If issues arise when patches are executed in different orders where that is allowed, they should be fixed.

Using a CopyOnWriteArrayList should fix all concurrency issues with writing while iterating, but it has the cost of copying everything on every write. If classes are not added often it should be ok, but fixing the synchronization would probably be better.

I was thinking about that too. Classes are read often but not modified during patch execution time. Though, after execution the mutable classes replace the old ones. What do you mean by "fixing the synchronization"?

@LisoUseInAIKyrios (Contributor)

No order guarantees were made except that dependencies are executed first. From a patch developer's perspective, a patch can be executed at any time. If issues arise when patches are executed in different orders where that is allowed, they should be fixed.

Of course, but any ordering issues may not show up while building on one device vs. another. Having a way to force out-of-order execution would be helpful to intentionally introduce random ordering and find ordering issues that would otherwise go unnoticed. It doesn't need to be an explicit patcher flag; that's why --force could work, since patching an unknown target has a higher chance of ordering issues as well.

I was thinking about that too. Classes are read often but not modified during patch execution time. Though, after execution the mutable classes replace the old ones. What do you mean by "fixing the synchronization"?

A copy on write could be faster since there is no thread blocking for read only operations, so maybe it's the solution.

Another option is to use a ConcurrentHashMap (a bonus is faster String -> Class lookup), as it doesn't need synchronization for iterating and is non-blocking for reads. It also doesn't have any copy-on-write performance overhead.

By "fixing the synchronization" I meant adding synchronization to whatever code iterates the classes without holding a lock. A synchronized list is thread safe except when iterating, where the calling code needs to hold synchronization on the list itself. A copy-on-write list and a concurrent hash map do not need that.
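
A minimal sketch of the ConcurrentHashMap option (the class and member names here are illustrative, not the patcher's API):

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Keying classes by type gives O(1) String -> class lookup, reads never block, and the
// weakly consistent iterators never throw ConcurrentModificationException.
class ClassLookup<C : Any> {
    private val classesByType = ConcurrentHashMap<String, C>()

    fun add(type: String, classDef: C) { classesByType[type] = classDef }

    fun find(type: String): C? = classesByType[type] // non-blocking read

    // Safe to call while other threads add classes; no external lock required.
    fun forEach(action: (C) -> Unit) = classesByType.values.forEach(action)
}
```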

@LisoUseInAIKyrios (Contributor)

The unit tests need updating, as they check the order the patches were run in.

@LisoUseInAIKyrios (Contributor)

This would also need changes in patches. Some patches are failing because multiple patches modify the same method and the method indexes used for insertion become out of whack and wrong.

@LisoUseInAIKyrios (Contributor)

I'm seeing failures but it's unclear what is failing:

```
SEVERE: "Bypass image region restrictions" failed:
app.revanced.patcher.patch.PatchException: The patch "Bypass image region restrictions" depends on "BytecodePatch", which raised an exception:
app.revanced.patcher.patch.PatchException: The patch 'BytecodePatch' failed previously
```

Instead of making all patching multi-threaded, can just the resolving be done in parallel? The resolving is by far the slowest part of patching. If only resolving runs in parallel and the patch code runs in series as usual, then no patch needs to worry about concurrency issues from multiple patches changing the same method at once. There are also no potential issues from out-of-order patch changes.
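
A minimal sketch of that split, with stand-in types in place of the real patcher API:

```kotlin
// Fingerprints are matched in parallel against the read-only class list, then
// patches execute serially in their usual deterministic order.
fun interface Fingerprint { fun match(classes: List<String>) }
fun interface Patch { fun execute() }

fun run(fingerprints: List<Fingerprint>, patches: List<Patch>, classes: List<String>) {
    fingerprints.parallelStream().forEach { it.match(classes) } // read-only: safe in parallel
    patches.forEach { it.execute() }                            // serial: no patch-level races
}
```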

@oSumAtrIX (Member Author)

Of course, but any ordering issues may not show up while building on one device vs. another. Having a way to force out-of-order execution would be helpful to intentionally introduce random ordering and find ordering issues that would otherwise go unnoticed. It doesn't need to be an explicit patcher flag; that's why --force could work, since patching an unknown target has a higher chance of ordering issues as well.

This would still be quite luck-dependent. It is enough to synchronize on a method's instructions so that only one patch can modify it at a time; since it's a list, it needs to be synchronized anyway.
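
As an illustration (MutableMethod here is a stand-in, not the patcher's actual type):

```kotlin
// Locking on the method's instruction list makes each edit atomic, so two patches
// cannot interleave modifications to the same method.
class MutableMethod(val instructions: MutableList<String>)

fun addInstructions(method: MutableMethod, index: Int, added: List<String>) =
    synchronized(method.instructions) {
        method.instructions.addAll(index, added)
    }
```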

A copy on write could be faster since there is no thread blocking for read only operations, so maybe it's the solution.

I am not sure I understand the use of it here. Is a ConcurrentList not enough?

Another option is to use a ConcurrentHashMap (a bonus is faster String -> Class lookup), as it doesn't need synchronization for iterating and is non-blocking for reads. It also doesn't have any copy-on-write performance overhead.

We're already using a lookup map for matching fingerprints. Since we're not writing to it, it should be fine as is. Are you referring to something else?

Instead of making all patching multi-threaded, can just the resolving be done in parallel?

The fingerprints are matched dynamically during patch execution time. Patches should not conflict with each other in the first place, so parallel execution should be possible. If they do, a third patch is usually supposed to synchronize the mutual operations.

Review thread on the following diff context:

```kotlin
}

val classDef: ClassDef?
synchronized(classes) {
```
@oSumAtrIX (Member Author), Nov 9, 2024

Is synchronization needed here? Patches can also access the classes list; should they also synchronize? Would it not be possible to use a ConcurrentList so that we don't have to synchronize every time the classes list is accessed?

@LisoUseInAIKyrios (Contributor), Nov 9, 2024

Iteration is not thread safe if another piece of code can change the list.

If there is no way the list can ever be changed after it is created then synchronization is not needed here.

@oSumAtrIX (Member Author) commented Nov 9, 2024

I am not sure why, but I am getting a ConcurrentModificationException here, even though the classes list is synchronized:

[screenshot: ConcurrentModificationException stack trace]

The classes list is also synchronized for modification:

[screenshot: synchronized modification of the classes list]
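
A plausible reproduction of this failure mode, assuming some code iterates the synchronized list without holding its lock (which is likely what the screenshots show):

```kotlin
import java.util.Collections

// Collections.synchronizedList locks each individual call, but not the iterator,
// so a concurrent add still breaks an unsynchronized loop.
fun main() {
    val classes = Collections.synchronizedList(mutableListOf("La;", "Lb;"))
    val writer = Thread { repeat(100_000) { classes.add("Lc$it;") } }
    writer.start()
    try {
        for (c in classes) { } // may throw ConcurrentModificationException
    } catch (e: ConcurrentModificationException) {
        println("unsynchronized iteration failed: $e")
    }
    synchronized(classes) { for (c in classes) { } } // correct: hold the list's own lock
    writer.join()
}
```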

Review thread on the following diff context:

```kotlin
// If the proxy is unused, return false to keep it in the proxies list.
if (!proxy.resolved) return@removeIf false

internal fun replaceClasses() {
    synchronized(proxyPool) {
```
@oSumAtrIX (Member Author):

This does not seem necessary. The function is called after all patches have executed.

@LisoUseInAIKyrios (Contributor):

For consistency, it can be synchronized on iteration, just to have the same behavior everywhere. There is no performance overhead in doing this.

@oSumAtrIX (Member Author)

Some patches probably also need to synchronize explicitly: ReVanced/revanced-patches@ab32ae2

@LisoUseInAIKyrios (Contributor)

A lot of patches would need synchronization, because if a patch uses indexOfFirst while another patch is adding code to the same method, the indexes will be wrong if the timing is off.
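
A deterministic illustration of that index race, with method instructions modeled as plain strings:

```kotlin
// One patch computes an index with indexOfFirst, another patch inserts first, and
// the saved index now points at the wrong instruction.
fun main() {
    val instructions = mutableListOf("const v0", "invoke-static {v0}", "return-void")

    val invokeIndex = instructions.indexOfFirst { it.startsWith("invoke") } // 1

    // Another patch wins the race and inserts above the invoke:
    instructions.add(invokeIndex, "other-patch-hook")

    // The saved index is now stale; this lands before invoke-static, not after it:
    instructions.add(invokeIndex + 1, "my-hook")
    println(instructions) // [const v0, other-patch-hook, my-hook, invoke-static {v0}, return-void]
}
```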

Patches also need to change to thread-safe fields (volatile or synchronized) where shared patch code such as VideoInformation is used.

I keep getting app launch crashes on every other patching run due to race issues like this.

Fixing threading issues can be an enormous amount of work, and bugs are plentiful due to oversights and inconsistent, non-deterministic runtime behavior.

@LisoUseInAIKyrios (Contributor)

Making all of patching multithreaded opens a huge can of worms to make it work right, and can introduce unidentified race issues during patching. Imagine a user having a crashing app where the fix is to repatch, because there was a threading bug during patching. A huge mess.

I don't think allowing full multi-threading during patching is a good idea. Users patch once every few weeks or months, and the performance gain is not worth the cost of writing, debugging, fixing, and maintaining multithreaded, non-deterministic code.

I think, at most, the fingerprints can be resolved in parallel before any code from the patches repo is run. Then all multi-threading is contained in the patcher framework, and there is no upkeep of forgetting to synchronize or use thread-safe fields in any patches.

@LisoUseInAIKyrios (Contributor)

Resolving fingerprints in parallel keeps the threading nice and simple, since resolving is always read-only, and when it completes, execution changes back to in-order serial patching (just as it currently is).

It would require patches to once again declare which fingerprints they will resolve, but that can be avoided if fingerprints add themselves to a shared list in their constructors and that list is then resolved in parallel all at once (see the sketch below). The fingerprints that need a parent would need a way to declare their parent in the constructor so they can be resolved (or a way to indicate they will be resolved later, skipping resolution before patching starts).

I think this is the best approach.
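
A rough sketch of the self-registration idea (all names are illustrative, not the patcher's API):

```kotlin
import java.util.concurrent.CopyOnWriteArrayList

// Each fingerprint adds itself to a shared registry when constructed, so everything
// can be resolved in parallel before any patch code runs.
object FingerprintRegistry {
    val all = CopyOnWriteArrayList<Fingerprint>()
}

class Fingerprint(val methodName: String, val parent: Fingerprint? = null) {
    init { FingerprintRegistry.all.add(this) } // self-register at construction time

    fun resolve(classes: List<String>) { /* read-only matching against the class list */ }
}

fun resolveAll(classes: List<String>) =
    FingerprintRegistry.all.parallelStream().forEach { it.resolve(classes) }
```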

@oSumAtrIX oSumAtrIX closed this Nov 10, 2024
@oSumAtrIX oSumAtrIX deleted the perf/parallel-patch-execution branch November 10, 2024 14:37
@oSumAtrIX (Member Author)

It would require patches to once again declare which fingerprints they will resolve, but that can be avoided if fingerprints add themselves to a shared list in their constructors and that list is then resolved in parallel all at once

It isn't worth breaking the modular design by creating a shared list. Fingerprints match when the patch is used. If patches were to execute in parallel, so would the fingerprints; but I checked, and the fingerprints for YouTube match in 5 seconds in total, so it's not really worth it.
