Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use ThreadPool for multi-thread #154

Merged
merged 1 commit into from
Apr 22, 2022
Merged

use ThreadPool for multi-thread #154

merged 1 commit into from
Apr 22, 2022

Conversation

dixudx
Copy link
Owner

@dixudx dixudx commented Apr 20, 2022

No description provided.

@dixudx
Copy link
Owner Author

dixudx commented Apr 20, 2022

cc @casabre @MartinBarkerPhilips

@casabre
Copy link
Contributor

casabre commented Apr 21, 2022

@dixudx I am wondering why did you revert the multiprocessing changes? I couldn't observe issues like in #151 but I had just around 90 issues compared to 302. Pagination should have worked because they were more than 50 issues.

Furthermore, I would recommend a ProcessPool rather than a ThreadPool for parsing the workitems because the GIL would limit real parallel computation here. I don't know if we loose any information regarding credentials etc. due to pickling but it worked for me with pathos and lambda functions.

@dixudx
Copy link
Owner Author

dixudx commented Apr 21, 2022

@casabre sorry for reverting your commits. When I added somes CIs for this project, the multiprocessing commits failed on many test cases. #151 (comment) also had such problems.

Furthermore, I would recommend a ProcessPool rather than a ThreadPool for parsing the workitems because the GIL would limit real parallel computation here. I don't know if we loose any information regarding credentials etc. due to pickling but it worked for me with pathos and lambda functions.

The pickling error is really a big headache. I've tried using ProcessPool, but it is not working.

Using pathos is fine, since it uses dill that can serialize any objects. But the fact is not like that, the unit tests keep failing.

@dixudx
Copy link
Owner Author

dixudx commented Apr 21, 2022

@casabre I've created another PR #155 using your commits. You can view the CI results against multiple Python versions.

_pickle.PicklingError: args[0] from __newobj__ args has the wrong class

Lots of pickling errors.

@casabre
Copy link
Contributor

casabre commented Apr 21, 2022

@dixudx I understand. Stability first! But I am really wondering why it was working out at my setup. I can offer that we can work out a solution together.
A multiprocess solution could be that we create the credential instance in the sub-process again by just forwarding the credential tuple string → less objects involved and easy serializing.

@casabre
Copy link
Contributor

casabre commented Apr 21, 2022

But you can stick at least with the multi-thread approach in base.py. There is just a ThreadPool involved which gains also a lot and even more in a sub-process 😉

@dixudx
Copy link
Owner Author

dixudx commented Apr 21, 2022

But you can stick at least with the multi-thread approach in base.py. There is just a ThreadPool involved which gains also a lot and even more in a sub-process

@casabre Sure. Please submit another PR.

I can offer that we can work out a solution together.
A multiprocess solution could be that we create the credential instance in the sub-process again by just forwarding the credential tuple string → less objects involved and easy serializing.

Yeah. Please feel free to work on this.

Using a top-level function is preferred instead of a sub-function.

@casabre
Copy link
Contributor

casabre commented Apr 21, 2022

@dixudx how do you format your code? I am using black as default and as far as I am seeing it, pycodestyle just checks for PEP compliance.

@dixudx
Copy link
Owner Author

dixudx commented Apr 21, 2022

Please try with

yapf --style google --recursive --in-place .

to format your codes.

@MartinBarkerPhilips
Copy link

cc @casabre @MartinBarkerPhilips

master does not work for be and results in an error (Caused by SSLError(SSLError(1, '[SSL: DH_KEY_TOO_SMALL] dh key too small (_ssl.c:997)'))) because I am running an older version of RTC I believe

@dixudx
Copy link
Owner Author

dixudx commented Apr 22, 2022

@MartinBarkerPhilips What is your Python version?

@MartinBarkerPhilips
Copy link

@MartinBarkerPhilips What is your Python version?

I am on Python 3.10.4

@dixudx dixudx merged commit 9c5d544 into master Apr 22, 2022
@dixudx dixudx deleted the multi-thread branch April 22, 2022 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants